Trifacta’s June ’19 release brings two new features: Transform by Example and Enhanced Recipe Interactions. With Transform by Example, users enter an example of how they’d like their data to look, and Trifacta uses machine learning behind the scenes to create a step that produces that output. This makes the data preparation process more intuitive, since it requires no prior knowledge of syntax or step creation. Enhanced Recipe Interactions add multi-step copying and pasting, easier step moving, and other improvements for working with the recipe directly.
- Transform by Example
- Enhanced Recipe Interactions
Transform by Example
This feature expands the native, guided step creation in Trifacta. For any existing column, you can type out the desired output value of that column, and Trifacta will build a program in the background to get you there. We’ll show two scenarios where this is useful, though the feature applies to many more.
One of the most common tasks analysts need to perform is pattern reformatting – converting multiple formats of data into a single format, by manipulating delimiters, tokens, and word lengths, while preserving semantic content. For example, suppose you have a column of phone numbers that you’d like to reformat into the common +1 ### ### #### US format.
Visualizing phone number data
Writing out data transformations to solve this task can be a time-consuming and error-prone process, especially because the data may contain many different phone number formats, as demonstrated above by the Patterns interface. Moreover, Trifacta’s intelligent suggestions may not always apply to the data types you’re trying to manipulate, especially when you’d like to create a new format not already present in your data (in this example, adding the country code +1).
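To give a sense of what hand-authoring involves, here is a minimal Python sketch of the same reformatting written outside Trifacta. It is only an illustration, not Trifacta’s transformation language: the function name to_us_format and the rule of treating a leading 1 on an 11-digit value as the country code are assumptions made for this example.

```python
import re
from typing import Optional


def to_us_format(raw: str) -> Optional[str]:
    """Return '+1 ### ### ####' for a 10-digit US number, else None."""
    digits = re.sub(r"\D", "", raw)  # drop every non-digit character
    # Assumption: an 11-digit value starting with 1 already carries the country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None  # not a recognizable 10-digit number; leave it for review
    return f"+1 {digits[:3]} {digits[3:6]} {digits[6:]}"


print(to_us_format("236.926.9604"))
```

Even this short version has to decide what to do with values that aren’t ten digits long, which is the kind of edge case that makes hand-written transforms error-prone.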
On the other hand, you know exactly what you’d like your data to look like. For example, given the first record of the input column, “236.926.9604”, you know that you want it to look like “+1 236 926 9604”. Wouldn’t it be nice if you could simply provide this knowledge of the end result to Trifacta, and have it figure out the rest?
This is exactly the objective of Transform by Example. Rather than authoring transforms, you instead type out one or more examples of what you’d like your output records to look like, and Trifacta will create the transform to get you there.
Typing out an example
After entering the example on the first row, Trifacta infers exactly the kind of transformation you’re trying to do. It applies this transformation to your input column, and provides you with a preview of what your data will look like once committed. If you’re not satisfied with what Trifacta predicts, you can simply add more examples for different input records until you’re happy with the results. Finally, you can add the transformation as a step to your recipe, which can eventually be executed at scale on your full dataset.
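The example-driven loop described above can be illustrated with a toy Python sketch. This is not how Trifacta works internally: instead of machine learning, it simply filters a small, hand-picked list of hypothetical candidate transformations (the CANDIDATES table, an assumption for this illustration) down to those that reproduce every example given, then previews the survivor on other rows. The preview rows are made-up values.

```python
import re
from typing import Callable, Dict, List, Tuple


def _group(raw: str, prefix: str = "", sep: str = " ") -> str:
    """Regroup the last ten digits of raw as ###<sep>###<sep>####."""
    d = re.sub(r"\D", "", raw)[-10:]
    return prefix + sep.join((d[:3], d[3:6], d[6:]))


# Hypothetical candidate transformations, keyed by the pattern they produce.
CANDIDATES: Dict[str, Callable[[str], str]] = {
    "###-###-####": lambda s: _group(s, sep="-"),
    "### ### ####": lambda s: _group(s),
    "+1 ### ### ####": lambda s: _group(s, prefix="+1 "),
}


def consistent_candidates(examples: List[Tuple[str, str]]) -> List[str]:
    """Names of the candidates that reproduce every (input, output) example."""
    return [
        name
        for name, fn in CANDIDATES.items()
        if all(fn(inp) == out for inp, out in examples)
    ]


# One user-supplied example: the first record and its desired output.
examples = [("236.926.9604", "+1 236 926 9604")]
matches = consistent_candidates(examples)
print(matches)

# Preview the surviving transformation on other (made-up) rows.
for row in ["(415) 555-0132", "415-555-0199"]:
    print(row, "->", CANDIDATES[matches[0]](row))
```

With a larger candidate set, a single example may leave several survivors; supplying a second example narrows them down, which mirrors adding more examples until the preview looks right.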
Let’s take a look at another example. Here, the start_date column is not in the format we need for our downstream analysis. It also has a data quality issue: multiple date formats are present. We can tackle both of these issues easily using Transform by Example.
Formatting heterogeneous dates by example
We entered a format that doesn’t exist in the data, and Trifacta converts both the primary date format in our column (yyyy/mm/dd) and the secondary format (short-month dd yyyy) at the same time.
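For comparison, a hand-written equivalent of this date cleanup might look like the Python sketch below. It is an illustration written outside Trifacta, not the step Trifacta generates. The target format (mm-dd-yyyy) is an assumption, since the exact format we typed isn’t spelled out here, and the sample values are made up. Note that %b matches English month abbreviations only under an English or C locale.

```python
from datetime import datetime
from typing import Optional

# The two input formats found in the column: yyyy/mm/dd and short-month dd yyyy.
INPUT_FORMATS = ("%Y/%m/%d", "%b %d %Y")
# Assumption: the desired output format; swap in whatever the analysis needs.
TARGET_FORMAT = "%m-%d-%Y"


def normalize_date(raw: str) -> Optional[str]:
    """Try each known input format; return the reformatted date or None."""
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime(TARGET_FORMAT)
        except ValueError:
            continue  # not this format, try the next one
    return None  # unrecognized format; flag for manual review


for value in ["2019/06/04", "Jun 04 2019", "not a date"]:
    print(value, "->", normalize_date(value))
```

Each additional input format needs another entry in INPUT_FORMATS, whereas with Transform by Example the extra formats are handled by adding examples.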
Enhanced Recipe Interactions
You can now perform recipe operations on multiple steps at once. Multi-step disabling, moving, copying, and duplicating streamline interactions within recipes and simplify recipe organization, especially as recipes get larger and more complicated.
Tip: You can also copy and paste multiple steps from one recipe to another, which enables reuse of recipe parts.
To try these new features for yourself, sign up for free at trifacta.com/start-wrangling