The Trifacta Data Wrangling Cheat Sheet

What data transformations can you perform with the Trifacta data preparation platform? Here is a cheatsheet to help you quickly identify the transformation you need and understand what it does.

There’s a misconception that data preparation, a.k.a. data wrangling, is only for simple data transformation requirements and for business users with no technical background. Nothing could be further from the truth. Simplifying something as complex as data transformation does not mean limiting the scope of what is possible. In fact, it is the reverse: it lets more people, from any background, do more. This is what Trifacta has offered since day one: a common ground where people who transform data today, whether with SQL, ETL, Python code or Excel, can collaborate and communicate about data transformation needs in the same language, bridging IT and business in any data discussion.

To clear up this confusion, we are publishing this Data Wrangling Cheatsheet to show the extent of Trifacta’s data preparation capabilities and to help you learn the process.

Here is just a shortlist of what Trifacta offers (and this is only a subset): more than a hundred formulas to join, combine, union, merge, pivot, unpivot, nest, unnest, standardize, normalize, filter, delete, format, split, convert, extract, replace, profile, aggregate, sample, apply regular expressions, feature engineer, and structure JSON, objects, arrays and tabular datasets. And all this in one intuitive and fun user interface. This is a lot, but we made it easily digestible with this cheatsheet.

This article covers the essential data preparation concepts captured in the cheatsheet. 

Fig. combining, reshaping, restructuring and applying formulas to rows and columns

Fig. Principal data types and pattern syntax to manipulate data values

About the Cheat Sheet

The Data Wrangling Cheatsheet covers the five areas listed below. Follow each link for the details of that category.

  • Combining datasets by adding rows and columns to an existing dataset
  • Reshaping a dataset by turning rows into columns (and vice versa) and by aggregating data
  • Manipulating rows and columns by applying formulas to merge, split, replace, extract and filter data (and by converting a row into a header)
  • Restructuring data by turning JSON, objects and arrays into columns, and the reverse
  • Understanding data types and the Trifacta pattern syntax, which have a page of their own
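If you come from a coding background, it can help to see what these categories look like in plain code. The snippet below is only a rough sketch in standard-library Python, not Trifacta's own transformation language or API. The sample data and every function name (`union`, `left_join_city`, `total_by`, `unpivot`, `split_column`, `unnest`) are hypothetical and chosen purely for illustration.

```python
import json
from collections import defaultdict

# Hypothetical sample data; none of these names come from Trifacta.
orders_q1 = [
    {"order_id": 1, "customer": "ann", "amount": 20},
    {"order_id": 2, "customer": "bob", "amount": 35},
]
orders_q2 = [
    {"order_id": 3, "customer": "ann", "amount": 15},
]
customers = {"ann": "Paris", "bob": "Austin"}  # lookup: customer -> city


def union(*tables):
    """Combining: append the rows of several tables with the same columns."""
    return [dict(row) for table in tables for row in table]


def left_join_city(rows, lookup):
    """Combining: add a column by looking up each row's customer."""
    return [{**row, "city": lookup.get(row["customer"])} for row in rows]


def total_by(rows, key, value):
    """Reshaping: aggregate one column, grouped by another."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def unpivot(row, id_key):
    """Reshaping: turn the non-id columns of one row into (column, value) rows."""
    return [
        {id_key: row[id_key], "column": col, "value": val}
        for col, val in row.items()
        if col != id_key
    ]


def split_column(rows, column, sep, new_names):
    """Manipulating: split one text column into several new columns."""
    out = []
    for row in rows:
        parts = row[column].split(sep)
        out.append({**row, **dict(zip(new_names, parts))})
    return out


def unnest(json_text):
    """Restructuring: flatten one level of a nested JSON object into columns."""
    flat = {}
    for key, val in json.loads(json_text).items():
        if isinstance(val, dict):
            for sub_key, sub_val in val.items():
                flat[f"{key}.{sub_key}"] = sub_val
        else:
            flat[key] = val
    return flat


all_orders = left_join_city(union(orders_q1, orders_q2), customers)
big_orders = [row for row in all_orders if row["amount"] >= 20]  # filtering rows
totals = total_by(all_orders, "customer", "amount")
```

Each helper maps loosely to one category: `union` and `left_join_city` combine datasets, `total_by` and `unpivot` reshape them, `split_column` and the filter manipulate rows and columns, and `unnest` restructures nested data into columns.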

How to Leverage the Cheatsheet Within the Trifacta Product

When in Trifacta, be it cloud.trifacta.com or clouddataprep.com, edit the recipe in the Flow and try one of these approaches:

  1. Click the icon menu in Trifacta that corresponds to the cheatsheet icon.
  2. Click the recipe icon and the button and type the keyword from the cheatsheet to find the corresponding formula.

Fig. combining, reshaping, restructuring and applying formulas to rows and columns

Specific examples will be provided for each transformation in this article, but for a full list of functionality, click on the corresponding icon in the Transformer UI to see the available options.

For example, see all Filter options by clicking on the Filter icon in the Transformer toolbar.