Data Quality Rules – Reporting dirty data to turn it into pristine insights

September 1, 2020

Chefs can do magical things with food. They can take a slate of ingredients and weave them together to create dishes that will hit every single taste bud that you have. They can put a smile on your face, set your mouth on fire with incredible spices, or hit that sweet tooth that you yearn to satisfy.

In the end, though, chefs are only as good as the ingredients they work with.

The same is true for your analytics and predictive models; you may be an analytics chef, but your analytics are only as good as the data you leverage. Unfortunately, one rotten tomato can spoil your insights and lead to bad decisions or predictions. With the latest release of Trifacta, we’re introducing Data Quality Rules, which prevent dirty data from contaminating your data preparation recipes.

What is it?

Data Quality Rules allow you to determine whether the current data is fit for use and, if not, what additional transformations are needed. Thanks to predictive suggestions, Data Quality Rules assess the data set and provide a list of indicators for monitoring and tracking the data’s cleanliness over time.

Why is this feature important?

As you know by now, the quality of an analytical or ML/AI predictive outcome is only as good as the data that feeds its logic. Data Quality Rules provide an automated way to identify data flaws and build quality indicators to monitor their remediation. The state of your Data Quality Rules is automatically updated to reflect changes, and the rules can help prevent undesired transformations over time. If you delete columns or other elements referenced in the Data Quality Rules, errors are flagged in the Transformer page.

Ultimately, the rules can monitor the accuracy, completeness, consistency, validity, and uniqueness of the data you leverage in your analytics initiative and ensure you have a comprehensive view of the cleanliness of the data.

How does it work?

A new icon has been added to the transformation grid.

When you click the “View suggestion” button, Trifacta automatically suggests a series of Data Quality Rules to validate various aspects of the data’s quality. For example, is a value unique or empty, does it fit a pattern, is it within an expected range, or does it correlate with another column?

From there, you can accept, remove, edit, or add to the Data Quality Rules to ensure they fit your particular use case for this data.

You can also leverage the power of the Trifacta Wrangler language to build any custom validation rule you have in mind.
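
To make the rule types above concrete, here is a minimal, hypothetical sketch of the same kinds of checks (uniqueness, missing values, pattern matching, expected ranges) expressed in Python with pandas. This is purely illustrative and is not Trifacta Wrangler syntax; the column names, sample data, and thresholds are assumptions.

```python
# Illustrative sketch only (not Trifacta Wrangler syntax): the kinds of
# checks described above, expressed with pandas. Column names, sample
# data, and thresholds are hypothetical.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1001, 1002, 1002, 1004],
    "email": ["a@example.com", "b@example.com", None, "not-an-email"],
    "quantity": [3, 1, 12, 250],
})

rules = {
    "order_id values are unique": orders["order_id"].is_unique,
    "email has no missing values": orders["email"].notna().all(),
    "email matches an address pattern": orders["email"]
        .str.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", na=False).all(),
    "quantity is in the expected range (1-100)": orders["quantity"].between(1, 100).all(),
}

# Report which rules pass and which fail for this data set.
for rule, passed in rules.items():
    print(f"{'PASS' if passed else 'FAIL'}: {rule}")
```

In Trifacta itself, the equivalent rules are created, accepted, or edited interactively in the Transformer page, so no code is required.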

What else?

Data Quality Rules are another step forward in our Adaptive Data Quality strategy, which adapts to your specific requirements. You can learn more about Trifacta’s Data Quality vision, and particularly Adaptive Data Quality, by reading the blog post from Jeff Heer, Trifacta’s Co-Founder and Chief Experience Officer.

Haven’t had a chance to try Trifacta yet? START FREE today!
