

Transform Data in Tables to Remove Duplicates

Figure: Removing Duplicates flow (the flow view of this template)

Remove rows where duplicate values exist in specific columns

Transformations:
aggregate functions (count), rownumber

Trifacta's deduplicate transformation removes rows whose values are identical across all columns. But what if you want to remove rows where the data is duplicated in only certain columns? This simple template shows how to find and remove rows that have duplicate values in some, but not all, columns. To customize the template for your own use, update the aggregate's group-by parameter to include every column that you want to check for duplicates.
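To make the logic concrete, here is a minimal sketch of the same idea in plain Python rather than in Trifacta's own recipe language. It mirrors the two transformations listed above: a count per group of key columns, and a row number within each group. The function name `remove_partial_duplicates`, the sample data, and the choice to keep the first row of each duplicated group are all assumptions made for illustration; the template itself may keep or discard rows differently.

```python
from collections import Counter, defaultdict


def remove_partial_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns values.

    rows is a list of dicts; key_columns plays the role of the
    aggregate's group-by parameter.
    """
    # Step 1 (count): how many rows share each combination of key values.
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)

    # Step 2 (row number): number the rows 1, 2, 3, ... within each group.
    row_number = defaultdict(int)
    kept = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        row_number[key] += 1

        # Step 3 (filter): a row is a removable duplicate when its group has
        # more than one row and it is not the first row of that group.
        is_duplicate = counts[key] > 1 and row_number[key] > 1
        if not is_duplicate:
            kept.append(row)
    return kept


# Hypothetical sample data: the first two rows match on customer and email
# but differ in order_id, so an all-columns deduplicate would keep both.
orders = [
    {"customer": "Ana", "email": "ana@example.com", "order_id": 101},
    {"customer": "Ana", "email": "ana@example.com", "order_id": 102},
    {"customer": "Ben", "email": "ben@example.com", "order_id": 103},
]

deduped = remove_partial_duplicates(orders, ["customer", "email"])
```

Changing the list passed as `key_columns` corresponds to editing the group-by parameter in the template: only the listed columns are compared when deciding whether two rows are duplicates.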
