Transform Data in Tables to Remove Duplicates

Figure: Removing Duplicates Flow (the flow view of this template)

Remove rows where duplicate values exist in specific columns

Functions used: aggregate functions (count), rownumber

Trifacta has a deduplicate transformation that removes rows whose values are identical across all columns. But what if you want to remove rows where the data is duplicated in only certain columns? This template shows how to find and remove rows that have duplicate values in some of the columns, but not all of them. To customize the template for your own use, update the aggregate group by parameter to include all the columns that you want to check for duplicates.
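For readers who want to see the idea outside of Trifacta, here is a minimal sketch of the same approach in Python with pandas. It is an analogue, not the template's actual recipe: the function name, the helper columns `dup_count` and `row_number`, and the sample data are all hypothetical. It assumes the columns being checked contain no missing values, and that the first row in each group of duplicates is the one to keep.

```python
import pandas as pd


def remove_partial_duplicates(df, key_columns):
    """Keep the first row of each group of rows sharing values in key_columns.

    Hypothetical helper: mirrors a "count per group + row number" approach.
    Assumes key_columns contain no missing values.
    """
    out = df.copy()
    grouped = out.groupby(key_columns, sort=False)

    # Count of rows in each group (the "aggregate count" step).
    out["dup_count"] = grouped[key_columns[0]].transform("count")

    # 1-based position of each row within its group (the "rownumber" step).
    out["row_number"] = grouped.cumcount() + 1

    # Rows with row_number > 1 repeat the key values of an earlier row.
    kept = out[out["row_number"] == 1]
    return kept.drop(columns=["dup_count", "row_number"])


# Illustrative data: the first two rows match on name and city only.
customers = pd.DataFrame(
    {
        "name": ["Ana", "Ana", "Ben"],
        "city": ["Lima", "Lima", "Oslo"],
        "note": ["first", "second", "third"],
    }
)

# Keeps the rows whose note is "first" and "third".
deduplicated = remove_partial_duplicates(customers, ["name", "city"])
print(deduplicated)
```

Changing the list passed as `key_columns` plays the same role as updating the group by parameter in the template: only those columns are compared when deciding whether two rows are duplicates.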
