
Transform Data in Tables to Remove Duplicates

Figure: Removing Duplicates Flow (the flow view of this template)

Remove rows where duplicate values exist in specific columns

Transformations:
aggregate functions (count), rownumber

Trifacta has a deduplicate transformation that removes rows whose values are identical across all columns. But what if you want to remove rows where the data is duplicated in only certain columns? This template shows how to find and remove rows that have duplicate values in some, but not all, columns. To customize the template for your own use, update the aggregate's group-by parameter to include every column that you want to check for duplicates.
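For readers who want to see the idea outside of Trifacta, here is a minimal sketch of the same count-plus-row-number approach in Python with pandas. It is an illustration of the technique, not the template's actual recipe: the function names, the helper columns `dup_count` and `row_in_group`, and the sample data are all hypothetical, and the sketch assumes the key columns contain no missing values.

```python
import pandas as pd


def flag_partial_duplicates(df, key_columns):
    """Add a group count and a 1-based row number per key group.

    Hypothetical helper; assumes key_columns contain no missing values.
    """
    grouped = df.groupby(key_columns, sort=False)
    # How many rows share the same values in the key columns
    # (the role played by a count aggregate grouped by those columns).
    dup_count = grouped[key_columns[0]].transform("size")
    # Position of each row within its group, starting at 1
    # (the role played by a row-number function).
    row_in_group = grouped.cumcount() + 1
    return df.assign(dup_count=dup_count, row_in_group=row_in_group)


def remove_partial_duplicates(df, key_columns):
    """Keep only the first row of each group of key-column duplicates."""
    flagged = flag_partial_duplicates(df, key_columns)
    kept = flagged[flagged["row_in_group"] == 1]
    return kept.drop(columns=["dup_count", "row_in_group"]).reset_index(drop=True)


# Made-up sample data: rows repeat on customer_id and email,
# but differ in order_total, so a full-row deduplicate would keep them all.
orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3],
        "email": ["a@x.com", "a@x.com", "b@x.com", "c@x.com", "c@x.com"],
        "order_total": [10.0, 12.5, 7.0, 3.0, 3.5],
    }
)

flagged = flag_partial_duplicates(orders, ["customer_id", "email"])
deduped = remove_partial_duplicates(orders, ["customer_id", "email"])
print(deduped)
```

Changing the list of key columns here corresponds to changing the group-by columns in the template. In pandas, `DataFrame.drop_duplicates(subset=...)` gives the same first-row result in one call; the longer form above is used only because it keeps the count and row number visible for inspection.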

New to Trifacta?

Sign up for a free 30-day trial to use this template.

Already have an account?

Download template (Trifacta version) and import it on the Flows page.

Is your data on Google Cloud?

  1. Download the template (Dataprep version)
  2. Launch Dataprep on Google Cloud
  3. Import it on the Flows page

Learn more about Dataprep

How to Import