Class is Now in Session

Presenting The Data School, an educational video series for people who work with data

Learn More

Cleanse Salesforce Address Data using Cloud Dataprep by Trifacta

October 1, 2020

Introduction

We’ve all been there. Imagine having an upcoming event where sponsors are to receive a personalized invitation in the mail for your event. Before any mail can be sent, the sponsors must first be available in Salesforce for event registration and post-event surveys. These sponsors don’t already exist in Salesforce so we must upload a list of new Accounts. This list does contain an address but is not in the correct format. These Accounts get loaded into Salesforce with all of the Address info mapped to the Billing Street field. How can the data be corrected to map to their separate City, State, and Zip values? Please follow along …

In this lesson, our challenge will be to:

  • Import Accounts into Salesforce
  • Access Dataprep (Trifacta)
  • Setup the Salesforce Connector in Trifacta
  • Select Accounts to Import into Trifacta
  • Setup a Flow to manage Account Records
  • Within a Flow find Incorrect Address Information
  • Parse Address into Separate Columns
  • Run the Job and Download Corrected File
  • Reimport the Corrected Accounts back into Salesforce

Prerequisites

Step 1: Data Import of Accounts into Salesforce (Bad Data)

First, here’s an example of the bad data we’ll be loading into Salesforce, on purpose. Don’t worry, the goal is to use Trifacta to correct it. Here are the steps to import

Steps:

  • Log in to Salesforce as System Administrator
  • Select the Setup Icon and choose Setup

  • On the Setup home page, type “import” into the Quick Find box
  • Select Integrations > Data Import Wizard

  • On the Data Import Wizard, scroll down to the “Import your data in 3 easy steps” section
  • Select Launch Wizard
  • On the Launch Wizard page, provide the following:

  • What kind of data are you importing: Standard objects > Accounts and Contacts
  • What do you want to do? Add new records
  • Where is your data loaded? Upload the CSV containing the Accounts

  • On the Edit Field Mapping, notice that the Address isn’t Automatically Mapped. Choose the Map link and select Billing Street

Step 2: Enable Salesforce Connector in Trifacta

Once inside Trifacta, we will need to connect Salesforce to allow users to access Salesforce data to be used in Flows and Data Wrangling.

Steps

  • From the homepage, select the Import Data button

  • On the Import Data page, select the New button

  • On the Create Connection page, select Salesforce (import only)

  • On the Create Connect (Salesforce) page, provide the following
  • Server Name = login.salesforce.com
  • User Name = your (admin) username
  • Password = your (admin) password
  • Security Token generated in account = the token of the username provided
  • Select Test Connection to verify the connection
  • Once done, select Create

Step 3: Import Salesforce Account data into Trifacta for Analysis

Once the Salesforce Connector has been made active, the connector presents options for accessible data elements within Salesforce. We’ll proceed to find the Account object to use and make a dataset

Steps

  • From the Salesforce Connector, see the available objects
  • From the list, select SFORCE to see Standard Objects

  • From the SFORCE list, select Account (you’ll see Account listed as a DataSet)

  • Select the Import button to complete Dataset creation

Step 4: Use the Account Dataset in a Flow to Wrangle and Analyze data

We will now take the recently created Account dataset, find the incorrect records, and correct them using a Flow to route the dataset and Recipe to manipulate the data. Let’s get started

Steps:

  • From the Dataprep home page, select Create Flow

 

  • On the Create Flow prompt, provide Flow Name and Description
  • Select the Create button

  • On the Flow page, select Add Datasets

  • Select the Salesforce: Account dataset from the previous step

  • Once the dataset has been selected, select Add new Recipe to manipulate the data

  • Select Edit Recipe to adjust the Account dataset

  • On the Recipe page, view all of the Account data from Salesforce
  • View for the Bad Address accounts (highlighted)

  • Note: For easier management, since this Dataset will contain all fields on the Account object, choose to only display the fields you need. ID, NAME, BILLINGSTREET and hide all others. You can do this using the “eyeball” icon in the lower left-hand corner

  • On the Visible columns window, Select the “eyeball” icon to hide fields that aren’t needed and select the [X] when done

  • Now we’re presented with a clean dataset to manage

  • Follow the next steps to select sections of the current BILLING STREET to separate into CITY, STATE and ZIP columns
  • Start by selecting the Street section of the BILLING STREET value, Dataprep will detect the pattern, create a new column, and make a series of suggestions based on the data and which action to take. Choose the first “Extract values matching” and perform the same task for the rest of the address information contained in BILLING STREET (video below)


  • Once fields have been separated, view how Dataprep has organized each step

  • Proceed to select each column to rename to their new column headers; example: BILLINGPOSTALCODE_REV, BILLINGCITY_REV, BILLINGSTATE_REV, BILLINGSTREET_REV

  • Once renaming is complete, proceed to Run Job to produce a CSV output

Step 5: Run the Job to produce a CSV Export of Cleansed Account data

Within this step, we want to make sure to update the headers in the output so we can have a reference for mapping the data back into Salesforce

Steps

  • On the Run Job page, select the Edit button next to the Create-CSV publish action

  • On the Edit page, scroll to the right pane and select More options
  • Check the “Include headers as first row on creation”
  • Select Update

 

  • Confirm that the Settings section, confirm that “with headers” is included
  • Select the Run Job button

  • Now, you’ll be redirected back to the Flow section to see the status of the running job

  • Once done, the Job status will be marked as Complete to signal that it’s ready to download

 

Step 6: Download the Corrected Account File and Reimport into Salesforce

We’ll now download the file from Dataprep to our desktop, clean it up and reimport back into Salesforce. Here are the steps

Steps:

  • On the Job page, scroll to the Publish section and select View All
  • On the File page, select the hyperlink to download

  • Once on the desktop, open the CSV file to view the output

  • Proceed to clean up the file to only include the desired columns for import

  • Repeat the steps from Step 1 to Update the Accounts in Salesforce
  • On the Launch Wizard page, provide the following:
  • What kind of data are you importing: Standard objects > Accounts and Contacts
  • What do you want to do? Update existing records. Match Contact/Account by Salesforce.com ID.
  • Very Important: Be sure to check “Update existing Account information”
  • Where is your data loaded? Upload the CSV containing the Corrected Accounts

  • On the Edit Field Mapping, proceed to update the Address Fields
  • Select the Start Import button
  • Once complete, view an existing Account to see the Address fields properly updated

What’s Next?

From a Marketing perspective, having the capability to identify and correct CRM data is essential to downstream campaigns. From a Salesforce Administrator view, having access to back-end data finally uncovers the opportunity to better manage Salesforce data for end-users. Dataprep by Trifacta presents capabilities for finding hidden opportunities on your data. Let’s keep going …

Related Posts

Unveiling New Cloud Data Warehouse Use Cases for Cloud Dataprep at Google Next ‘20

The biggest news for all things Google Cloud can be found at the Google Cloud Next conferences in the US and... more

  |  September 17, 2020

Crowdsourcing Macros with the Trifacta Wrangle Exchange

We have a very exciting update to Macros live in the new release of Trifacta. In July of this year, we... more

  |  November 21, 2019

The Data School with Professor Joe Hellerstein: The Role of AI in Data Prep

 What is AI’s role in the data preparation process? It doesn’t take much more than asking Siri to... more

  |  May 19, 2020