Clean data is essential to trustworthy analysis. No one wants to see errors in their analysis—or worse, have errors go undetected and incorrectly steer business decisions.
The vast majority of business analysts have learned how to clean data in Excel. But given the huge number of Excel data cleaning techniques, this is usually an on-going process. To help analysts get their start (or refresh some old tricks), we’ve selected three of the most common ways for how to clean data in Excel below:
Remove extra spaces
To get rid of those difficult-to-spot extra spaces, analysts can use the TRIM function, which removes leading and trailing spaces as well as the additional spaces between words. This isn’t just for aesthetics—removing the spaces in Excel is important in order to be able to surface those values through functions such as VLOOKUP. Here’s how to do it :
- The TRIM function can be used manually, but it’s possible to wrap the TRIM function in context with the VLOOKUP function. This allows users to search and correct all added spaces while finding the matches they were searching for.
- Your VLOOKUP formula would then consist of the following:
- TRIM + the lookup value, stripped of extra spaces
- The lookup range
- The column that has the value to return
- 0 (for exact match)
Find and replace text
Addressing blank cells is another common objective when cleaning data in Excel. If not addressed, these many read as errors or will not be included in your end analysis if they were originally intended to signify “0.”
- To select all blank cells, users need only to open the Go To dialogue box, select “Special” and then select “Blanks,” or all of the blanks in your cell.
- From there, it’s easy to enter a uniform value, such as “0” or “N/A,” by pressing Control and Enter.
Distribute the contents of a cell into adjacent columns
When analysts first start learning how to clean data in Excel, this is another early question that arises. Analysts don’t normally have absolute control as to how data is recorded, which means that too much data can be recorded in one cell. For example, sensor data arrives with both the date and time combined in a single value, when it is often more helpful to separate the two values into two columns. Here’s how to do it:
- Select the cell, range, or entire column that contains the text values that you want to split.
- On the “Data” tab, in the “Data Tools” group, click “Text to Columns.”
- Follow the instructions in the “Convert Text to Columns Wizard” to specify how you want to divide the text into separate columns.
The challenges of data cleaning in Excel
The experience of cleaning data in Excel is reliable and familiar; analysts have been successfully cleaning up data in Excel for decades in order to ready it for analysis.
However, as today’s data grows larger and increasingly complex, Excel is showing its age. Cleaning data works best when analysts are relatively familiar with the data at hand; they know what to search for and more or less what types of errors need addressing. If not, it’s hard to identify invalid numbers or incorrect dates, especially when that data respects certain format standards. And when it comes to assessing data inconsistency between multiple spreadsheets, cleaning data in Excel becomes a complex project all by its own.
Instead of learning how to clean data in Excel, analysts are increasingly using new data preparation solutions that intelligently identify possible transformations and visually surface errors. These types of solutions mirror the direction of the analytics world—adopting platforms that are user-friendly, flexible and allow for visual interaction.
Designer Cloud offers an alternative to cleaning data in Excel
While using Excel to clean data will always be a reliable solution for simple datasets, new data preparation solutions offer an intelligent approach to tackling larger and more unfamiliar data.
Alteryx Designer Cloud, the leader in data preparation, allows users to interactively explore their datasets by clicking on components of the data so that they can discover new and essential ways in which it must be transformed. Instead of a process driven by careful steps like data cleaning in Excel, Designer Cloud will automatically present suggested transformations for users to evaluate and edit. With Designer Cloud, not only are users producing more thoroughly cleansed datasets, but are also achieving it in significantly less time.
To learn more about how Designer Cloud works and if it’s a better alternative for your use case, schedule a free demo today.