Functions: The Foundation of Excel Data Analysis
The foundation of Excel data analysis is predetermined functions, of which there are hundreds. Over time, analysts performing Excel data analysis come to know commonly-used functions like the back of their hand. Still, it’s a help to reference guides, such as this one, to learn and relearn new Excel shortcuts.
For many analysts, accelerating the process of recalling the name of a function can mean all the difference in completing an Excel data analysis in hours or days. Analysts might use add-in data analysis Excel functions, such as the Analysis Toolpak, to help simplify the functions that they need to learn.
How to Consolidate & Merge Data in Excel
Reference guides and memorization are helpful in completing Excel data analysis at a faster clip, but knowing a wide range of Excel formulas is only half the battle of Excel data analysis. Analysts must be able to accurately assess which function is needed for any given dataset and quickly review that the chosen function has altered the dataset in the correct way.
Let’s take a closer look at two common functions in Excel, “Consolidate” and “Merge,” and how to use those functions in context.
How to Consolidate Data in Excel
For this example, let’s say you are given two sets of data about the amount of loans a group of members have borrowed per year, each in an independent Excel workbook. You want to understand the total amount of loans borrowed by each member.
If both sets of numeric data are already formatted in a similar way, such as prices always formatted as $1.00, you can use the consolidate feature in Excel (under the ‘Data’ dropdown menu).
Open each sheet you plan to use and confirm that the data types you want to consolidate in Excel match.
- In a new empty worksheet, select ‘Consolidate.’
- In the ‘Function’ box, select the function you want to use. In this example, we’re using “Sum” to add together the total loans borrowed per member.
- Under ‘Reference,’ select ‘Browse’ to identify the Excel workbooks you want to consolidate the data from. Add the source(s).
- Important: Make sure the labels match. Then hit ‘OK,’ watch the data propagate, and begin reviewing or analyzing the new sheet.
How to Merge Data from Two Excel Worksheets
Traditionally, VLookup has been one of the most important tools for merging data in Excel, but the process requires multiple steps and can easily tire analysts who must merge multiple columns across many datasets. Instead, let’s take a look at how we can do this same process all within the Excel Power Query editor.
In this example, we’re using two individual datasets, the first containing basic member information, such as income, education, phone number, etc., and the second containing member loan information, such as the loan amount, interest rate, loan status, etc. Each dataset also contains a member ID, which will allow us to join the data on that common field in order to compare all of this data side by side. Here’s how to proceed:
- First, we’ll take a look at each dataset to roughly analyze their contents, and then open up a new worksheet for our merged dataset.
- Next, we’ll click on the “Data” tab of our new worksheet and select “From Text/CSV” because the files that we’re working with are csv files. You can also import Excel files by selecting “Get Data.”
- We’ll start by importing our file on basic member info, which will bring us into the Power Query editor.
- Next, we’ll bring in our other dataset by selecting “New Source” and “Text/CSV” and we’ll see the dataset on loan information added to the left-hand side.
- To merge these two datasets, we’re going to select “Merge Queries” and “Merge as New” so that we can have a separate space for our new dataset.
- We know that the common data field is “member ID” so we’re going to select that column for both datasets.
- Now, we’re going to delete some of the columns that we don’t need from our member information dataset. When we’re finished, we’re going to click on the icon next to “member_loans” to expand the dataset we just brought in.
- Here, we’re going to select the columns that we want from our member loans dataset. We don’t need to bring in member ID, as it’s already a column in our member info dataset.
- Once we’re finished selecting, we can rearrange these columns as it makes sense for our analysis.
- And finally, we’ll hit “Close and Load” to see the finished, merged dataset in our blank worksheet.
Where Excel Data Analytics Breaks Down
Excel data analysis is a rigid process; it demands that you know exactly the correct function and how it should be entered, while leaving little flexibility for altering the final outcome. As data analysts face increasingly complex Excel data analysis problems that encourage tinkering and exploration, this rigidity is beginning to stall progress.
Excel data analysis will continue to be a reliable approach to solving data problems—especially when those problems are relatively simple and involve structured data. However, many data analysts are beginning to expand beyond Excel to embrace new solutions. One such solution is data preparation platforms. Unlike Excel, a data preparation platform offers increased flexibility and guidance about how a dataset should be altered, re-adjusted, and verified for analysis. These platforms are accelerating data preparation time by up to 90% and are especially user-friendly to those accustomed to Excel.
Trifacta: Accelerating the Data Prep Involved in Excel Data Analytics
Trifacta is routinely recognized as the leader in data preparation. Many of its customers have transitioned from Excel to Trifacta in order to both accelerate the process of preparing data and generate more insightful outcomes during analysis.
One such customer is PepsiCo. Under Excel, PepsiCo’s CPFR team was struggling to quickly turn around sales forecasting reports that would determine how much and what quantities of product was needed for their largest retailers. Their huge volumes of inventory data were breaking under Excel, not to mention the dollars wasted in the time they lost or the occasional error that slipped through. After using Trifacta, the PepsiCo CPFR team saw tremendous results—reporting time has been reduced by 70% and build time has been reduced as much as 90%. To learn more about why PepsiCo’s use case was a better fit for Trifacta than Excel, click here.
The Trifacta data preparation platform presents representations of data in the most compelling visual profile, and simply selecting certain elements of the profile immediately prompts intelligent transformation suggestions. We’d love to chat with you about your use case to see if Trifacta is a better fit than Excel. Schedule a demo of Trifacta today.