The Work Before Analysis: Data Prep Made Easy

February 22, 2018

Self-Service Starts Up

Data visualization was traditionally a task left to the IT department. Since IT staff were the only ones with the keys to the data, it made sense that they’d be the ones to prep the data and use it to create dashboards and visualizations. This system had a few obvious weak points: IT’s time could be better spent on other tasks, and compared to actual analysts, IT often lacks the context to perform true analysis. The rise of self-service BI seemed to solve these problems, putting the creation of visualizations in the hands of the analysts who have the greatest context for the data. However, the impact of this self-service model has only stretched so far.

The Clean Data Bottleneck

Self-service BI has solved one set of problems while creating another. True, analysts no longer necessarily need IT to create their dashboards and visualizations, but in order to get started, they first have to get clean data from IT that meets their requirements. Good visualizations need clean, accurate data that comes in particular formats; datasets almost always have to be transformed before they can be put to good use. This means analysts and IT can get caught in a potentially confusing, error-prone data wrangling loop: the analysts don’t know what the data looks like until they get it from IT, so they transmit requirements that don’t actually line up with what they need; at the same time, IT may not have a full understanding of why certain datasets need to be transformed in certain ways. The constant back-and-forth between business analysts and IT leaves each side with a limited perspective, introducing glitches that neither side knows to look for. Data quality issues can have a huge impact on visualizations: an unnoticed outlier can skew an important average, and duplicate records can inflate counts and totals.
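To make those two data quality problems concrete, here is a minimal Python sketch. The order values, the customer records, and the `drop_duplicates` helper are all made up for illustration; this is not how any particular BI or data prep tool works, just the underlying arithmetic.

```python
from statistics import mean, median

# Hypothetical order values; the last one is a data-entry error (an outlier).
orders = [120.0, 135.0, 110.0, 128.0, 12500.0]


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical (customer_id, amount) records; customer 2 was loaded twice.
records = [(1, 120.0), (2, 135.0), (2, 135.0), (3, 110.0)]

# A single outlier drags the mean far from the typical order,
# while the median stays close to it.
mean_with_outlier = mean(orders)
median_with_outlier = median(orders)

# A duplicated record inflates the total until it is removed.
total_raw = sum(amount for _, amount in records)
total_deduped = sum(amount for _, amount in drop_duplicates(records))

print(f"mean: {mean_with_outlier:.1f}, median: {median_with_outlier:.1f}")
print(f"raw total: {total_raw:.1f}, deduplicated total: {total_deduped:.1f}")
```

With these made-up numbers the mean comes out at 2598.6 while the median is 128.0, and the duplicate row pushes the total from 365.0 to 500.0. A dashboard built on the raw figures would look fine and still be wrong.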

Not only is the IT/analyst loop treacherous due to the potential for information loss inherent in all communication between departments, it’s also time-consuming. Even in a business with active self-service BI solutions where analysts no longer have to wait for IT to create their visualizations, data wrangling beforehand still eats up 80% of the time an analytics project takes to complete. Multiply that by the increasing number of analysts trying to get work done, and you’ve got a massive bottleneck in your analysis pipeline.

Moving On Up: Self-Service Data Prep

It doesn’t have to be this way; self-service access should be moved further up the data pipeline. Analysts know exactly what kind of data they want; why not let them curate and wrangle it themselves?

Self-service data preparation is the only way to scale analysis, data science, machine learning, and similar work in a data-driven organization, giving analysts the ability to create compelling and accurate analytics without having to wait for IT to deliver clean data. IT’s duties can instead shift towards governance: granting and controlling self-service data access for users or classes of users. As self-service data prep expands throughout a given organization, maintaining data lineage becomes another important function for IT. Everyone gets to concentrate on the work they’re best at without each department having to wait for the other to get back to them; by cutting analysts free of the feedback loop, self-service data prep dramatically slashes project times.

Welcome To Trifacta’s Wheelhouse

According to Dresner Advisory Services and Forrester, Trifacta is the #1 self-service data prep platform, providing an analyst-friendly toolset that cuts the feedback loop between IT and business users. Trifacta also includes the governance and lineage needed for IT to manage Trifacta across thousands of users. Dataset volume isn’t a problem, either; just ask the German stock exchange. When the Deutsche Börse’s Content Lab needs to wrangle datasets in the 1-1.5 petabyte range, they use Trifacta to present customers with analyses within 24 hours, a shocking reduction from their former 2-3 month turnaround. Trifacta can help any data-driven organization deliver results faster. Put the power of data wrangling back in the hands of analysts where it belongs: to try it out yourself, sign up for Trifacta Wrangler now.