How Data Engineers Have Helped Data Prep Grow Up

November 14, 2018

In recent years, a new term in data has cropped up more and more frequently: DataOps. As an adaptation of the software development methodology DevOps, DataOps refers to the tools, methodology and organizational structures that businesses must adopt to improve the velocity, quality, and reliability of analytics.  

Seems pretty straightforward, right? Unfortunately, it’s not.

There are three main pieces to the DataOps puzzle that any organization must account for: technology, process, and people. The organizations that we’ve rubbed shoulders with through our work at Trifacta understand the balance involved; for successful DataOps, each of these pieces must inform and depend upon the others. Investing in one does little good without consideration for the other two.

Data Prep: Technology, process, and people

When it comes to technology, the emergence of cloud and self-service has led to broader “analytics modernization” initiatives, wherein companies are augmenting or replacing existing analytics investments with modern solutions designed for today’s users, computing platforms and governance requirements.

However, successful adoption of the “modern analytics stack” relies on one fundamental process challenge: how do you balance self-service, governance, and scale? It is an impossible challenge unless roles and responsibilities are clearly defined.

Then, there are a variety of different people involved in an organization’s analytics processes. IT or data architects manage the technology infrastructure. Data analysts and data scientists work hands-on preparing and analyzing the data. But once an analyst develops something of value, how do the right people across the organization get access to it? How do you make sure the data is accurate and well-governed? How does the entire process get configured to be scalable and repeatable?

This is where data engineers come in.

How the data engineer has helped data prep grow up

Data engineers are focused on taking the work of end users and operationalizing it for the broader organization’s use. As end users build valuable new data prep workflows and analyses, it’s the role of data engineers to manage data prep, that is, to scale, schedule, and govern this work. In a sense, it’s a hand-off between the individual who has the greatest context for the data and the individual who has the greatest context for the organization’s systems, processes, and data governance.

This hand-off between data engineer and knowledge worker has become increasingly critical because the data prep process has transitioned from siloed desktop applications to modern cloud or data lake environments. And it’s why we’ve architected Trifacta to utilize scalable, modern computing platforms, whether on-premises or in the cloud. End users have the freedom to work with any type of data regardless of size or shape, and data engineers have the appropriate computing environment to govern and operationalize the valuable work their end users develop through data prep.

In this sense, data engineers have helped data prep grow up.

Once a siloed activity carried out in Excel or desktop apps, data prep is now a repeatable, scalable process that can fuel the broader organization’s DataOps practices, with the goal of constantly improving the velocity, quality, and reliability of analytics.

Not only have data engineers helped data prep grow up, but they’ve also helped our platform mature. Today, we’re excited to introduce a new range of functionality for data prep in Trifacta specifically designed for data engineers. Check out today’s announcement to learn more about how new features like RapidTarget and Automator improve data engineers’ ability to operationalize data prep workflows. Also, stay tuned in the coming days and weeks as members of our product management team dive into the details of each new feature in a series of blog posts.

Happy wrangling!
