So you’ve decided to transition (at least in part) your data analytics to the cloud. More specifically, you are adopting Google Cloud Platform (GCP), one of the “big three” cloud providers. Now what? Here at Trifacta, we’ve worked with hundreds of customers in this position and have learned a thing or two about how to get up and running with self-service analytics on GCP as successfully as possible. The following five-part blog series is by no means a definitive list, but from our perspective, these tips should be top of mind for optimizing self-service analytics on GCP.
In part three of our series, we explained why technology alone isn’t enough for successful self-service analytics. There must be a balance between the self-sufficiency these technologies enable and the necessary control over data governance and compliance. In other words, the role of the IT organization isn’t going anywhere. One could argue that IT organizations are actually more important in the age of self-service because they are not only amplifying analytics, but also governing data operations.
Given this necessary balance between business users and the IT organization, let’s dig a little deeper into some of the key stakeholders for self-service analytics and their roles and responsibilities. Specifically, we’re going to continue discussing the roles and responsibilities for self-service data preparation (using Cloud Dataprep by Trifacta), which we reviewed earlier as the first step toward self-service analytics. Having the right people involved in self-service data preparation has a domino effect on the rest of your analytics: if your organization understands how each person should be involved with assessing and preparing data, it’s typically clear who should then be analyzing it.
We’ve listed the three most common job titles below, though we understand that there will be variation in every organization. Some organizations may not have one of these titles among their employees; some will have many other types of titles. The important thing is to consider the function of each title that we’ve listed below, and who covers that role in your organization (or via consultants) regardless of their exact title. One guiding principle that all organizations should consider when reviewing the roles and responsibilities for self-service data preparation is that there should be a flexible and agile flow, not a rigid data pipeline process. Among these roles, there should be ample room for communication and collaboration under a common framework.
Lesson 4: Review the Roles and Responsibilities
Data analysts, along with business analysts, project managers, and related positions, are typically in close contact with business users. They often explore raw data to discover what may be useful for answering business questions. Their goal is to get answers as quickly and easily as possible.
Cloud Dataprep by Trifacta is ideal for data analysts because it’s easy to implement and doesn’t require significant technical expertise in coding. Data analysts can be involved early in the data preparation process, exploring and prototyping data to fit their business needs. When automation is required for sustainability and repeatability, they can collaborate with data engineers to orchestrate the end-to-end data pipeline.
Data engineers design, build, and manage data processing and data architecture to support analytics and data science. They’re closely involved in transforming data, including the exploration and profiling of raw data. Chief among data engineers’ goals is streamlining and automating data-related processes so they can manage more of them.
Cloud Dataprep by Trifacta is ideal for data engineers because they can operationalize and monitor the various data flows they or data analysts design. It makes it easy for data engineers to collaborate with all stakeholders to understand data infrastructure requirements and provide guidance to users to improve how they explore, analyze, model, and consume data.
Data scientists apply specialized knowledge and skills to design and model algorithms by leveraging machine learning and artificial intelligence. But up to 80 percent of their time is consumed with routine data preparation tasks, leaving little time for innovation.
Cloud Dataprep by Trifacta is ideal for data scientists because it’s a cloud-native data preparation solution. It simplifies routine data preparation tasks so that those tasks can be delegated to more readily available and less expensive resources.
The Next Step Toward Self-Service
We’ve covered a lot in this series thus far, and we’ll wrap it up with our final post next week on the overall importance of data quality. In the meantime, you can get the full list of our tips for successful self-service analytics on GCP right now by downloading our eBook, “Self-Service Analytics on the Google Cloud Platform: Five Data Preparation Lessons Learned to Ensure Success.”