Five Tips for Optimizing Self-Service Analytics on Google Cloud Platform: Part One

November 12, 2019

So you’ve decided to transition (at least in part) your data analytics to the cloud. More specifically, you’re adopting Google Cloud Platform (GCP), one of the “big three” cloud providers. Now what? Here at Trifacta, we’ve worked with hundreds of customers in this position and have learned a thing or two about how to get up and running with self-service analytics on GCP as successfully as possible. The following five-part blog series is by no means a definitive list, but from our perspective, these tips should be top of mind for optimizing self-service analytics on GCP. 

To kick this series off, let’s start with the basics. Why self-service? In short, self-service answers a problem that has grown bigger and messier as the amount of data that organizations collect has grown bigger and messier. This new data promised analytic innovation, but more often than not, the majority of it went unused. Even now, Inc. magazine reports that up to 73 percent of company data goes unused for analytics. In other words, more data didn’t translate into more accessible data.

The culprit behind inaccessible data isn’t IT professionals. Quite the opposite: since the big data boom, the majority of IT teams have been working overtime to meet the demands of the business. But it’s an uphill battle when the data warehouse and business intelligence solutions they rely on were designed for structured data, not today’s unstructured and complex data, and for smaller volumes of data used by fewer people. Despite their best efforts, IT professionals were still delivering on data requirements late. And even when requests were delivered on time, the results often prompted more questions about what could be added or reconfigured, which meant that IT teams had to go through the process all over again.

Given the situation, it’s no surprise that the next leap forward in improving information agility is about rethinking who does this work. Self-service says that the business users with the best context for the data should be the ones in control of it. And it appears that the large majority of organizations agree. According to Gartner, self-service analytics and BI users will produce more analysis this year than data scientists. Self-service is on the rise, and GCP offers an exciting environment in which to execute it.

Lesson One: Scrap the Cloud Lift-and-Shift

For lesson one, we’re starting at the beginning—moving to the cloud. And while this particular series focuses on GCP, it’s helpful advice for anyone moving to the cloud. 

When moving analytics to the cloud, many organizations take a lift-and-shift approach—analytics applications are installed on a virtual machine in the cloud. While this approach makes infrastructure maintenance easier, it doesn’t improve the way analytics are delivered, boost their business benefits, or reduce their operational costs. The process and tooling stay the same, and users see no real added value when it comes to self-service.

To be truly self-service, analytics solutions must be cloud-native. Processes for ingesting, storing, wrangling, and reporting on data must be designed to natively integrate with systems built exclusively for the cloud. They must support cloud environments that are dynamic, elastic, scalable, and increasingly containerized and oriented toward providing microservices. Rather than simply recreating the monolithic data platforms they had on premises, successful organizations embrace cloud agility and use open-source systems, such as Docker and Kubernetes, to run and orchestrate containers and their data flows.

GCP is attractive for self-service analytics because it provides a native, serverless smart analytics suite. Each component is easily activated and dynamically scales its resource allocation up or down based on usage. Maintenance operations don’t have to be planned. With the freedom to leverage any analytics components, use resources flexibly, and control costs, organizations can focus on the data and the value it provides to their business.

The Next Step Toward Self-Service 

Understanding that your analytics solutions must be cloud-native is a huge piece of the puzzle. It provides the right foundation for your organization’s self-service analytics on GCP. But that is, of course, only the beginning. To get the full list of our tips for successful self-service analytics on GCP right now, download our eBook, “Self-Service Analytics on the Google Cloud Platform: Five Data Preparation Lessons Learned to Ensure Success.” And stay tuned as we unpack more of these learnings on the Trifacta blog. 
