Bringing Data Preparation to Google Cloud & Beyond

September 14, 2017

A few months back, we announced our collaboration with Google Cloud on the launch of Google Cloud Dataprep at Google Next in San Francisco. Since the launch, the private beta has seen strong adoption and rave reviews from Google Cloud customers. We’re excited to bring the product into public beta soon, with General Availability to follow shortly thereafter.

Organizations of all sizes are flocking to cloud solutions, including Google Cloud Platform, to increase flexibility and lower data center costs. As Google expanded its cloud offering to serve customer needs, it observed a bottleneck among customers attempting to analyze diverse datasets in the cloud. Those customers confirmed the commonly cited figure that over 80% of data analytics time is spent on data preparation. Google realized that adding a self-service data preparation service to its platform was critical for companies performing analytics in the cloud, and so it collaborated with Trifacta to create Google Cloud Dataprep.

Knowing that Trifacta was the leader in data preparation for the cloud, Google selected Trifacta’s interface and Photon Compute Framework and integrated them directly into Google Cloud Platform. Because Google Cloud Dataprep is built on Trifacta, users get the same functionality found in Trifacta, including:

  • Predictive transformation: Google Cloud Dataprep automatically detects schema, type, distributions, and missing or mismatched values and uses machine learning to recommend corrective data transformations.
  • Interactive exploration: An intuitive user experience centered around interactive data profiles eliminates the need for coding or SQL queries for data access. Users can spend their time on data analysis instead.
  • Out-of-box integration with Google Cloud Platform: Users can securely access raw data from Google Cloud Storage or BigQuery. Data can be uploaded into Google Cloud Dataprep, cleaned, prepped, and returned or inserted into BigQuery for further analysis.
  • Structuring of unstructured data: Google Cloud Dataprep handles JSON, Avro, Excel, compressed files, nested arrays, and more.
  • Fully managed infrastructure: Google Cloud Dataprep automatically handles IT resource provisioning and management, including usage-based billing and quota restrictions.
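
To make the "predictive transformation" item above a little more concrete, here is a toy Python sketch of the kind of column profiling it describes: inferring a dominant type and counting missing and mismatched values. This is only an illustration under our own assumptions, not how Google Cloud Dataprep or Trifacta is implemented; the names `MISSING_TOKENS`, `infer_type`, and `profile_column` are hypothetical, and a real product would use far richer type detection and machine-learned suggestions.

```python
from collections import Counter

# Hypothetical set of tokens treated as "missing" for this sketch.
MISSING_TOKENS = {"", "null", "none", "na", "n/a"}


def infer_type(value):
    """Classify one raw string as 'integer', 'float', or 'string'."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def profile_column(values):
    """Return the dominant type and counts of valid, missing, and mismatched cells."""
    # Keep only cells that are neither None nor a recognized missing token.
    present = [
        v.strip()
        for v in values
        if v is not None and v.strip().lower() not in MISSING_TOKENS
    ]
    missing = len(values) - len(present)

    # The most frequent inferred type is treated as the column's type.
    type_counts = Counter(infer_type(v) for v in present)
    dominant = type_counts.most_common(1)[0][0] if type_counts else None

    # Any present cell whose type differs from the dominant one is a mismatch.
    mismatched = sum(n for t, n in type_counts.items() if t != dominant)
    return {
        "type": dominant,
        "valid": len(present) - mismatched,
        "missing": missing,
        "mismatched": mismatched,
    }


report = profile_column(["12", "7", "", "abc", None, "42"])
print(report)
# {'type': 'integer', 'valid': 3, 'missing': 2, 'mismatched': 1}
```

A profile like this is the raw material for a suggestion step: a mostly-integer column with one mismatched cell might prompt a recommendation to delete or replace that value.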

As a bonus, Google Cloud Dataprep also has native integration with Cloud Dataflow, a massively parallel processing engine that Google hosts to ensure efficient processing.

Trifacta—the Power Behind Both Google Cloud Dataprep and Wrangling On-Premise

If your organization is investing in Google Cloud Platform and is interested in using Google Cloud Dataprep, it’s a natural next step to consider Trifacta for on-prem data wrangling too, for a few specific reasons:

  • Google’s decision to collaborate with Trifacta on Google Cloud Dataprep validates Trifacta’s position as the leading data preparation vendor.
  • The familiarity that users gain with Google Cloud Dataprep carries over to their work with Trifacta on-prem, because both use the same interface and functionality.
  • Using the same technology in Google Cloud and on-prem produces consistent transformation logic, metadata, and data lineage, while still running on the best-of-breed engine for each environment: Spark or Google Cloud Dataflow. This enables users to move easily between environments.
  • Trifacta ensures that any future data wrangling needs can be met. No matter what combination of cloud, on-prem, or hybrid strategies an organization chooses to use in the future, Trifacta will interoperate with those computing environments.

You’ve already seen the power of Trifacta within Google Cloud Dataprep, so if you have on-prem data wrangling needs too, Trifacta is the best choice for a seamless hybrid data preparation environment.

To learn more about how your organization can leverage Trifacta both in Google Cloud Platform and on-prem, read our brief.