Following the exciting news from Google Next 2018 about the upcoming general availability of Google Cloud Dataprep by Trifacta, we wanted to do something special for our beta users. And there certainly are a lot of them. Since the open beta of Google Cloud Dataprep launched in September 2017, we’ve been thrilled to see strong adoption of the product: over 800 organizations and more than 22,000 users, who run or schedule thousands of jobs daily.
We’d like to thank our beta users for their enthusiastic support and active use of Cloud Dataprep by giving them early access to the new capabilities that we’ve added for general availability (GA). Your feedback has shown us how best to shape the GA product, resulting in major enhancements to existing features and brand-new capabilities with rich functionality.
Comprehensive Design Refresh
The first thing beta users will notice is a complete refresh of the entire UI, designed to make the product more efficient and accessible for both new and experienced users. In addition, you’ll see many small improvements and usability changes throughout the product. Although some icon placements and task names have changed, rest assured that all your favorite features are still there.
Some key highlights include:
- New home page organized around recent activity
- Expanded onboarding tours
Team-Based Data Preparation
The most business-critical data preparation projects typically involve more than one user. So we’ve added new capabilities that enable team-based data preparation to foster greater collaboration between users. These features include:
- Sharing & Copying Flows: This enables users to share data, wrangling recipes, and entire workflows, allowing a broad range of users to leverage the collective intelligence of the team.
- Real-time Recipe Collaboration: With the ability to collaborate on recipes in real time, multiple users can quickly arrive at the best iteration.
- Re-use of Custom Samples: Sampling now supports a wider range of techniques, including filter-based, stratified, anomaly-based, and cohort sampling, and users can leverage each other’s samples to understand different slices of the data.
- Built-in Auditability: The ability to trace individual actions within a shared recipe ensures accountability and compliance on team-based data preparation projects.
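For readers less familiar with the sampling techniques named above, stratified sampling can be illustrated outside the product. The following is a minimal Python sketch, not Cloud Dataprep's implementation; the function name `stratified_sample`, its parameters, and the sample rows are hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, per_group, seed=0):
    """Return up to `per_group` rows for each distinct value of `key`."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for value in sorted(groups):  # sorted for a deterministic group order
        members = groups[value]
        k = min(per_group, len(members))  # small groups contribute all rows
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 90 "US" rows and 10 "FR" rows.
rows = [{"id": i, "country": "US" if i < 90 else "FR"} for i in range(100)]
sample = stratified_sample(rows, key="country", per_group=5)
```

A plain random sample of ten rows from this data could easily miss the smaller "FR" group entirely, whereas the stratified version always returns rows from both groups.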
Data Analyst Productivity
During Cloud Dataprep’s beta phase, feedback came from a wide spectrum of users: data scientists, data engineers, and data analysts, basically anyone who needs to use data in their daily job. As the beta progressed, we observed an increasing proportion of users accustomed to Google Sheets or Excel who wanted to transition to Cloud Dataprep to handle specific use cases. To instill confidence that their existing expertise in other tools would translate easily to Cloud Dataprep, we’ve significantly expanded our transformation capabilities with an emphasis on the data analyst. These features include:
- Analyst-Focused Entry Points & Specialized Transformations: Users familiar with Excel and Google Sheets will recognize the new toolbar shortcuts for popular transformations, such as pivot, join, and union.
- Source to Target Schema Matching: Users can leverage built-in machine learning models to suggest matches between new data sources and the destination target schema. This makes it easy to conform new data to the specific structure, format, and naming needed on output.
- Dynamic Datasets for Self-Service Operationalization: This allows users to operationalize existing recipes against changing data sources with the use of parameters, so there is no need to create multiple copies of the same recipe and maintain numerous schedules.
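The idea behind parameterized datasets, one recipe applied to whichever source a parameter resolves to, can be sketched in a few lines of Python. This is an illustration of the concept only and does not use or represent Cloud Dataprep's interface; the path pattern, the in-memory `FAKE_STORAGE` stand-in, and the `recipe` and `run` functions are all hypothetical.

```python
import csv
import io
from string import Template

# Hypothetical path pattern; `$date` is the parameter that changes per run.
SOURCE_PATTERN = Template("sales/$date/orders.csv")

# In-memory stand-in for cloud storage, so the sketch runs without any files.
FAKE_STORAGE = {
    "sales/2018-07-01/orders.csv": "customer,amount\n alice ,10\nbob,\n",
    "sales/2018-07-02/orders.csv": "customer,amount\n carol ,7\n",
}


def recipe(csv_text):
    """A single reusable recipe: trim names and drop rows with no amount."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["amount"]:
            continue  # skip rows with a missing amount
        cleaned.append({"customer": row["customer"].strip(),
                        "amount": int(row["amount"])})
    return cleaned


def run(date):
    """Resolve the parameter to a concrete source, then apply the recipe."""
    path = SOURCE_PATTERN.substitute(date=date)
    return recipe(FAKE_STORAGE[path])


# The same recipe runs against each day's data; only the parameter varies.
results = {date: run(date) for date in ("2018-07-01", "2018-07-02")}
```

Because the source is resolved from a parameter at run time, adding a new day's data requires no new copy of the recipe, only a new parameter value.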
For more details on Google Cloud Dataprep and to sign up for the product, please visit us. If you’d like to learn more about Trifacta and our commitment to best practices for data preparation, download our book “Principles of Data Wrangling: Practical Techniques for Data Preparation.”