Imagine: what if non-technical analysts could more easily transform data in a way that doesn’t feel like writing a script or query? What if a technical user could accomplish a complex data wrangling task in 20 seconds instead of 5 minutes with the help of smart algorithms and good user experience? How can a single interactive data product cater to users with varied needs and skills? These are the questions the design team at Trifacta has been answering during the development of Trifacta v3.
Our goal: empower analysts by enabling them to do the data preparation work they’ve never been able to do while simultaneously accelerating the workflow of technical data scientists. Data wrangling is a difficult problem faced by users with varying levels of technical skills – at Trifacta we’re focused on creating the best data platform for all types of users. In Trifacta v3, we’ve done this by focusing on productivity through user experience, intelligence and machine learning, and connectivity.
Productivity Through User Experience Enhancements
For Trifacta v3, we’ve enhanced our core data transformation experience by designing a flexible new system for creating and editing transforms. It serves the entire range of Trifacta users, giving non-technical analysts, technical data scientists, and programmers a choice in how they create transforms.
This system for creating and editing transforms consists of three parts: new visual transform suggestion cards that make browsing, understanding, and selecting suggested transforms easier; a transform builder that helps users build transforms with an intuitive interface; and robust script editing to allow for easy revision and iteration.
Each suggestion card represents a suggested transform with a mini preview so that users can quickly understand what it does. The mini preview abstracts away the complexity of the transformation syntax and represents the transform visually as a source and preview. The user can browse, compare, and select transforms even if they don’t fully understand the Wrangle language.
Predictive transformation with suggestion cards makes crafting transformations much easier, but there will be cases when users want to tweak a suggestion or craft their own transforms from scratch. The transform builder allows users to do just that – modify or create transforms by using simple drop-downs and intelligent inputs rather than writing Wrangle. The builder provides a template for each transform, surfacing required and optional parameters, and assisting users in creating complex functions and patterns.
Our transform editor, with powerful type-ahead and syntax highlighting, is still available for more experienced or technical users.
The suggestion cards, builder, and transform editor work together to create a system that helps users learn how to use Trifacta while continuing to be productive. As users begin with the suggestion cards, they’re able to recognize how usage patterns and properties of the data generate certain suggested transforms. This teaches them the basic principles of Wrangle. With that basic understanding, users can begin using the builder to create more complex transformations as their familiarity with the language deepens. This learning happens as a natural part of a productive data transformation workflow.
We’re introducing new features not only to create transforms but also to edit them. With Trifacta v3, we’re adding script editing, which allows users to edit, delete, and create script steps anywhere in the existing script. Data transformation is not a linear process. As users transform data, they begin to understand it, forming new hypotheses and goals and iterating toward the right solution. Editing is therefore necessary not only to correct inevitable mistakes but also to support an iterative process in which testing a hypothesis carries no penalty.
By giving our users choice over how they craft and edit their transforms, we’ve made data transformation easier, faster, and more accessible.
Intelligence and Machine Learning
In addition to productivity enhancements through user experience development, we’ve invested in intelligence and machine learning in the product.
We’ve extended our core Predictive Transformation system to support the selection of columns to drive suggested transforms. For example, if a user is analyzing product usage data and selects a string column of product codes and a numeric column of purchase prices, Trifacta will suggest common multi-column transforms such as aggregating the total purchase price by product code or deriving the minimum purchase price by product code for each line item. No pivot table needed!
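To make the difference between those two suggestions concrete, here is a minimal sketch in plain Python. It is not Wrangle and not how Trifacta implements the feature; the column names `product_code` and `purchase_price`, the sample rows, and both helper functions are hypothetical. An aggregate collapses the data to one row per product code, while a derive keeps every line item and adds the group minimum as a new column.

```python
from collections import defaultdict

# Hypothetical line items: one row per purchase.
rows = [
    {"product_code": "A100", "purchase_price": 12.0},
    {"product_code": "B200", "purchase_price": 7.5},
    {"product_code": "A100", "purchase_price": 9.0},
    {"product_code": "B200", "purchase_price": 8.0},
]


def aggregate_total(rows, key, value):
    """Collapse rows to one total per key (an aggregate-style transform)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def derive_group_min(rows, key, value, new_column):
    """Keep every row and add the per-key minimum as a new column (a derive-style transform)."""
    minimums = {}
    for row in rows:
        k = row[key]
        minimums[k] = min(minimums.get(k, row[value]), row[value])
    return [{**row, new_column: minimums[row[key]]} for row in rows]


totals = aggregate_total(rows, "product_code", "purchase_price")
with_min = derive_group_min(rows, "product_code", "purchase_price", "min_price")
```

Here `totals` has one entry per product code, and `with_min` still has four rows, each carrying the lowest price seen for its product code.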
We’ve also introduced a new multi-split transform that can split data with complex, irregular structures, such as logs, into a clean table in one automated suggested transform. When a user first creates a new dataset, Trifacta detects whether the transform is appropriate and applies it automatically, drastically simplifying the menial task of manually structuring a new dataset.
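As a rough illustration of what such a split produces, the sketch below turns a few hypothetical log lines into rows with a single regular expression. The log format, the field names, and the `split_log_lines` helper are assumptions made for this example. Trifacta infers the structure automatically, whereas here the pattern is written by hand.

```python
import re

# Hypothetical log format: "<date> <time> [<level>] <component>: <message>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<component>[^:]+): "
    r"(?P<message>.*)$"
)


def split_log_lines(lines):
    """Split each matching line into named columns; collect lines that do not fit."""
    table, rejected = [], []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            table.append(match.groupdict())
        else:
            rejected.append(line)
    return table, rejected


sample = [
    "2015-09-28 10:15:02 [INFO] loader: started job 42",
    "2015-09-28 10:15:07 [ERROR] parser: unexpected delimiter: ';'",
    "malformed line without structure",
]
table, rejected = split_log_lines(sample)
```

Keeping the rejected lines separate, rather than dropping them, mirrors the idea that irregular rows should stay visible to the analyst instead of disappearing silently.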
Expanding on our existing join key suggestions, Trifacta v3 introduces new intelligent multi-dataset functionality. When a user works with multiple datasets, Trifacta now suggests the best way to union or join them. Intelligent multi-dataset functionality will continue to be an area of focus for us – expect more from us in this area in the future.
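One simple way to think about a join key suggestion is to score every pair of columns by how many values they share. The sketch below does that with Jaccard overlap. This is an illustrative heuristic only, not Trifacta's algorithm, and the dataset contents, column names, and function names are all hypothetical.

```python
def jaccard(left_values, right_values):
    """Share of distinct values that two columns have in common (0.0 to 1.0)."""
    left, right = set(left_values), set(right_values)
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def suggest_join_keys(left, right):
    """Rank (score, left_column, right_column) candidates, highest overlap first."""
    scored = [
        (jaccard(left_values, right_values), left_column, right_column)
        for left_column, left_values in left.items()
        for right_column, right_values in right.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Hypothetical datasets, stored as column name -> list of values.
orders = {
    "product_code": ["A100", "B200", "A100", "C300"],
    "quantity": [1, 2, 1, 5],
}
products = {
    "code": ["A100", "B200", "C300", "D400"],
    "units_per_box": [12, 24, 6, 48],
}

ranked = suggest_join_keys(orders, products)
best = ranked[0]
```

In this example the top candidate pairs `product_code` with `code`, even though the two columns have different names. A real system would likely weigh more signals, such as data types and uniqueness.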
Connectivity
For the modern data analyst, data exists everywhere: in Hadoop, relational databases, the cloud, and on the desktop. Analysts should be able to be productive with their data regardless of where it lives. To enable productivity with data that lives in multiple places, Trifacta has built support for connecting to enterprise sources such as Hive, S3, and relational sources through JDBC.
Trifacta v3 marks new milestones in productivity through user experience, intelligence and machine learning, and connectivity. Together, these improvements support the entire range of our users, empowering analysts and making data scientists more efficient.
For a comprehensive overview and live demo of Trifacta v3 and the features discussed here, join Wei Zheng, Sean Ma, and me for a live webinar on October 14th. Those of you attending Strata + Hadoop World in NYC can learn more about Trifacta v3 by stopping by our booth or one of our three sessions at the conference.