Today, we’re excited to announce Trifacta v3, the most significant release in our company’s history. You can read the official press release here; it explains the benefits Trifacta v3 will bring not only to end users, but also to the IT departments managing these environments at the growing number of enterprises investing in Hadoop to gain a competitive advantage.
In the coming days, several members of my team will be using this blog to go deeper into the technical details of v3’s inner workings. As the head of Product at Trifacta, I thought it would be good for me to talk a bit about the “big picture thinking” behind our latest release.
Actually, it’s a pretty simple story: as more and more people adopt data wrangling solutions, the product has to keep up with the needs of this expanding and maturing customer base. Trifacta v3 does this in two important and interrelated ways: Ease of Use and Ease of Governance.
Ease of Use
As the need for data wrangling continues to attract new users, we have discovered that people have quite different skill sets when it comes to working with data. Despite the industry’s best attempts at coining monikers such as “Data Scientist,” there’s no single job title or descriptor that can fully capture the diversity of experience and background people have with data tools. If the first-generation fans of data wrangling were most often users proficient in scripting languages like Python, today they might well be analysts who have spent their entire careers working with Excel.
As we’ve monitored how our customers use the product and brought a growing diversity of users into our User Experience studies to watch them work, we’ve realized that there’s a wide spectrum in how customers approach and use the product. Some users prefer a completely script- and code-free environment, exclusively using transform suggestions and previews to drive their data wrangling tasks; others jump directly into the Transform Editor box and use our wrangle language type-ahead and suggested drop-downs to complete their work. Through monitoring this variety of users, we recognized the importance of optionality within the user interface, so in v3 we added a brand new user experience around transform specifications, allowing users to choose among different ways of interacting with the data and building transformations.
Our goal is to make Trifacta compatible with a broad set of users with a wide range of skills, rather than impose a single user experience on all of them, one that will inevitably be either too complicated or too simplified for half of the group.
While Phil Vander Broek (Trifacta UX Manager) will be describing these new features in detail in his upcoming blog post, the main point to remember about them is that they’re designed to ease a user into the data wrangling process, especially when working with big, diverse data in Hadoop. There’s a continuum in the user experience that makes both novice and advanced users feel at ease when wrangling data, and the product makes it possible to transition seamlessly between these options.
Ease of use also means ease of access. In addition to an innovative user experience for wrangling data, v3 focuses on providing users with a wider range of connectivity options: from native Excel spreadsheets to data residing in Amazon Web Services S3, all relevant data can be accessed and wrangled within Trifacta.
Ease of Governance
Whenever there is an increase in maturity of a piece of enterprise technology, something clearly happening with Hadoop, a number of organizational and IT management concerns get raised. And rightly so. Hadoop data lakes contain some of an enterprise’s most sensitive data, and so naturally, the organization is going to want reassurances on a number of pivotal governance questions.
For example: Is the data secure? Is there an “audit trail” that shows when and how data might have been changed, and who changed it? How do we know that only the right people have access to the data? How easy is it for someone to replicate a piece of analysis?
These concerns are especially relevant at Fortune 1000 companies, which have significant compliance issues to deal with and face severe penalties if data, especially consumer, medical and financial data, is not handled in the most secure manner possible. While these governance concerns are important, they shouldn’t be barriers to end user productivity. Governance succeeds not by finding new ways to “lock down” users and keep them from the data they need, but when user and application adoption flourishes.
Fortunately, the Hadoop ecosystem has recognized the importance of this issue, and has responded by building security frameworks into the software. As a result, we at Trifacta had a decision to make. Do we design our own security system, in the process imposing yet another administrative burden on the enterprises using our wrangling solution? Or do we leverage the frameworks that have already gained traction in the marketplace?
We chose the latter approach. And so with Trifacta v3, we are announcing compatibility with a number of established security and metadata frameworks, including Kerberos, Apache Sentry, Apache Ranger and Cloudera Navigator, among others. In his upcoming blog post, Sean Ma (Trifacta Director of Product Management) will go into detail on how v3 solves crucial Hadoop-related data governance issues.
By closely integrating with ecosystem standards, Trifacta is encouraging the further adoption of data wrangling, which in turn encourages adoption of Hadoop and its central role as a data lake for the enterprise. IT departments can have confidence that Trifacta is protected by a proven security infrastructure even as the number of users continues to grow. In a way, making Trifacta easier for end users and making IT departments comfortable with its growing adoption are two sides of the same coin.
For more on Trifacta v3, I invite you to join us for a live webinar on October 14th, which will include a comprehensive overview and live demonstration of the release. Those of you attending Strata are also encouraged to learn more by stopping by our booth or one of our three sessions at the conference.