Trifacta Teams with Databricks to Extend Trifacta’s Data Transformation Platform to Spark
Trifacta Data Transformation Platform v2 includes execution through Spark as a data processing engine for Hadoop data transformation
San Francisco, CA – October 9, 2014 – Trifacta, a leading Data Transformation Platform provider, today announced it has teamed with Databricks, the driving force behind Apache Spark, to enable data transformation in Hadoop through Apache Spark. As part of the partnership, the Trifacta Data Transformation Platform has been Databricks-certified to work with Apache Spark, the popular open-source processing engine that is the most active project in the Big Data ecosystem with over 325 contributors in the past 12 months alone.
The “Certified on Spark” program ensures that certified applications will work with a multitude of commercially supported Spark distributions. Certifying with Databricks allows the Trifacta Data Transformation Platform v2 to run on top of all certified Spark distributions, including BlueData, DataStax, Guavus, Hortonworks, IBM, Oracle, Pivotal, SAP and Stratio.
“Spark provides a unified platform that spans the spectrum of data processing needs; enterprises are using it in production for small and petabyte scale workloads, interactive and batch, simple and sophisticated analytics,” said Ion Stoica, CEO of Databricks. “We’re excited to team with Trifacta and bring their intuitive data transformation technology to Spark users, enabling greater effectiveness and productivity.”
With Trifacta v2, Trifacta Data Transformation Platform users can process data of any volume. Users can scale their data transformations from immediate execution on small data, through interactive execution on medium-sized data, all the way to execution on terabytes to petabytes of data. The Trifacta platform now addresses the complete range of processing use cases available on Hadoop.
“The architecture of the Trifacta Data Transformation Platform automatically translates predictive interactions into scripts or code that can be pushed down into a variety of standard data processing engines,” said Sean Kandel, CTO, Trifacta. “Trifacta v1 included an intelligent selection between immediate in-browser execution and MapReduce-based execution of transformation scripts. Trifacta v2 adds the ability to leverage the full capabilities of the Spark platform.”
Data transformation is a key bottleneck in gaining valuable insights from big data. Data scientists and analysts spend as much as 80 percent of their time preparing and transforming data rather than focusing on analysis. Trifacta allows enterprises to quickly standardize, integrate and cleanse their data, enabling analysts to focus on actionable insights. With the latest version of the platform, that productivity has been extended to data of all sizes by incorporating Spark’s capabilities.
- Learn more about Trifacta
- Read more on the Trifacta blog: http://trifacta.com/blog/
- Follow us on Twitter: https://twitter.com/trifacta
- Become a fan on Facebook: https://www.facebook.com/Trifacta
- Connect on LinkedIn: http://www.linkedin.com/company/trifacta
Trifacta, the pioneer in data transformation, significantly enhances the value of an enterprise’s Big Data by enabling users to easily transform raw, complex data into clean and structured formats for analysis. Leveraging decades of innovative work in human-computer interaction, scalable data management and machine learning, Trifacta’s unique technology creates a partnership between user and machine, with each side learning from the other and becoming smarter with experience. Trifacta is backed by Accel Partners, Greylock Partners, and Ignition Partners.
Nolan Necoechea for Trifacta