As the global leader in data wrangling, Trifacta continues to innovate by introducing new capabilities on major cloud platforms to drive analytics modernization in the cloud. Customers across industries, including Bank of America, ABN AMRO Bank, and PepsiCo, are using Trifacta on Azure to wrangle data and get faster, better analytics insights. This post summarizes the features we recently rolled out for Trifacta on Microsoft Azure.
Support for Azure Data Lake Storage Gen2 (ADLS Gen2)
Built on top of Azure Blob Storage, Microsoft's object storage solution, ADLS Gen2 improves upon ADLS Gen1 to deliver massive scalability and cost benefits to enterprises building modern analytics stacks on Azure. Key features of ADLS Gen2 include Hadoop-compatible data access via the ABFS driver; an enhanced security model that lets users define POSIX permissions on directories or individual files; and a hierarchical namespace that improves the performance of directory management operations. ADLS Gen2 is optimized for cloud-scale analytics workloads.
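To make the ABFS and POSIX points concrete, here is a minimal Python sketch of how a file in ADLS Gen2 is addressed through the ABFS driver and how a basic POSIX-style permission set is written as an ACL string. The helper names (`abfs_uri`, `posix_acl`) and the sample account and container names are hypothetical and used only for illustration; this is not Trifacta code.

```python
def abfs_uri(container: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build an ABFS URI for a path in an ADLS Gen2 container.

    'abfss' is the TLS-secured scheme; 'abfs' is the unsecured one.
    """
    scheme = "abfss" if secure else "abfs"
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def posix_acl(owner: str, group: str, other: str) -> str:
    """Build a short-form ACL string such as 'user::rwx,group::r-x,other::---'."""
    for perms in (owner, group, other):
        # Each entry must be three characters: r or -, w or -, x or -.
        if len(perms) != 3 or any(c not in (flag, "-") for c, flag in zip(perms, "rwx")):
            raise ValueError(f"invalid permission string: {perms!r}")
    return f"user::{owner},group::{group},other::{other}"


if __name__ == "__main__":
    # Hypothetical container and storage account names.
    print(abfs_uri("raw", "contosolake", "/sales/2020.csv"))
    print(posix_acl("rwx", "r-x", "---"))
```

The same ACL string format can be applied to a directory or to an individual file, which is what makes the per-file permissions described above possible.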
Trifacta can now natively read data from and publish data to ADLS Gen2. With system mode authentication, users access data in ADLS Gen2 securely using a combination of three credentials: the Azure directory ID, the Azure application ID, and the Azure secret.
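For readers curious how these three values are typically used, the sketch below assembles the standard Hadoop ABFS OAuth settings for a service principal, where the directory ID is the tenant ID, the application ID is the client ID, and the secret is the client secret. This is a generic ABFS client configuration under those assumptions, not Trifacta's own configuration format; the function name `abfs_oauth_conf` and the environment variable names are hypothetical.

```python
import os


def abfs_oauth_conf(account: str, tenant_id: str, client_id: str, client_secret: str) -> dict:
    """Return Hadoop ABFS settings for OAuth client-credentials access to one storage account."""
    host = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        # Azure application ID
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        # Azure secret
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        # Token endpoint derived from the Azure directory (tenant) ID
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


if __name__ == "__main__":
    # Hypothetical environment variable names; keep the secret out of source code.
    conf = abfs_oauth_conf(
        account="contosolake",
        tenant_id=os.environ.get("AZURE_DIRECTORY_ID", "<directory-id>"),
        client_id=os.environ.get("AZURE_APPLICATION_ID", "<application-id>"),
        client_secret=os.environ.get("AZURE_SECRET", "<secret>"),
    )
    for key in conf:
        print(key)  # print keys only, never the secret value
```

Scoping every key to the storage account host keeps credentials for different accounts separate within one configuration.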
The ability to store many exabytes of data, together with the improved performance of ADLS Gen2, will drive better analytics outcomes on Azure across all use cases, including faster BI reporting, enhanced data onboarding, and higher-quality data for AI and ML. No matter the use case, Trifacta's native integration with ADLS Gen2 accelerates data wrangling, the most time-consuming part of the analytics workflow, so organizations can reach insights faster and make better decisions.
Support for Azure Databricks Tables and Delta Lake
Trifacta now supports reading from and publishing to Azure Databricks tables, and natively reading data from Delta Lake. Support for writing to Delta Lake is planned for the coming months. The platform allows users to ingest data from Databricks and Delta Lake for cleaning and then publish the analytics-ready output to a managed Databricks table.
Specifically, Databricks Delta Lake offers high reliability, with ACID transactions on Spark to ensure data consistency; faster query performance, with features such as data indexing and caching; and simplified management, by supporting both batch and streaming data in the same table. The integration between Trifacta's intelligent data wrangling solution and Databricks Delta Lake streamlines the entire data pipeline, allowing data engineers, analysts, and other data professionals to quickly ingest, wrangle, orchestrate, and publish high-quality data to support a variety of analytics use cases on Azure.
As a Microsoft One Commercial Partner and a member of the Microsoft Partner Sales Connect network, Trifacta is tightly integrated with a rich set of Azure ecosystem services: data stores such as Azure Data Lake, Azure Blob Storage, and Azure SQL Data Warehouse; processing engines such as Azure HDInsight and Azure Databricks; analytics services such as Power BI and Azure ML; and the Azure Active Directory security service. Together, these integrations provide elastic scalability, cost efficiency, and security for organizations wrangling data on Azure.
With our latest support for ADLS Gen2, Databricks tables, and Delta tables, organizations migrating to Azure can expedite their analytics journey with automated data wrangling for these services, ensuring that clean, connected, secure, and timely data is always available to meet their analytics needs.
To learn more about Trifacta, sign up for our free trial today!