Trifacta FAQs

Answers to Frequently Asked Questions

 

Overview

 
What is the Trifacta Data Engineering Cloud?

Trifacta is an intelligent, collaborative, self-service data engineering cloud platform to transform data, ensure quality, and automate data pipelines, enabling consumable data at any scale.

 
What is Data Engineering?

Data Engineering is the discipline of transforming raw data, assessing and validating its quality, and delivering usable data. It covers the entire path from raw data to consumable data, with steps to connect, prepare, profile, validate, and deploy high-quality, valuable data into production.

 
Who is the target audience for Trifacta?

Trifacta caters to the modern data worker. In other words, Trifacta is valuable for anyone who works with data including analysts, engineers, and scientists. Trifacta addresses the requirements of each of these personas with its unique, superior capabilities.

 
How do I get started with Trifacta?

You can get started with Trifacta for free with a 30-day trial. As part of the trial, you can experience advanced data engineering capabilities to transform your raw data into high-quality, consumable data. Start the trial on Trifacta if your data is on cloud platforms such as AWS, Azure, Databricks, or Snowflake, or on on-premises systems. If your data is in the Google ecosystem, start the trial on Google Cloud with Dataprep by Trifacta.

With the trial, you can also use Trifacta’s pre-built templates for common use cases that you can leverage right away. These templates provide ready-made data flows and recipes that you can use as-is or customize with your own data to achieve the desired results.

 
How does Trifacta work with code?

Trifacta gives you complete flexibility when it comes to code. If you do not wish to use code, Trifacta lets you apply intelligent, visual transformations, assess datasets, and schedule the publication of data without writing a single line of code. If you want to use code, Trifacta allows you to bring your own code in a language of your choice, including SQL and Python, and integrates it with the rest of your work.

 
Which cloud platforms are supported by Trifacta?

Trifacta is a SaaS offering on all leading cloud platforms including Google Cloud, AWS, Microsoft Azure, Databricks, and Snowflake. Trifacta offers a seamless experience on all these platforms.

 
How does Trifacta fit into the modern data stack including ELT and ETL processes?

Trifacta plays a key role in the modern data stack and can be used throughout the modern ELT architecture as well as during the transition from ETL to ELT. Trifacta supports the modern ELT approach by handling Extract, Load, and Transform as individual, decoupled processes, where transformation is about preparing data. This decoupling reduces friction and provides better control, flexibility, and full transparency. Trifacta enables the modern data stack with its self-service architecture and its ability to serve targets such as the modern cloud data warehouse and the unified data warehouse/data lake construct.
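
As a rough illustration of the decoupled Extract, Load, and Transform steps described above, here is a minimal sketch in Python. It is not Trifacta code: the table names, the sample CSV, and the use of an in-memory SQLite database as a stand-in for a cloud data warehouse are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: note the stray whitespace and the missing amount.
RAW_CSV = "id,amount\n1, 10.5 \n2,\n3,7\n"

def extract(text):
    """Extract: read the source as-is, with every field kept as a string."""
    return list(csv.DictReader(io.StringIO(text)))

def load(conn, rows):
    """Load: land the untouched rows in a raw table inside the warehouse."""
    conn.execute("CREATE TABLE raw_orders (id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (:id, :amount)", rows)

def transform(conn):
    """Transform: clean and type the data with SQL, inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT CAST(id AS INTEGER) AS id,
               CAST(TRIM(amount) AS REAL) AS amount
        FROM raw_orders
        WHERE TRIM(amount) <> ''
        """
    )

conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
load(conn, extract(RAW_CSV))
transform(conn)
clean = conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
print(clean)
```

Because the raw table is loaded before any cleaning happens, the transform step can be changed and rerun without extracting the data again, which is the main practical difference from ETL.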

 

Data Transformation

 
What is Data Transformation?

Data Transformation is the process of converting raw data into useful data for advanced insights and analytics.

 
How does Trifacta deliver data transformation?

Trifacta uses advanced techniques such as AI/ML with a visual interface to detect complex data patterns from any source and transform these patterns into useful data. Trifacta allows you to preview the suggested data transformations before you commit the changes in an easy, guided, and quick manner.

 
Where can the transformed data from Trifacta be used?

The transformed, high-quality data produced by Trifacta can be used in various repositories and applications including cloud data warehouses, AI/ML systems, data pipelines, and applications related to smart analytics and reporting. In general, the output from Trifacta can be used wherever high-quality data is required.

 
Which data sources can I connect to, from Trifacta?

Trifacta offers Universal Data Connectivity to a wide range of data sources both on-premises and in the cloud. With a self-service architecture, Trifacta provides flexible and seamless access to data to support a range of use cases and applications with REST, XML, and JDBC frameworks. Learn more from our integrations page.

 
Can I share my data transformations with my team?

Yes, Trifacta enables data democratization by allowing you to share and collaborate on any step of the data engineering process, including intermediate and final transformations, with your teams. You can see the immediate impact through live and continuous validation from your teams and co-workers. You can also draw on the larger Trifacta community to share, exchange, and learn best practices.

 
How does Trifacta integrate with data pipelines?

Trifacta helps you build automated data pipelines at any scale. Using Trifacta’s automation and orchestration capabilities, you can operationalize self-service data pipelines quickly and easily. This can help you streamline your data operations enabling efficiency and transparency.

 

Data Profiling & Quality

 
What is data profiling?

Data profiling is the process of evaluating the content and quality of data. Data profiling helps determine the accuracy, completeness, and validity of a given dataset.
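
To make the completeness and validity measures concrete, here is a minimal sketch in Python. It is only an illustration of the general idea, not how Trifacta computes its profiles; the function name `profile_column`, the sample values, and the ZIP code pattern are assumptions made for the example.

```python
import re

def profile_column(values, pattern):
    """Return completeness and validity ratios for one column.

    completeness: share of values that are not missing (None or blank).
    validity: share of the non-missing values that fully match `pattern`.
    """
    total = len(values)
    present = [v for v in values if v is not None and str(v).strip() != ""]
    valid = [v for v in present if re.fullmatch(pattern, str(v).strip())]
    completeness = len(present) / total if total else 0.0
    validity = len(valid) / len(present) if present else 0.0
    return {"completeness": completeness, "validity": validity}

# Hypothetical column of five-digit ZIP codes with a blank, a null, and a short value.
zip_codes = ["94105", "10001", "", None, "9410"]
report = profile_column(zip_codes, r"\d{5}")
print(report)
```

In this sample, three of the five values are present and two of those three match the pattern, so the column is 60% complete and about 67% valid.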

 
How does Trifacta perform data profiling?

Trifacta’s visual, interactive interface has powerful data profiling capabilities built in. The interface provides visual representations of the dataset, and Trifacta automatically identifies dataset formats, schemas, and specific attributes and relationships, along with the associated metadata for each dataset.

 
What are the benefits of data profiling capabilities provided by Trifacta?

The data profiling capabilities in Trifacta enable quick identification of problems in your dataset and provide actionable insights throughout the lifecycle of your data project. Beyond identification, Trifacta provides pattern profiling that flags anomalous patterns within each data type and suggests script transforms to rectify irregular or incongruent data. Trifacta enables you to automate this complete process.

 
How does Trifacta maintain high quality?

Trifacta delivers Adaptive Data Quality (ADQ) capabilities with Data Quality Rules, which prevent incorrect or dirty data from contaminating your data project recipes. These rules help you determine whether the current data is fit for use, and what additional data transformations are needed to correct the data. Trifacta constantly monitors all data including existing, updated, and new data to apply ADQ rules, to maintain high quality.

 
Why should I use Data Quality Rules?

An application or a report is only as good as the input data that it uses. Data Quality Rules provide an automated way to identify flaws in your data and build quality indicators to monitor its remediation. These rules are automatically updated to reflect any changes and take care of inadvertent errors caused by human intervention.

 
How do Data Quality Rules work?

Data Quality Rules are part of the Trifacta interface, where Trifacta automatically suggests a series of rules to validate various aspects of the quality of your data. Looking at the suggestions, you can accept, ignore, edit, or update these rules to ensure they are fit for your particular use case or dataset. You can also add your own rules using the Trifacta Wrangler language to build any additional validations you may need for your use case or application.
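
The general idea of a data quality rule, a named check that reports how much of the data passes, can be sketched in Python as follows. This is a generic illustration and not the Trifacta Wrangler language or the Trifacta interface; the `Rule` class, the rule names, and the sample rows are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when a row passes the rule

def evaluate_rules(rows, rules):
    """Return, for each rule, the fraction of rows that pass it."""
    results = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule.check(row))
        results[rule.name] = passed / len(rows) if rows else 1.0
    return results

# Hypothetical order records: one negative amount and one missing email.
rows = [
    {"order_id": 1, "amount": 25.0, "email": "a@example.com"},
    {"order_id": 2, "amount": -5.0, "email": "b@example.com"},
    {"order_id": 3, "amount": 12.5, "email": ""},
    {"order_id": 4, "amount": 40.0, "email": "d@example.com"},
]
rules = [
    Rule("amount_is_non_negative", lambda r: r["amount"] >= 0),
    Rule("email_is_present", lambda r: bool(r["email"])),
]
scores = evaluate_rules(rows, rules)
print(scores)
```

Each score works as a quality indicator: rerunning the same rules after a corrective transformation shows whether the share of passing rows has improved.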

 

Security

 
What security measures are implemented with Trifacta?

Trifacta is architected with data security as the highest priority. A secure connection is always maintained for your data between your source and target systems. Trifacta stores transformation logic in the form of metadata within an encrypted relational database and the actual data is not stored in Trifacta. Users can access only the data they have access to, based on user permissions defined by the administrator in your organization. Because Trifacta provides a single point of access to prepare and transform your data, you can establish a robust self-service analytics governance ecosystem.

 
Where is the data stored and processed?

The data is stored and processed within the cloud service provider. If you use Google Cloud, the data is stored in the Google Cloud project and does not persist within Trifacta. If you use other service providers such as AWS, the data is stored in your own Amazon S3 bucket within your AWS account. AWS EMR is used to execute data preparation jobs, but the data is not persisted in Trifacta’s AWS account.

 
How are user authentication and authorization managed?

On Google Cloud, user authentication is managed by Google Cloud IAM services. User passwords are never stored by Trifacta. Data authorization to other Google Cloud services such as Google Cloud Storage (GCS), BigQuery, or Google Sheets is managed by Google authorization services. 

For other cloud service providers such as AWS, user authentication is managed by Trifacta. However, Trifacta does not store any user account credentials including passwords. Authorization to files stored in Amazon S3 buckets is managed by the user’s IAM credentials.

 
Is the data encrypted when at rest or in motion?

Data at rest

Google Cloud: For data at rest, storage is managed by you and you control the level of encryption. Sample data, intermediate files, and job results are stored in your own Google Cloud Storage. The metadata is stored in a Google Cloud SQL database with AES-256 encryption.

Other service providers: Taking AWS as an example, data is encrypted using AWS KMS and the data storage is managed by Trifacta. Sample data, intermediate files, and job results are stored in your own Amazon S3 bucket, where you control the level of encryption. Alternatively, you can choose to have Trifacta store and manage sample data, intermediate files, and job results. In that case, they are stored in an Amazon S3 bucket owned by Trifacta and encrypted using Amazon S3’s SSE-KMS option, with a KMS key managed by Trifacta. The metadata is stored with AES-256 encryption.

Data in motion

Google Cloud: For data in motion, dataflow configuration is managed by you and you control the level of encryption. Browser communication is encrypted with Transport Layer Security (TLS) and all API communication between Google Cloud services is encrypted using TLS.

Other service providers: Taking AWS as an example, AWS EMR configuration is managed by Trifacta. Transit between Amazon S3 and the EMR cluster is encrypted using TLS. Trifacta doesn’t persist your data on the EMR cluster beyond the duration of the job. Browser communication is encrypted with Transport Layer Security (TLS) and all API communication between services is encrypted using TLS.

 
Is Trifacta certified with industry standard certifications?

Trifacta is SOC 2 Type II certified. We believe in the highest level of confidentiality for our customers and ensure security, performance, and reliability are maintained at all times.

Trifacta is also compliant with the General Data Protection Regulation (GDPR) guidelines and requirements regarding the collection, use, and retention of personal information.

Trifacta acknowledges the importance of protected health information (“PHI”) as defined in 45 CFR 160.103. The Trifacta Solution is designed so that Trifacta does not require any access to any PHI processed by the Customer using the Trifacta Solution and PHI is not stored within Trifacta’s environment. As a result, the parties do not anticipate that Trifacta will have any access to Customer PHI in the course of providing the Trifacta Solution. Trifacta is, nevertheless, willing to enter into a mutually agreed business associate agreement for the purposes of complying with the Health Insurance Portability and Accountability Act of 1996 (“HIPAA”), Public Law 104-191, the Health Information Technology for Economic and Clinical Health Act (the “HITECH Act”), Public Law 111-005, and the regulations promulgated thereunder.

 

Pricing

 
How does pricing work with Trifacta?

Trifacta offers a flexible pricing model based on two vectors: productivity and scale. You are charged based on the number of users and the amount of data processed. You can choose a monthly payment option or an annual payment option that offers a 20% discount on the monthly price. You can also choose from three editions that offer a range of capabilities depending on your business needs.
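
As a worked example of the 20% annual discount, here is a small Python calculation. The monthly price of 100 per user and the team of 5 users are hypothetical figures, not actual Trifacta prices, and the sketch covers only the per-user charge, not charges for the amount of data processed.

```python
def yearly_cost_monthly_plan(monthly_price, users):
    """Per-user charges over 12 months when paying month to month."""
    return round(monthly_price * 12 * users, 2)

def yearly_cost_annual_plan(monthly_price, users, discount=0.20):
    """Per-user charges over 12 months with the annual discount applied."""
    return round(monthly_price * (1 - discount) * 12 * users, 2)

# Hypothetical figures for illustration only.
monthly_price = 100.0
users = 5

monthly_total = yearly_cost_monthly_plan(monthly_price, users)
annual_total = yearly_cost_annual_plan(monthly_price, users)
savings = round(monthly_total - annual_total, 2)
print(monthly_total, annual_total, savings)
```

With these assumed figures, a year on the monthly plan comes to 6000, the annual plan comes to 4800, and the difference is 1200.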

 
What are the three pricing editions offered by Trifacta?

Trifacta offers three pricing editions: Starter, Professional, and Enterprise. With the Starter edition, you can get started with data engineering using basic capabilities including data transformation, connectivity to cloud data warehouses, data profiling, and sharing and collaboration. The Professional edition offers all these capabilities along with additional ones such as Adaptive Data Quality (ADQ) Rules, Universal Data Connectivity to a range of connectors, Automation and Orchestration, REST API endpoints, and a dedicated customer support representative. The Enterprise edition offers all of the capabilities from Starter and Professional, along with security controls such as fine-grained access control and single sign-on.

 
Which edition is offered during the Trifacta Free Trial?

During the 30-day free trial, Trifacta offers all the capabilities of the Professional edition.

 
Who is defined as a user when it comes to pricing with Trifacta?

Any individual who logs in to Trifacta with a unique email address is counted as a user of Trifacta.

 
Can I change my pricing edition?

Yes, you can upgrade or downgrade to a different edition at any time. If you choose to upgrade your edition, you’ll pay a prorated amount for that edition for the rest of the month from the time you upgrade. If you choose to downgrade your edition, you’ll be credited with the remaining amount on your bill in the following month.
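
A prorated upgrade charge could be estimated along the following lines. This is only a sketch under stated assumptions: that proration is calculated by calendar day, that the upgrade day itself is billed, and that the new edition costs a hypothetical 300 per month. The actual billing calculation may differ.

```python
import calendar
from datetime import date

def prorated_upgrade_charge(new_monthly_price, upgrade_date):
    """Charge for the new edition from the upgrade date to the end of the month.

    Assumes day-based proration that includes the upgrade day itself.
    """
    days_in_month = calendar.monthrange(upgrade_date.year, upgrade_date.month)[1]
    remaining_days = days_in_month - upgrade_date.day + 1
    return round(new_monthly_price * remaining_days / days_in_month, 2)

# Hypothetical: upgrading on 16 April to an edition priced at 300 per month.
charge = prorated_upgrade_charge(300.0, date(2024, 4, 16))
print(charge)
```

April has 30 days, so an upgrade on the 16th leaves 15 billable days, and the estimated charge is half the monthly price.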

 
Can I cancel my subscription to Trifacta?

You may choose to cancel at any time. Cancelation will take effect at the end of your current billing cycle.

 
Can I use my credit card for payment?

Yes, Trifacta offers self-service credit card payments.

 
How can I estimate my consumption charges outside the per-user charges?

During the free trial, you can connect and prepare your data similar to your production environment. The workspace usage page can help you with a display of your consumption, which you can use to estimate your consumption charges. We are working on a calculator that will help you with a more accurate estimate of your charges. The pricing page will include this calculator.