January Legends: Corey & Jill
This month, we have not one but two Trifacta legends. Corey Leadbeater and Jill Somkul are colleagues at Accurate Testing & Inspection, a utility infrastructure firm in Southern California.
The GOAT of Data Prep: Trifacta Named 2020 Gartner Peer Insights Customers’ Choice Award for Data Preparation Tools
Big news to share: we’ve been named a Customers’ Choice in the December 2020 Gartner Peer Insights ‘Voice of the Customer’: Data Preparation Tools! Much like the G2 Grid or the Dresner Advisory Services report, the Gartner Peer Insights Customers’ Choice Awards are based upon the reviews of real technology users. We’re honored to continue receiving […]
The Hub and Spoke Model: How Trifacta Uses Trifacta for Product Analytics
This blog is a collaboration between Trifacta’s Sr. Product Manager, Cesar Jardim, covering the Hub, and Director of SaaS Adoption & Enablement, Connor Carreras, covering the Spokes. If you are trying to manage data operations on your team, the hub and spoke model may be the right way to go. First introduced as a transportation […]
Flow Examples: A Little Help to Start Wrangling
Since its inception, Trifacta has been driven to simplify data preparation, or all of the painstaking steps required to clean and prepare data for analytics. And we’re pretty proud of what we’ve come up with: an unparalleled data wrangling experience that leverages machine learning, visualization, and automation to accelerate data preparation for any data-driven initiative. […]
The Different Approaches to “T” in ELT and What’s Required to Drive Mass Adoption
Much has been written about the shift from ETL to ELT and how ELT enables superior speed and agility for modern analytics. One important move to support this speed and agility is creating a workflow that enables data transformation to be exploratory and iterative. Preparing data for analysis requires an iterative loop of forming and […]
3 Data-led Companies, 3 Data Warehouses: Why They All Chose Trifacta for Data Preparation
It’s safe to say that 2020 showed us that data warehouses have not only found a new home in the cloud but have cemented their position as the foundation of every organization’s data strategy moving forward. By 2022, Gartner predicts the overwhelming majority of all databases (75%) will be deployed or migrated to a cloud […]
Wrangle Summit Sneak Peek – The First Industry Event Focused on Data Engineering
Exponential growth, coupled with a whirlwind of change – this is how I would describe the past five years of the data and analytics industry. At the platform level, it seems like only yesterday Big Data was at its peak and we were watching many of the major platform providers go public. Now, it’s undeniable […]
December Legend: Alex Hardman
Alex is an avid Trifacta user and is championing Trifacta at Rezco Asset Management, which was established in 1981 in South Africa with a deliberate focus on preserving capital and creating wealth.
Setting Up Data Quality Monitoring For Cloud Dataprep Pipelines
Build a simple, flexible, yet comprehensive Data Quality monitoring solution for your Google Cloud Dataprep by Trifacta pipelines with Cloud Functions, BigQuery and Data Studio. Building a Data Quality Dashboard: Building a modern data stack to manage analytic pipelines, such as Google Cloud and a BigQuery data warehouse or data lake, has many benefits. One such benefit […]
What Is ETL? ETL vs. ELT vs. Data Wrangling in the Cloud
Is ETL dead? Did ELT take over or is something new taking its place? It’s a question that has come up a lot in recent years as organizations modernize their analytics infrastructure. Huge shifts are afoot in the analytics landscape and it isn’t always clear where this change leaves ETL. The short answer? No, ETL […]
Monitoring Data Quality Trends with Cloud Dataprep and Data Studio
Automatic data quality assessment is a Trifacta user favorite. Who wouldn’t want to give their eyes a rest from combing through data while Trifacta automatically points out possible data flaws? The feature is particularly useful when onboarding or integrating unfamiliar data. With unfamiliar data, it’s not only difficult to tell what errors might be lurking […]
What Is a Data Stack and How Does It Impact Analytics?
We hear a lot about organizations undergoing “data modernization” in order to become more data-driven. Essentially what that means is that these organizations have recognized that legacy data tools aren’t very good at solving modern data problems. They’re in the process of moving data out of legacy mainframe databases and, at the same time, replacing […]
Google Sheets: Data Validation Tips & Tricks
Google Sheets is one of the most widely used spreadsheet tools. Still, many of its best features go undiscovered. Let’s take a closer look at how to do data validation in Google Sheets, which is commonly used to build drop-down lists. Why data validation matters: Data validation is like the analytic version of copyediting. As much […]
Orchestrate Your Data Pipelines on Trifacta Using Plans
Why create a plan? The short answer is to operationalize and automate your data pipelines on Trifacta.
Easily Publish to Data Warehouses with New Rename Functions in Trifacta
Chances are you’re having to work with several different databases and data warehouses in your analytics stack. It just is what it is today. In order to get an accurate picture in your reporting, you have to use everything. However, working with these different databases can be like, well, this: When publishing tables in different […]
4 Key Steps for a Data Sanity Check
As a Customer Success Manager at Trifacta, I spend most of my time helping our customers wrangle their raw, big data into business insights. On these data wrangling projects, it’s tempting to jump straight into the most interesting problems, but to produce the most accurate results, we should start by performing a set of basic […]
Structured vs. Unstructured Data: What’s the Difference?
Structured and unstructured data are both used extensively in data analysis but operate quite differently. Let’s take a closer look at these two data formats to understand just how different structured data and unstructured data are. Structured data vs. unstructured data: Searchability is often used to differentiate between structured and unstructured data. Structured data typically […]
How to Automatically Deploy a Google Cloud Dataprep Pipeline Between Workspaces
This article explains how to use Cloud Composer to automate Cloud Dataprep flow migration between two workspaces. This process can be leveraged for your Cloud Data Warehouse project to move between development, test, and production, following what is known as a Continuous Integration and Continuous Delivery (CI/CD) pipeline in agile development. At a high level, this […]