For those who work with data regularly, the problem of “Data Wrangling” can be one of the most frustrating aspects of performing analysis. My first exposure to the real pain associated with data wrangling was my work in quantitative research at Citadel Investment Group.
My work revolved around data, but much of the data relevant to our analysis was not in the format or structure required by my analytic tools. I began creating a library of “transformation” scripts to prepare data for the models and tools I was using at the time. Eventually, other analysts took notice and started coming to me for these scripts. More often, they would entice me to write a script for the specific data they needed. Even simply visualizing the data was challenging because of the raw, unstructured format of the data sets. After a few months of this, I realized that the problem of “data wrangling” went well beyond the specific work I was doing and was a widespread problem worthy of further investigation.
This experience led me to graduate school at Stanford, where I drove the development of Data Wrangler, which formed the foundation of what we’re now doing at Trifacta. Throughout our research at Stanford and Berkeley, and now in our conversations with customers at Trifacta, the downstream tool we hear about most often is Tableau.
My history with Tableau, and watching them first-hand become a leader in democratizing the use of data, makes today’s partnership announcement especially exciting. With the release of Trifacta 1.5, we have added the ability to write the output of Trifacta data transformations directly in the Tableau Data Extract format, which will enable Tableau users to “wrangle” an entirely new set of raw data sources that were previously not fit for analysis in Tableau.
In addition to integrating natively with Tableau Data Extracts, Trifacta 1.5 also delivers native integration with Hadoop’s HCatalog. Tableau users will be able to point their Tableau implementations directly to the output of Trifacta in HCatalog, delivering a more accessible, interactive experience with big data in Tableau.
To see the integration we’re announcing today, check out how someone would use Trifacta to wrangle data for Tableau:
Demo of Using Trifacta & Tableau