Sanofi launched Project Data Sphere to share, integrate and analyze the collective body of cancer research in a consolidated data hub. This brings together researchers and clinicians to leverage new sources of data in order to explore better cancer treatment options.


  • Sanofi needs siloed data to be curated and shared across multiple business entities for transparency, improved decision-making, and a more universal view of the health care consumer.
  • Healthcare data, including clinical, biomarker, and commercial consumer (OTC drug) data, is messy and noisy, which requires individuals with medical backgrounds to make correlations.

Solution with Trifacta

  • Trifacta enables Sanofi’s non-programming scientists and data analysts to clean and curate numerous complex data sets from a variety of regions internationally.
  • Data curation is now accelerated: standardization work that took the Project Data Sphere (PDS) team 2 years now takes only hours.

Company Background

Sanofi, a global healthcare leader, discovers, develops and distributes therapeutic solutions focused on patients’ needs. Sanofi has core strengths in diabetes solutions, human vaccines, innovative drugs, consumer healthcare, emerging markets, animal health and Genzyme.

“One of the reasons we’ve partnered closely with Trifacta is they’ve proven time and again that they have the internal expertise to work with us on our data wrangling, and if it’s in a new domain, they’re more than willing to learn.”