LinkedIn has one of the largest Hadoop clusters on the planet, giving them the potential to build new products to increase monetization. Their products, advertisements and features such as “People You May Know” and “How You Rank” are data-driven and rely on huge quantities of usage data.


  • LinkedIn compressed all usage data with the Avro serialization system.
  • There needed to be a more efficient way to prepare Avro in order to get workable data in the hands of analysts faster.

Solution with Trifacta

  • LinkedIn’s ETL developers and analysts can now quickly unbox and prepare data, allowing them to productively work with Avro data.
  • More analysts can work with diverse LinkedIn network data in Hadoop to experiment manipulating the data to create new products and services.

Company Background

Founded in 2003, LinkedIn connects the world’s professionals to make them more productive and successful. With over 300 million members worldwide, including executives from every Fortune 500 company, LinkedIn is the world’s largest professional network on the Internet. The company has a diversified business model with revenue coming from Talent Solutions, Marketing Solutions and Premium Subscriptions products. Headquartered in Silicon Valley, LinkedIn has offices across the globe.