Data Ingestion


Data ingestion is the process of transporting data from its original source to a data storage medium, such as a data lake, data mart, or data warehouse. Ingested data can come from a wide variety of sources, including clickstreams, spreadsheets, sensors, APIs, and other databases. This source data is ingested into storage to allow for further processing, use, and analysis. Data ingestion implies that data is being brought in from an outside source, which means it differs slightly from data integration, where the sources are all internal.
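As a rough illustration of the landing step, the Python sketch below pulls records from a source API and writes them, untouched, into a data lake directory. The endpoint and landing path are hypothetical stand-ins, not part of any particular platform:

```python
import json
import pathlib
import urllib.request
from datetime import datetime, timezone

API_URL = "https://api.example.com/v1/events"   # hypothetical source endpoint
LAKE_DIR = pathlib.Path("datalake/raw/events")  # hypothetical landing zone

def ingest_once() -> pathlib.Path:
    """Pull one payload from the source API and land it, as-is, in the lake."""
    with urllib.request.urlopen(API_URL) as resp:
        records = json.load(resp)

    # Raw data is landed unmodified; cleaning and transformation happen downstream.
    LAKE_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_path = LAKE_DIR / f"events_{stamp}.json"
    out_path.write_text(json.dumps(records))
    return out_path

if __name__ == "__main__":
    print(f"Landed raw payload at {ingest_once()}")
```

Landing raw payloads with a timestamp in the filename keeps each ingestion run distinct, so downstream cleaning jobs can reprocess any run without losing the original source data.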

Data can be ingested in large bundles at one time through batch processing, or individual records can be streamed into data storage as soon as they become available through stream processing. Because data arrives from a variety of sources in a variety of formats, it must almost always be cleaned and transformed before it can be used for analysis or machine learning, whether it is ingested in batches or as a stream.
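To make the batch-versus-stream distinction concrete, here is a minimal simulation in Python. The in-memory storage list and toy clickstream events stand in for a real storage medium and a real source; a production pipeline would write to a lake, warehouse, or message queue instead:

```python
import json
from typing import Iterable

def ingest_batch(records: list[dict], sink: list[dict]) -> None:
    """Batch ingestion: the whole bundle is loaded into storage in one pass."""
    sink.extend(records)

def ingest_stream(source: Iterable[dict], sink: list[dict]) -> None:
    """Stream ingestion: each record is written as soon as it becomes available."""
    for record in source:
        sink.append(record)  # in practice, a write to a queue, lake, or warehouse

# Toy clickstream events standing in for a real source.
events = [{"id": i, "click": f"/page/{i}"} for i in range(5)]
storage: list[dict] = []

ingest_batch(events, storage)         # one bulk load of the full bundle
storage.clear()
ingest_stream(iter(events), storage)  # record-by-record as data "arrives"
print(json.dumps(storage, indent=2))
```

Both paths end with the same records in storage; the difference is latency and granularity, which is why the cleaning and transformation step applies equally to either style.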

How Trifacta Helps with Data Ingestion

Trifacta’s data engineering cloud platform significantly reduces the time, technical skills, and costs required to build and automate the data pipelines that data ingestion feeds. Using Trifacta, data teams can dramatically speed up the process of cleaning and transforming ingested data, either as an automated part of the ingestion process or for ad hoc analysis at a later date.

Explore Trifacta Today