Data Integration and Data Integration Techniques

Data integration is the process of gathering data from multiple sources and combining it into a single, unified view. The goal is to consolidate the data so that it can be accessed and delivered consistently. Because integration is often part of data preparation, it can also include data cleansing, mapping, and transformation.
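As a rough illustration of combining sources into one view, the sketch below integrates two made-up record sets. The field names, the sample values, and the helpers `clean_crm`, `clean_billing`, and `integrate` are all assumptions invented for this example. It is one simple way to show cleansing, mapping, and transformation together, not a standard implementation.

```python
# Hypothetical records from two separate sources with different schemas.
crm_rows = [
    {"CustomerID": "001", "FullName": " Ada Lovelace ", "Email": "ADA@EXAMPLE.COM"},
    {"CustomerID": "002", "FullName": "Alan Turing", "Email": "alan@example.com"},
]
billing_rows = [
    {"cust_id": 1, "balance": "120.50"},
    {"cust_id": 2, "balance": "0"},
    {"cust_id": 3, "balance": "15.00"},
]


def clean_crm(row):
    """Map CRM field names to a shared schema and cleanse the values."""
    return {
        "customer_id": int(row["CustomerID"]),  # mapping + type transformation
        "name": row["FullName"].strip(),        # cleansing: trim whitespace
        "email": row["Email"].strip().lower(),  # cleansing: normalize case
    }


def clean_billing(row):
    """Map billing field names to the shared schema and convert types."""
    return {"customer_id": int(row["cust_id"]), "balance": float(row["balance"])}


def integrate(crm, billing):
    """Combine both sources into one view keyed by customer_id."""
    view = {}
    for row in map(clean_crm, crm):
        view[row["customer_id"]] = {**row, "balance": None}
    for row in map(clean_billing, billing):
        # Customers known only to billing still get a record in the view.
        record = view.setdefault(
            row["customer_id"],
            {"customer_id": row["customer_id"], "name": None, "email": None, "balance": None},
        )
        record["balance"] = row["balance"]
    return [view[key] for key in sorted(view)]


unified = integrate(crm_rows, billing_rows)
for record in unified:
    print(record)
```

Customer 3 appears only in the billing source, so its name and email stay empty in the unified view rather than being dropped. Whether to keep or discard such partial records is a design choice that depends on the use case.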

There are many techniques data analysts use to integrate data from multiple sources. These are some of the most prominent techniques used in data integration:

Data replication. Data from one dataset is copied to one or more other datasets so that the copies stay synchronized and can also serve as a backup.
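To make the idea of replication concrete, here is a minimal sketch that uses in-memory dictionaries as stand-ins for real data stores. The `ReplicatedStore` class and its methods are hypothetical names made up for this example. Real systems usually replicate asynchronously over a network and must handle failures, which this toy version ignores.

```python
import copy


class ReplicatedStore:
    """Toy primary store that pushes every write to its replicas."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:
            # Deep copy so each replica holds its own independent copy.
            replica[key] = copy.deepcopy(value)

    def delete(self, key):
        self.primary.pop(key, None)
        for replica in self.replicas:
            replica.pop(key, None)

    def restore_primary_from(self, index):
        """Rebuild the primary from a replica, as a backup would be used."""
        self.primary = copy.deepcopy(self.replicas[index])


store = ReplicatedStore(replica_count=2)
store.write("order:1", {"status": "paid"})
store.write("order:2", {"status": "open"})
store.delete("order:2")

store.primary.clear()          # simulate losing the primary copy
store.restore_primary_from(0)  # recover from the first replica
print(store.primary)
```

Because every write and delete is applied to all copies, the replicas stay synchronized with the primary and any one of them can be used to restore it.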

Data virtualization. Data from multiple datasets is presented through a single, unified virtual view without being physically copied. This technique makes it unnecessary to load the data into a new repository.
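The following sketch shows the core idea behind virtualization under simplifying assumptions: the sources are plain Python lists, and the `VirtualView` class is a hypothetical helper written for this example. The view stores no rows of its own and reads from the sources each time it is queried.

```python
class VirtualView:
    """Unified read-only view that queries its sources on demand."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        # `fetch` is a zero-argument callable returning an iterable of dict rows.
        self._sources[name] = fetch

    def query(self, predicate=lambda row: True):
        # Rows are pulled from each source at query time; nothing is stored here.
        for name, fetch in self._sources.items():
            for row in fetch():
                if predicate(row):
                    yield {**row, "source": name}


# Hypothetical source datasets that stay where they are.
warehouse_eu = [{"sku": "A1", "stock": 4}]
warehouse_us = [{"sku": "A1", "stock": 9}, {"sku": "B2", "stock": 0}]

view = VirtualView()
view.register("eu", lambda: warehouse_eu)
view.register("us", lambda: warehouse_us)

in_stock = list(view.query(lambda row: row["stock"] > 0))

# A change in a source is visible on the next query because no copy was made.
warehouse_eu.append({"sku": "C3", "stock": 7})
in_stock_after = list(view.query(lambda row: row["stock"] > 0))
print(len(in_stock), len(in_stock_after))
```

The trade-off is that every query touches the underlying sources, so the view is always current but its performance depends on how quickly those sources respond.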

Streaming data integration. Multiple data streams are integrated continuously into a data analytics system as the data arrives, instead of being loaded in batches afterwards. This is a real-time technique.
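As a small sketch of streaming integration, the example below merges two simulated event streams and updates a running aggregate as each event arrives. The generators `clicks` and `purchases`, the event fields, and `integrate_streams` are assumptions made for illustration. It also assumes each stream is already ordered by timestamp, which `heapq.merge` requires. A production system would typically read from a message broker rather than from Python generators.

```python
import heapq
from collections import defaultdict


def clicks():
    """Simulated click stream, ordered by timestamp."""
    yield {"ts": 1, "user": "u1", "kind": "click"}
    yield {"ts": 4, "user": "u2", "kind": "click"}


def purchases():
    """Simulated purchase stream, ordered by timestamp."""
    yield {"ts": 2, "user": "u1", "kind": "purchase"}
    yield {"ts": 3, "user": "u1", "kind": "purchase"}


def integrate_streams(*streams):
    """Merge time-ordered streams lazily and yield running counts per event."""
    counts = defaultdict(int)
    # heapq.merge consumes the streams one event at a time, in timestamp order.
    for event in heapq.merge(*streams, key=lambda e: e["ts"]):
        counts[(event["user"], event["kind"])] += 1
        # Emit a snapshot immediately instead of waiting for the streams to end.
        yield event["ts"], dict(counts)


snapshots = list(integrate_streams(clicks(), purchases()))
for ts, counts in snapshots:
    print(ts, counts)
```

Each snapshot is available as soon as its event is processed, which is what distinguishes this approach from collecting all events first and loading them in a batch.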