Data drift is a change in the structure or meaning of data over time that can cause machine learning (ML) models to degrade or break. It is common when ML models describe dynamic, continually changing circumstances or environments. For example, an ML model could be trained to identify reckless driving using a baseline dataset captured when the speed limit in an area was 65 mph. If the speed limit in that same area is later raised to 80 mph, the operating assumptions and the meaning of individual data points change, which invalidates the existing ML model. In short, data drift occurs when the meaning of data differs between the baseline dataset on which a model was trained and the current real-time data inputs.
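One way to make the speed-limit example concrete is a simple mean-shift check in Python. This is a minimal sketch, not a formal statistical test: the function names `drift_score` and `has_drifted`, the threshold of 2.0, and all speed values are hypothetical and chosen only for illustration. It flags drift when the mean of current inputs moves more than a chosen number of baseline standard deviations away from the baseline mean. A shift in distribution is only a proxy for a change in meaning, so a flag like this is a prompt to investigate rather than proof that a model is invalid.

```python
from statistics import mean, stdev


def drift_score(baseline, current):
    """Return the shift in mean, measured in baseline standard deviations."""
    base_sd = stdev(baseline)  # sample standard deviation of the baseline
    if base_sd == 0:
        raise ValueError("baseline has no variance; score is undefined")
    return abs(mean(current) - mean(baseline)) / base_sd


def has_drifted(baseline, current, threshold=2.0):
    """Flag drift when the score exceeds the (assumed) threshold."""
    return drift_score(baseline, current) > threshold


# Hypothetical observed speeds in mph, invented for this example.
baseline_speeds = [62, 64, 65, 66, 63, 67, 65, 64, 66, 68]     # limit was 65 mph
current_speeds = [78, 80, 79, 82, 77, 81, 80, 79, 83, 78]      # limit raised to 80 mph
same_regime_speeds = [63, 65, 66, 64, 67, 65, 62, 66, 64, 65]  # limit still 65 mph

print(has_drifted(baseline_speeds, current_speeds))      # speeds after the change
print(has_drifted(baseline_speeds, same_regime_speeds))  # speeds under the old limit
```

In this sketch the speeds recorded after the limit change are flagged, while a fresh sample taken under the original limit is not. In practice the threshold would need tuning per feature, and a distribution-level test could replace the mean comparison.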
