Main Concept
Model drift is the gradual degradation of a model’s performance over time as the real world changes but the model stays the same. A model that was accurate when deployed can become unreliable months or years later — not because something broke, but because the world it was trained on no longer matches the world it is predicting.
Context
After deployment, a model is essentially frozen — it keeps making predictions based on patterns it learned from historical training data. But the real world evolves: people’s behaviors change, markets shift, trends come and go. When the gap between the training data and current reality grows large enough, the model’s predictions become unreliable.
Key Idea
Model drift → the world changed, but the model did not.
Result → predictions that were accurate become unreliable over time.
Solution → monitor the model continuously and retrain periodically with fresh data.
Analogy: A travel guide from 10 years ago
Imagine using a travel guide written ten years ago to visit a city today. The map is mostly right, but some restaurants have closed, new neighborhoods have opened, and the prices are badly out of date. The guide was accurate when written; it just did not age well.
A model is the same: trained on data from the past, gradually becoming less accurate as the present diverges from that past.
Two Types of Drift
Data Drift (Covariate Shift)
The statistical distribution of the input data changes over time — the inputs the model receives in production start looking different from what it was trained on.
Example
A fraud detection model was trained on data from 2020. By 2024, fraudsters use completely different techniques — the patterns in new transactions look nothing like the training data. The model never saw these patterns, so it misses them.
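One common way to put a number on data drift for a single numeric feature is the Population Stability Index (PSI), which compares the binned distribution of production values against the training baseline. The sketch below is a minimal standard-library illustration, not a description of how any AWS service computes drift. The feature name `training_amounts`, the synthetic Gaussian data, the bin count, and the 0.1 and 0.25 rule-of-thumb cut-offs are all assumptions made for the example.

```python
import math
import random


def population_stability_index(baseline, current, bins=10):
    """PSI between two numeric samples, using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    base_frac = bin_fractions(baseline)
    cur_frac = bin_fractions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base_frac, cur_frac))


# Synthetic data (assumption): transaction amounts at training time vs. in production.
rng = random.Random(0)
training_amounts = [rng.gauss(50, 10) for _ in range(5000)]
production_same = [rng.gauss(50, 10) for _ in range(5000)]     # same distribution
production_shifted = [rng.gauss(70, 10) for _ in range(5000)]  # mean has moved

psi_stable = population_stability_index(training_amounts, production_same)
psi_drifted = population_stability_index(training_amounts, production_shifted)

print(f"PSI, unchanged inputs: {psi_stable:.4f}")
print(f"PSI, shifted inputs:   {psi_drifted:.4f}")
```

A PSI close to zero suggests the production inputs still resemble the training data. Values above roughly 0.25 are often treated as a significant shift, although the right threshold depends on the feature and the use case.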
Concept Drift
The relationship between the input features and the correct output changes over time — the meaning of the data shifts even if the data itself looks similar.
Example
A clothing recommendation model was trained on 2015 fashion trends. In 2025, what counts as “stylish” has completely changed. The inputs (clothing attributes) look the same, but what they mean in terms of preference has shifted.
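Because concept drift can occur while the inputs look unchanged, it usually cannot be spotted from input statistics alone. It shows up once true outcomes are compared with predictions. The sketch below tracks accuracy over a rolling window of labeled feedback and flags a drop below a baseline. The class name `RollingAccuracyMonitor`, the toy `model` rule, the window size, and the tolerance are hypothetical choices for illustration.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, baseline_accuracy, window=200, tolerance=0.10):
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # True where the prediction matched the outcome

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drift_suspected(self):
        # Only judge once the window is full, to avoid noisy early alarms.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.baseline_accuracy - self.tolerance


def model(x):
    """Frozen toy model (assumption): learned that a score above 0.5 means 'stylish'."""
    return x > 0.5


inputs = [i / 100 for i in range(100)]  # the same input values in both periods
monitor = RollingAccuracyMonitor(baseline_accuracy=1.0, window=100, tolerance=0.10)

# Period 1: the real-world rule still matches what the model learned.
for x in inputs:
    monitor.record(model(x), x > 0.5)
flag_before = monitor.drift_suspected()

# Period 2: identical inputs, but the real-world rule has flipped.
for x in inputs:
    monitor.record(model(x), x < 0.5)
flag_after = monitor.drift_suspected()

print(flag_before, flag_after, monitor.accuracy())
```

The input values are identical in both periods, so an input-distribution check would report nothing. Only the comparison against true outcomes reveals the change.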
Why It Matters
Key Idea
Without monitoring → drift goes undetected until users notice bad predictions.
With monitoring → drift is caught early and the model is retrained before users are affected.
Retraining loop → recent production data, paired with verified ground-truth outcomes, is fed back into the training data to keep the model current.
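The monitor-then-retrain loop can be sketched as a small decision step followed by a data refresh. This is a simplified illustration under stated assumptions: the names `ProductionRecord`, `retraining_reasons`, and `extend_training_set` are hypothetical, the thresholds are arbitrary, and "feedback" is taken to mean production records whose real outcome has since been verified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductionRecord:
    features: dict
    prediction: str
    true_label: Optional[str] = None  # filled in later, once the real outcome is known


def retraining_reasons(drift_score, live_accuracy, baseline_accuracy,
                       drift_limit=0.25, max_accuracy_drop=0.05):
    """Return the list of reasons to retrain (an empty list means no action)."""
    reasons = []
    if drift_score > drift_limit:
        reasons.append("input drift")
    if baseline_accuracy - live_accuracy > max_accuracy_drop:
        reasons.append("accuracy drop")
    return reasons


def extend_training_set(training_set, production_records):
    """Add only records with a verified outcome, using the true label rather than the prediction."""
    fresh = [(r.features, r.true_label)
             for r in production_records if r.true_label is not None]
    return training_set + fresh


training_set = [({"amount": 20}, "legit"), ({"amount": 900}, "fraud")]
production = [
    ProductionRecord({"amount": 35}, prediction="legit", true_label="legit"),
    ProductionRecord({"amount": 40}, prediction="legit", true_label="fraud"),  # model was wrong
    ProductionRecord({"amount": 55}, prediction="legit"),                      # outcome not known yet
]

reasons = retraining_reasons(drift_score=0.31, live_accuracy=0.80, baseline_accuracy=0.92)
if reasons:
    training_set = extend_training_set(training_set, production)

print(reasons, len(training_set))
```

Records the model got wrong are kept along with the ones it got right, since those are the cases a retrained model most needs to learn from. Records without a verified outcome are left out.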
AWS Services for Detecting Drift
| Service | Role |
|---|---|
| Amazon SageMaker Model Monitor | Continuously monitors deployed models for data quality issues and drift |
| Amazon CloudWatch | Tracks operational metrics and triggers alarms when performance degrades |
| Amazon SageMaker Clarify | Detects bias in data and model predictions; works with Model Monitor to track bias drift over time |
Exam Scope
Model drift is not explicitly named in the exam guide, but the concepts behind it are directly tested under monitoring and MLOps. Expect scenario questions about why a previously accurate model is now making poor predictions — the answer is drift, and the solution is monitoring plus retraining.
Exam Domain
- Domain 1, Task Statement 1.3: “Understand fundamental concepts of MLOps (for example, model monitoring, model re-training).”
- Domain 1, Task Statement 1.3: “Identify relevant AWS services for each stage of an ML pipeline (for example, SageMaker Model Monitor).”