Variance (Model Fit)
Main Concept
Variance measures how much a model’s predictions change when it is trained on
different datasets drawn from the same distribution. A high-variance model is
overly sensitive to the specific data it was trained on.
Key Aspects
High Variance = Overfitting
- The model is very sensitive to changes in training data.
- It performs well on training data but poorly on unseen test data.
- It has memorized the training data (including its noise) instead of learning the general pattern.
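The train/test gap described above can be seen in a small experiment. This is a minimal sketch, not a reference implementation: the sine-shaped ground truth, the noise level, the sample sizes, and the polynomial degrees are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of a hypothetical ground truth y = sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # unseen data from the same distribution

results = {}
for degree in (3, 10):              # modest model vs. very flexible model
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = mse(coeffs, x_train, y_train)
    test_err = mse(coeffs, x_test, y_test)
    results[degree] = (train_err, test_err)
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

A large gap between training and test error for the flexible model is the typical signature of high variance.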
**The target diagram**
- The center (orange) = the truth (the correct prediction).
- The red dots = the model’s predictions.
- High variance means predictions are scattered all around the center —
the model is inconsistent, sometimes close, sometimes far. It reacts
differently depending on which data it saw.

Contrast with high bias:
- High bias = predictions clustered together but far from center (consistently wrong).
- High variance = predictions scattered around the center (inconsistently wrong).
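The "clustered but off-center" versus "scattered" picture can be approximated numerically by training many models on fresh samples and looking at their predictions at one input. The sketch below assumes a hypothetical truth `true_f`, a query point `x0`, and polynomial models of degree 0 (very simple) and degree 9 (very flexible); none of these come from the notes themselves.

```python
import numpy as np

rng = np.random.default_rng(42)

def true_f(x):
    """Hypothetical ground truth (the center of the target)."""
    return np.sin(2 * np.pi * x)

def predictions_at(x0, degree, n_models=200, n_points=12, noise=0.3):
    """Fit one polynomial per freshly drawn training set; collect predictions at x0."""
    preds = np.empty(n_models)
    for i in range(n_models):
        x = rng.uniform(0.0, 1.0, n_points)
        y = true_f(x) + rng.normal(0.0, noise, n_points)
        preds[i] = np.polyval(np.polyfit(x, y, degree), x0)
    return preds

x0 = 0.25  # query point; true_f(0.25) is 1.0
summary = {}
for degree in (0, 9):
    p = predictions_at(x0, degree)
    bias = float(p.mean() - true_f(x0))   # how far the average prediction is from the truth
    variance = float(p.var())             # how much predictions scatter across training sets
    summary[degree] = (bias, variance)
    print(f"degree={degree}  bias={bias:+.3f}  variance={variance:.3f}")
```

Under these assumptions, the degree-0 model plays the role of the tight cluster far from the center, while the degree-9 model plays the role of the scattered dots.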
How to reduce variance
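- Feature selection: use fewer, more relevant features so the model has less room to fit noise.
- Cross-validation: split the data into training and validation sets multiple
times to check that the model generalizes. This detects high variance rather than removing it by itself.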
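The repeated train/validation splitting can be sketched with plain NumPy as follows. The helper name `k_fold_mse`, the polynomial models, and the synthetic data are assumptions for illustration; in practice a library routine would normally be used instead.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Mean and spread of held-out MSE for a polynomial fit across k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))      # shuffle once, then cut into k folds
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        val_idx = folds[i]                 # fold i is held out for validation
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
        err = np.polyval(coeffs, x[val_idx]) - y[val_idx]
        scores.append(np.mean(err ** 2))
    return float(np.mean(scores)), float(np.std(scores))

# Synthetic data: hypothetical sine-shaped truth plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 40)

cv = {d: k_fold_mse(x, y, d) for d in (1, 3, 9)}
for d, (mean_mse, std_mse) in cv.items():
    print(f"degree={d}  CV MSE={mean_mse:.3f} (std across folds {std_mse:.3f})")
```

A held-out error that is much worse than the training error, or that swings widely from fold to fold, suggests the model is too sensitive to which data it saw.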
Bias vs Variance — The Core Tradeoff
This is one of the most fundamental tradeoffs in ML:
- More complex model → lower bias, higher variance (overfitting risk)
- Simpler model → higher bias, lower variance (underfitting risk)
- Goal → find the sweet spot (balanced fit)
You usually can’t minimize both at once: for a fixed amount of training data,
reducing one tends to increase the other.
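For squared-error loss, the tradeoff is often written as a decomposition of the expected prediction error. The notation here is an assumption, not taken from these notes: $f$ is the true function, $\hat{f}$ is the model fitted on a randomly drawn training set, $y = f(x) + \varepsilon$ with noise variance $\sigma^2$, and the expectation is over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model, which is why the balance between them is what model selection tries to tune.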


Exam Domain
- Domain 1, Task Statement 1.1: basic AI terms including “bias” and “fit.”
- Domain 4, Task Statement 4.1: “understand effects of bias and variance
(for example, effects on demographic groups, inaccuracy, overfitting,
underfitting).”
Exam Scenarios to Recognize
| Scenario | Diagnosis |
|---|---|
| Performs well in training, poorly in production | High variance / overfitting |
| Performs poorly even on training data | High bias / underfitting |
| Sensitive to small changes in training data | High variance |
| Consistently predicts the wrong value | High bias |
Related Notes
- Bias
- Model Fit
- Feature Engineering
- Responsible AI / AI Bias