Main Concept
Hyperparameters are settings that control HOW the model learns, not WHAT it learns. They are set by the engineer before training begins and are not learned from the data during training.
Hyperparameter tuning is the process of finding the best combination of these settings to get the best model performance.
The Key Distinction
Key Idea
Model parameters: WHAT the model learns (weights, biases). Learned automatically from data during training. You never touch these.
Hyperparameters: HOW the model learns. Set manually before training begins. You tune these.
Analogy: Baking a cake
Hyperparameters = the oven settings you choose before baking (temperature, time, rack position).
Model parameters = how the cake actually turns out (texture, density, flavor).
You control the oven settings. The oven (training process) determines the result.
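The distinction can be made concrete with a minimal sketch. The toy data, the variable names, and the chosen values below are illustrative assumptions, not part of any exam material: `learning_rate` and `epochs` are fixed by hand up front, while the weight `w` is only ever changed by the training loop.

```python
# Hyperparameters: chosen by the engineer before training starts.
learning_rate = 0.1
epochs = 50

# Toy training data generated by y = 2 * x (values are illustrative).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# Model parameter: starts at an arbitrary value and is learned from the data.
w = 0.0

for _ in range(epochs):
    # Gradient of the mean squared error for the model y_hat = w * x.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # The training loop updates w; the engineer never sets it directly.
    w -= learning_rate * grad

print(round(w, 3))
```

After training, `w` ends up very close to 2, the slope hidden in the data, even though nobody typed that value in.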
The Four Key Hyperparameters
Learning Rate
Controls how big or small the adjustments to weights are after each training step.
Key Idea
High learning rate: fast, but can overshoot the optimal solution.
Low learning rate: precise, but slow to converge.
Right learning rate: fast enough to converge, precise enough to find the optimal solution.
Analogy: Learning to ride a bike
High learning rate = you overcorrect every wobble: fast, but you keep falling over.
Low learning rate = you make tiny corrections: very stable, but it takes forever to learn.
Right learning rate = just enough correction to stay balanced without taking too long.
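A small sketch of the three cases, assuming plain gradient descent on the toy function f(w) = (w - 3)^2, whose minimum is at w = 3. The function, the step count, and the three rates are illustrative choices only.

```python
def gradient_descent(learning_rate, steps=20, start=0.0):
    """Minimize f(w) = (w - 3)**2 and return the final value of w."""
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)         # derivative of (w - 3)**2
        w -= learning_rate * grad  # step size is scaled by the learning rate
    return w

too_high = gradient_descent(1.1)    # every step overshoots further: diverges
too_low = gradient_descent(0.001)   # heads toward 3 but has barely moved
reasonable = gradient_descent(0.3)  # ends very close to the minimum at w = 3

print(too_high, too_low, reasonable)
```

With the same number of steps, only the middle setting gets near the minimum: the high rate moves away from it and the low rate is still near its starting point.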
Batch Size
How many training examples the model looks at before updating its weights.
Key Idea
Small batch: more frequent updates, each based on a noisier estimate from fewer examples; slower to process a full epoch.
Large batch: fewer updates, each based on a smoother estimate from more examples; faster to process, but the model adjusts far less often.
Analogy: Grading exams to adjust your teaching style
Small batch = grade 5 exams, then adjust your teaching: more frequent adjustments, each based on a small sample, and it takes longer to get through the pile.
Large batch = grade 100 exams, then adjust: faster to get through the pile, but you adjust far less often.
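The trade-off in update frequency is simple arithmetic. The sketch below assumes a hypothetical dataset of 1,000 examples and reuses the batch sizes from the analogy; the function name is illustrative.

```python
import math

def updates_per_epoch(num_examples, batch_size):
    """Number of weight updates in one full pass over the dataset."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_examples / batch_size)

num_examples = 1000  # hypothetical dataset size

small = updates_per_epoch(num_examples, 5)    # 200 updates per epoch
large = updates_per_epoch(num_examples, 100)  # 10 updates per epoch

print(small, large)
```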
Number of Epochs
How many times the model goes through the entire training dataset.
Key Idea
Too few epochs: the model did not learn enough (underfitting).
Too many epochs: the model memorized the training data (overfitting).
Just right: the model learned the patterns without memorizing.
Analogy: Studying a textbook before an exam
Too few epochs = you only read the book once: you did not learn enough.
Too many epochs = you read the book 100 times: you memorized every word but cannot apply the concepts to new problems.
Just right = you read enough times to understand, not so many times that you just memorize.
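One common way to find the "just right" point is early stopping: watch the loss on held-out validation data and stop once it stops improving. The sketch below is a simplified illustration; the `patience` setting and the validation losses are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the best validation loss, stopping once
    the loss has failed to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # further epochs are only memorizing the training data
    return best_epoch

# Made-up validation losses: improving at first, then rising as overfitting sets in.
val_losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55]
best = early_stopping_epoch(val_losses)

print(best)
```

Here training would stop after epoch 6 and the model from epoch 4, the one with the lowest validation loss, would be kept.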
Regularization
Controls how simple or complex the model is allowed to be. Higher regularization forces the model toward simpler solutions, which reduces overfitting.
Key Idea
No regularization: complex model, higher risk of overfitting.
High regularization: simpler model, more generalizable.
Key exam rule: if the model is overfitting, increase regularization.
Analogy: Writing an essay
No regularization = you can use any vocabulary, any length, any structure: complex, possibly overfitted to the examples you studied.
High regularization = you must keep it simple, clear, and concise: less likely to overfit, more generalizable.
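As a rough illustration, the sketch below assumes one common form of regularization, an L2 penalty on a single weight (the notes above do not name a specific technique). The closed-form solution shows how a larger regularization strength pulls the learned weight toward zero, which is one sense of a "simpler" model. The data and strengths are illustrative.

```python
def ridge_weight(xs, ys, reg_strength):
    """Weight w for y_hat = w * x that minimizes
    sum((w * x - y)**2) + reg_strength * w**2 (closed form for one weight)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + reg_strength)

# Toy data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_none = ridge_weight(xs, ys, 0.0)     # no penalty: fits the data exactly
w_mild = ridge_weight(xs, ys, 1.0)     # small penalty: slightly smaller weight
w_strong = ridge_weight(xs, ys, 14.0)  # large penalty: weight shrunk much further

print(w_none, w_mild, w_strong)
```

The regularization strength itself is a hyperparameter: it is picked before training and controls how strongly the model is pushed toward the simpler solution.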
How They Connect to Overfitting
Key Idea: How to fix overfitting
Best exam answer: increase training data size.
Secondary fixes: reduce epochs (early stopping), increase regularization, reduce model complexity.
Additional options: data augmentation, ensembling (combining multiple models).
Last resort: adjust learning rate or batch size.
How Tuning Is Done
- Manual approach: Grid search tries every combination of the candidate values you list, which is exhaustive but slow. Random search samples combinations at random, which is faster and often surprisingly effective.
- AWS approach: Amazon SageMaker Automatic Model Tuning (AMT) automates the search for optimal hyperparameter values. You define the ranges and the metric to optimize, and SageMaker runs training jobs to find the best combination.
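The two manual strategies can be sketched in a few lines of plain Python. Everything here is an illustrative assumption: `validation_score` is a hypothetical stand-in for "train a model with these settings and measure it", and the candidate values are arbitrary. This is not how SageMaker AMT is invoked.

```python
import itertools
import random

def validation_score(learning_rate, batch_size):
    """Hypothetical stand-in for training a model and returning its validation
    score. By construction it peaks at learning_rate=0.01, batch_size=32."""
    return -abs(learning_rate - 0.01) * 100 - abs(batch_size - 32) / 32

learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64]

# Grid search: evaluate every combination of the listed values (3 x 3 = 9 runs).
grid = list(itertools.product(learning_rates, batch_sizes))
best_grid = max(grid, key=lambda combo: validation_score(*combo))

# Random search: evaluate a fixed budget of randomly sampled combinations.
rng = random.Random(0)  # seeded so the sketch is repeatable
samples = [(rng.choice(learning_rates), rng.choice(batch_sizes)) for _ in range(4)]
best_random = max(samples, key=lambda combo: validation_score(*combo))

print(best_grid, best_random)
```

Grid search is guaranteed to find the best listed combination but needs all 9 runs; random search uses only 4 runs and may or may not hit the best one. In practice, random search usually samples from continuous ranges rather than from short lists as done here.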
Exam Scope
The exam will NOT ask you to perform hyperparameter tuning.
It WILL ask you to:
- recognize what hyperparameters are and how they differ from model parameters
- know what each hyperparameter controls at a conceptual level
- identify the relationship between hyperparameters and overfitting
- know that Amazon SageMaker AMT automates this process
- know that the best answer for fixing overfitting is increasing training data, not tuning hyperparameters
Note: "Performing hyperparameter tuning or model optimization" is explicitly listed as an OUT OF SCOPE job task in the exam guide. You describe it; you do not do it.
AWS Services
| Service | Role |
|---|---|
| Amazon SageMaker | Model training environment |
| SageMaker Automatic Model Tuning (AMT) | Automated hyperparameter optimization |
Exam Domain
- Domain 1, Task Statement 1.3: "Describe components of an ML pipeline (for example, model training, hyperparameter tuning, evaluation)."
- Domain 1, Task Statement 1.3: "Identify relevant AWS services for each stage of an ML pipeline (for example, SageMaker)."