Main Concept

Hyperparameters are settings that control HOW the model learns, not WHAT it learns. They are set by the engineer before training begins and do not change automatically during training.
Hyperparameter tuning is the process of finding the combination of these settings that gives the best model performance.

The Key Distinction

Key Idea

  • Model Parameters → WHAT the model learns (weights, biases). Learned automatically from data during training. You never touch these.

  • Hyperparameters → HOW the model learns. Set manually before training begins. You tune these.

Analogy: Baking a cake

  • Hyperparameters = the oven settings you choose before baking (temperature, time, rack position).

  • Model parameters = how the cake actually turns out (texture, density, flavor).

You control the oven settings. The oven (training process) determines the result.
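
The distinction can be made concrete with a minimal sketch. It assumes a toy linear model trained with plain gradient descent; the data, the constant names, and the `train` function are illustrative and do not come from the exam guide. The learning rate and epoch count are chosen by hand, while `w` and `b` are learned from the data.

```python
# Hyperparameters: chosen by the engineer before training starts.
LEARNING_RATE = 0.05
EPOCHS = 500

# Toy data generated from y = 2x + 1 (illustrative values only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def train(xs, ys, learning_rate, epochs):
    """Fit y = w*x + b with full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # model parameters: learned from data, never set by hand
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(xs, ys, LEARNING_RATE, EPOCHS)
print(f"learned parameters: w={w:.3f}, b={b:.3f}")
```

Changing `LEARNING_RATE` or `EPOCHS` changes how training proceeds; the values of `w` and `b` are whatever training produces.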

The Four Key Hyperparameters

Learning Rate

Controls how big or small the adjustments to weights are after each training step.

Key Idea

  • High learning rate → fast but overshoots the optimal solution.

  • Low learning rate → precise but slow convergence.

  • Right learning rate → fast enough to converge, precise enough to find the optimal solution.

Analogy: Learning to ride a bike

  • High learning rate = you overcorrect every wobble → fast but you keep falling over.

  • Low learning rate = you make tiny corrections → very stable but takes forever to learn.

  • Right learning rate = just enough correction to stay balanced without taking too long.
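
The three cases above can be sketched on a toy problem. This example assumes gradient descent on the function f(w) = w**2, whose minimum is at w = 0; the function, the starting point, and the three learning rates are illustrative choices, not values from the exam guide.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimize f(w) = w**2 (minimum at w = 0) and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w  # derivative of w**2
        w -= learning_rate * gradient
    return w


# Too high, too low, and a reasonable value for this particular toy problem.
for lr in (1.1, 0.001, 0.3):
    print(f"learning_rate={lr}: w after 20 steps = {gradient_descent(lr):.4f}")
```

With the high rate each step jumps past the minimum and lands farther away, with the low rate `w` has barely moved after 20 steps, and with the middle rate `w` ends very close to 0.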

Batch Size

How many training examples the model looks at before updating its weights.

Key Idea

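  • Small batch → more frequent updates per epoch, but each update is based on fewer examples, so it is noisier and an epoch takes longer to process.

  • Large batch → fewer updates per epoch, each based on more examples, so it is smoother; faster to process per epoch, but needs more memory.

Analogy: Grading exams to adjust your teaching style

  • Small batch = grade 5 exams, then adjust your teaching → more frequent adjustments, each based on a small sample of the class; takes longer overall.

  • Large batch = grade 100 exams, then adjust → fewer adjustments, each based on a fuller picture of the class; faster to get through the pile.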
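
How batch size sets the number of weight updates in one pass over the data can be sketched as follows. The helper names `iterate_batches` and `updates_per_epoch` are hypothetical, and the 100-example dataset is a placeholder; the only assumption is that each batch triggers exactly one update.

```python
import math


def iterate_batches(examples, batch_size):
    """Yield consecutive slices of `examples`, each at most `batch_size` long."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def updates_per_epoch(num_examples, batch_size):
    """One weight update per batch, so this is the number of batches."""
    return math.ceil(num_examples / batch_size)


examples = list(range(100))  # placeholder dataset of 100 training examples
for batch_size in (5, 100):
    counted = sum(1 for _ in iterate_batches(examples, batch_size))
    print(f"batch_size={batch_size}: {counted} updates per epoch")
```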

Number of Epochs

How many times the model goes through the entire training dataset.

Key Idea

  • Too few epochs → model did not learn enough → underfitting.

  • Too many epochs → model memorized training data → overfitting.

  • Just right → model learned the patterns without memorizing.

Analogy: Studying a textbook before an exam

  • Too few epochs = you only read the book once → you did not learn enough.

  • Too many epochs = you read the book 100 times → you memorized every word but cannot apply concepts to new problems.

  • Just right = you read enough times to understand, not so many times that you just memorize.
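
One common way to land on a "just right" number of epochs is early stopping: watch the loss on held-out validation data and stop once it stops improving. The sketch below assumes a list of per-epoch validation losses is already available; the loss values, the function name, and the `patience` setting are hypothetical.

```python
def best_epoch_with_early_stopping(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss seen before
    `patience` consecutive epochs pass without improvement."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss is no longer improving: stop training
    return best_epoch


# Hypothetical validation losses: they fall, then rise as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]
print("stop and keep the model from epoch", best_epoch_with_early_stopping(val_losses))
```

In this made-up run the validation loss is lowest at epoch 4, so training past that point would only push the model toward memorizing the training data.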

Regularization

Controls how simple or complex the model is allowed to be. Higher regularization forces the model toward simpler solutions, which reduces overfitting.

Key Idea

  • No regularization → complex model, higher risk of overfitting.

  • High regularization → simpler model, usually more generalizable (too much can cause underfitting).

  • Key exam rule → if the model is overfitting, increase regularization.

Analogy: Writing an essay

  • No regularization = you can use any vocabulary, any length, any structure → complex, possibly overfitted to studied examples.

  • High regularization = you must keep it simple, clear, concise → less likely to overfit, more generalizable.
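
The notes do not name a specific regularization technique, so the sketch below assumes one common choice, L2 (ridge) regularization, on a one-weight model y = w*x. The data and the `ridge_weight` helper are illustrative. It shows the general effect: the stronger the penalty, the smaller the learned weight.

```python
def ridge_weight(xs, ys, reg_strength):
    """Closed-form weight that minimizes
    sum((y - w*x)**2) + reg_strength * w**2 for a model with no intercept."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + reg_strength)


# Toy data roughly following y = 2x (illustrative values only).
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

for reg_strength in (0.0, 10.0, 100.0):
    w = ridge_weight(xs, ys, reg_strength)
    print(f"reg_strength={reg_strength}: w={w:.3f}")
```

A weight pulled toward zero means the model reacts less strongly to its input, which is the "simpler model" that regularization pushes toward.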

How They Connect to Overfitting

Key Idea: How to fix overfitting

  • Best exam answer → increase training data size.

  • Secondary fixes → reduce epochs (early stopping), increase regularization, reduce model complexity.

  • Additional options → data augmentation, ensembling (combine multiple models).

  • Last resort → adjust learning rate or batch size.

How Tuning Is Done

  • Manual approach: Grid search tries every combination of the candidate values you list; it is exhaustive but slow. Random search samples combinations at random; it is faster and often surprisingly effective.
  • AWS approach: Amazon SageMaker Automatic Model Tuning (AMT) automates the search for optimal hyperparameter values. You define the ranges, and SageMaker searches for the best combination.
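
Grid search and random search can be sketched in plain Python. The `validation_score` function is a hypothetical stand-in for "train a model with these settings and measure it on validation data", and the search space is made up for illustration. None of this is SageMaker code; it only shows the two search strategies side by side.

```python
import itertools
import math
import random


def validation_score(learning_rate, batch_size):
    """Hypothetical score (higher is better) that peaks at
    learning_rate=0.01 and batch_size=32. A real run would train a model."""
    return -abs(math.log10(learning_rate) + 2) - abs(batch_size - 32) / 32


search_space = {
    "learning_rate": [0.0001, 0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
}


def grid_search(space, score_fn):
    """Score every combination in the grid and return the best one."""
    names = list(space)
    candidates = (
        dict(zip(names, values)) for values in itertools.product(*space.values())
    )
    return max(candidates, key=lambda params: score_fn(**params))


def random_search(space, score_fn, n_trials, seed=0):
    """Score `n_trials` randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_trials)
    ]
    return max(candidates, key=lambda params: score_fn(**params))


print("grid search best:  ", grid_search(search_space, validation_score))
print("random search best:", random_search(search_space, validation_score, n_trials=5))
```

Grid search evaluates all 16 combinations here, while random search evaluates only 5, so it is cheaper but is not guaranteed to find the single best combination.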

Exam Scope

The exam will NOT ask you to perform hyperparameter tuning.

It WILL ask you to:

  • Recognize what hyperparameters are and how they differ from model parameters.
  • Know what each hyperparameter controls at a conceptual level.
  • Identify the relationship between hyperparameters and overfitting.
  • Know that Amazon SageMaker AMT automates this process.
  • Know that the best answer for fixing overfitting is increasing training data, not tuning hyperparameters.

Note: “Performing hyperparameter tuning or model optimization” is explicitly listed as an OUT OF SCOPE job task in the exam guide: you describe it, you do not do it.

AWS Services

Service | Role
Amazon SageMaker | Model training environment
SageMaker Automatic Model Tuning (AMT) | Automated hyperparameter optimization

Exam Domain

  • Domain 1, Task Statement 1.3: “Describe components of an ML pipeline (for example, model training, hyperparameter tuning, evaluation).”
  • Domain 1, Task Statement 1.3: “Identify relevant AWS services for each stage of an ML pipeline (for example, SageMaker).”