What is Amazon SageMaker?

  • SageMaker is AWS’s platform for building your own custom AI/ML models from scratch.
  • It contains all the tools you need to:
    • Prepare your data
    • Build a model
    • Train the model
    • Test It
    • Deploy it to production
  • SageMaker has different versions for different skill levels (Canvas, Autopilot, Studio)
  • Depending the steps you are in the process of building your model, SageMaker has different tools that can help you in the process.

Key components

SageMaker Studio

  • Integrated development environment (IDE) for ML
  • Jupyter notebooks
  • Visual tools for ML workflow

SageMaker Canvas

  • No-code ML
  • Visual interface
  • AutoML capabilities
  • Business analysts can build models

SageMaker Data Wrangler

  • Data preparation
  • Visual data transformation
  • Feature engineering

SageMaker Autopilot

  • Automated ML (AutoML)
  • Automatically builds, trains, and tunes models
  • Generates explainable models

SageMaker Training

  • Distributed training
  • Built-in algorithms
  • Bring your own algorithm
  • Spot instance training

SageMaker Inference

  • Real-time endpoints
  • Batch transform
  • Serverless inference
  • Multi-model endpoints

SageMaker Pipelines

  • ML workflow orchestration
  • CI/CD for ML (MLOps)

SageMaker Feature Store

  • Centralized feature repository
  • Online and offline feature storage

SageMaker Model Monitor

  • Monitor model performance
  • Detect data drift
  • Model quality monitoring

SageMaker JumpStart

  • Pre-trained models
  • Solution templates
  • One-click deployment of popular models

High level process of building a model

Data Preparation & Feature Engineering

  • You have: Raw messy data (CSV files, images, text, etc.)
  • SageMaker helps: Clean it, transform it, organize it
  • Tools:
    • SageMaker Data Wrangler
    • SageMaker Feature Store

Build the Model

  • You want: A Model that predicts something
  • SageMaker provides:
    • Pre-build algorithms (ready recipes)
    • Option to write your own code
    • AutoML (automatic model building)
  • Tools depends of your skills
    • SageMaker Canvas (No Code - Visual Interface)
    • SageMaker Autopilot (Low Code - Automated ML)
    • SageMaker Studio (Full Code - Complete Control)
    • SageMaker JumpStart (Pre-built Models)

Train the Model

  • You want: Teach (train) the model by showing examples
  • SageMaker provides: Power computers (GPUs) to do the training
  • Tool: SageMaker Training Jobs

Test & Evaluate

  • You want: Test if the model actually works well
  • SageMaker provides: Tools to measure accuracy
  • Tools:
    • SageMaker Model Monitor (for production monitoring)
    • SageMaker Model Evaluation (Built into SageMaker Training Jobs)
    • SageMaker Clarify (for bias and fairness)
    • SageMaker Experiments
    • SageMaker Model Cards
    • SageMaker Debugger

NOTE

Evaluation happens at multiple stages (during the training, after training, in production)

Deploy to Production

  • You want: Make your model available for real use
  • SageMaker provides: Hosting infrastructure
  • Tool: SageMaker Endpoints (Part of SageMaker Inference)