---
site: sandpaper::sandpaper_site
---
This workshop introduces foundational workflows in Amazon SageMaker, covering data setup, code repository setup, model training, and hyperparameter tuning within AWS's managed environment. You'll learn how to use SageMaker notebooks to control data pipelines, manage training and tuning jobs, and evaluate model performance effectively. We'll also cover strategies for scaling training and tuning efficiently, with guidance on choosing between CPUs and GPUs, as well as when to consider parallelized workflows (i.e., using multiple instances).
To keep costs manageable, this workshop provides tips for tracking and monitoring AWS expenses, so your experiments remain affordable. While AWS isn't free, it is cost-effective for typical ML workflows: training roughly 100 models on a small dataset (under 10GB) can cost under $20, making it accessible for many research projects.
Currently, this workshop does not include:
- AWS Lambda for serverless function deployment,
- MLflow or other MLOps tools for experiment tracking,
- Additional AWS services beyond the core SageMaker ML workflows.
If there's a specific ML workflow or AWS service you'd like to see included in this curriculum, we're open to developing more content to meet the needs of researchers and ML practitioners at UW–Madison (and at other research institutions). Please contact [email protected] with suggestions or requests.