This repository documents my structured learning path toward GPU and ML systems engineering, with a focus on high-performance computing, CUDA programming, and ML infrastructure. The journey combines theoretical learning with hands-on projects, covering modern C++, CUDA, computer architecture, and ML systems design.
- Master modern C++ and GPU programming principles
- Develop expertise in CUDA and parallel computing
- Understand ML system architecture and optimization
- Build practical experience with distributed ML systems
- Gain deep knowledge of computer architecture
Implementation exercises focusing on:
- Memory management and RAII
- Template metaprogramming
- Concurrent programming
- Performance optimization
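As a flavor of the RAII exercises, here is a minimal sketch (the class name and use of `FILE*` are illustrative, not the actual exercise code): a wrapper that owns a C file handle and releases it deterministically, with move-only ownership semantics.

```cpp
#include <cstdio>
#include <stdexcept>
#include <utility>

// Illustrative RAII exercise: a wrapper that owns a FILE* and releases it
// deterministically, even when exceptions are thrown.
class File {
public:
    explicit File(const char* path, const char* mode)
        : handle_(std::fopen(path, mode)) {
        if (!handle_) throw std::runtime_error("fopen failed");
    }
    ~File() { if (handle_) std::fclose(handle_); }

    // Movable but not copyable: exactly one owner at a time.
    File(const File&) = delete;
    File& operator=(const File&) = delete;
    File(File&& other) noexcept
        : handle_(std::exchange(other.handle_, nullptr)) {}
    File& operator=(File&& other) noexcept {
        if (this != &other) {
            if (handle_) std::fclose(handle_);
            handle_ = std::exchange(other.handle_, nullptr);
        }
        return *this;
    }

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};
```

The same pattern (acquire in the constructor, release in the destructor, delete copies, transfer via move) carries over directly to GPU resources such as device memory and CUDA streams.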
Hands-on GPU programming including:
- Matrix operations library
- Custom ML kernels
- Performance profiling
- Multi-GPU implementations
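For the matrix operations library, a useful first step is a CPU reference implementation to validate GPU kernel output against. A hypothetical sketch (function name and layout conventions are assumptions, not the project's actual API):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the matrix-operations library: a naive
// row-major matrix multiply C = A * B, used to check GPU kernels for
// correctness. A is m x k, B is k x n, C is m x n, all flat row-major.
std::vector<float> matmul_ref(const std::vector<float>& A,
                              const std::vector<float>& B,
                              std::size_t m, std::size_t k, std::size_t n) {
    std::vector<float> C(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t p = 0; p < k; ++p) {
            float a = A[i * k + p];  // reuse one A element across the row of B
            for (std::size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[p * n + j];
        }
    return C;
}
```

Comparing a CUDA kernel's output against this reference (within a floating-point tolerance) makes profiling-driven optimization much safer, since every speedup can be checked for correctness.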
End-to-end ML infrastructure projects:
- Inference engine
- Distributed training system
- Model serving infrastructure
- Performance optimization tools
Detailed notes from:
- Technical books
- Online courses
- Conference talks
- Research papers
Technical writing documenting:
- Learning progress
- Project insights
- Performance analyses
- Architecture decisions
This is a six-month structured learning plan (November 2024 – May 2025).
Month 1: C++ Foundations & CUDA Basics
- Modern C++ review
- Basic CUDA programming
- Performance profiling
Month 2: Advanced CUDA & Computer Architecture
- GPU architecture deep dive
- Memory optimization
- Cache-friendly algorithms
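The cache-friendly algorithms topic can be sketched with a classic example: two loops that compute the same sum over a row-major matrix, where only the traversal order differs (function names here are illustrative).

```cpp
#include <cstddef>
#include <vector>

// For a row-major matrix, a row-outer traversal touches memory with unit
// stride, so each cache line is fully used before eviction.
double sum_row_major(const std::vector<float>& a,
                     std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t i = 0; i < rows; ++i)      // outer: rows
        for (std::size_t j = 0; j < cols; ++j)  // inner: contiguous elements
            s += a[i * cols + j];
    return s;
}

// Same result, but each access jumps `cols` floats, wasting most of every
// cache line that is fetched; on large matrices this is measurably slower.
double sum_col_major(const std::vector<float>& a,
                     std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            s += a[i * cols + j];
    return s;
}
```

The same principle drives memory optimization on GPUs, where the analogous goal is coalesced global memory access across a warp.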
Month 3: ML Operations & GPU Optimization
- Custom ML operators
- CUDA kernel optimization
- PyTorch C++ integration
Month 4: Distributed Systems & ML Infrastructure
- Multi-GPU programming
- Distributed training
- Network optimization
Month 5: ML Compilation & Advanced Optimization
- Kernel fusion
- Compilation techniques
- Advanced optimization
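The idea behind kernel fusion can be illustrated on the CPU with a hypothetical scale-then-ReLU pipeline (names and operations chosen for illustration): instead of two passes over the data, the fused version applies both operations in a single traversal.

```cpp
#include <algorithm>
#include <vector>

// Two separate passes: on a GPU these would be two kernel launches, with the
// intermediate result written to and re-read from global memory.
std::vector<float> scale_then_relu_unfused(std::vector<float> x, float s) {
    for (auto& v : x) v *= s;                     // pass 1: scale
    for (auto& v : x) v = std::max(v, 0.0f);      // pass 2: ReLU
    return x;
}

// Fused: one pass, no intermediate buffer. On a GPU this saves a kernel
// launch and a round trip through global memory.
std::vector<float> scale_then_relu_fused(std::vector<float> x, float s) {
    for (auto& v : x) v = std::max(v * s, 0.0f);
    return x;
}
```

ML compilers apply this transformation automatically to chains of elementwise operators, which is why fusion is a core topic alongside compilation techniques.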
Month 6: Production ML Systems
- Model serving
- Production monitoring
- System optimization
- "A Tour of C++" by Bjarne Stroustrup
- "Effective Modern C++" by Scott Meyers
- "Programming Massively Parallel Processors" by David Kirk and Wen-mei Hwu
- "Computer Architecture: A Quantitative Approach" by John Hennessy and David Patterson
- Georgia Tech High Performance Computing
- NVIDIA CUDA Programming Course
- Various ML systems courses
- NVIDIA Nsight Systems
- CUDA Toolkit
- Modern C++ development environment
- Performance profiling tools
- Repository setup
- Learning plan development
- C++ refresher on Exercism
- Initial chapters of "A Tour of C++"
- Smart pointer implementation
- Basic CUDA setup
- ...
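The smart pointer exercise above roughly took this shape (a minimal sketch, not the actual exercise code): a `std::unique_ptr`-like owner that deletes its object exactly once.

```cpp
#include <utility>

// Minimal unique-ownership smart pointer: copying is forbidden, moving
// transfers the pointer, and the destructor deletes the object exactly once.
template <typename T>
class UniquePtr {
public:
    explicit UniquePtr(T* p = nullptr) : ptr_(p) {}
    ~UniquePtr() { delete ptr_; }

    UniquePtr(const UniquePtr&) = delete;
    UniquePtr& operator=(const UniquePtr&) = delete;

    UniquePtr(UniquePtr&& other) noexcept
        : ptr_(std::exchange(other.ptr_, nullptr)) {}
    UniquePtr& operator=(UniquePtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = std::exchange(other.ptr_, nullptr);
        }
        return *this;
    }

    T* get() const { return ptr_; }
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }

private:
    T* ptr_;
};
```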
While this is a personal learning repository, I welcome suggestions and feedback through GitHub issues and discussions.
This project is licensed under the MIT License - see the LICENSE file for details.
This is a living document that will be updated as the learning journey progresses.