554 Rewrite chapter 3 with a ML systems focus #557

Merged
merged 30 commits on Dec 23, 2024
Commits (30)
d44a291
Started drafting the new chapter 3
profvjreddi Dec 17, 2024
4bc4846
Made first pass through up to main NN section
profvjreddi Dec 17, 2024
37e8fe2
Working on text up to Learning Process
profvjreddi Dec 19, 2024
b7af847
Putting in notes about the training process
profvjreddi Dec 19, 2024
c53db30
Experimenting with the modern DNN section, still got todos left in [ ]
profvjreddi Dec 20, 2024
ccef4d4
Adding network details
profvjreddi Dec 20, 2024
73c6a2f
Reorganizing the file structure
profvjreddi Dec 21, 2024
39ce287
bib name fix
profvjreddi Dec 21, 2024
cff20d5
bib name fix
profvjreddi Dec 21, 2024
6c351b1
Adjusting the intro to match the rest of the flow
profvjreddi Dec 21, 2024
70339b6
Fixed structure, content and conclusion now that it is standalone chap
profvjreddi Dec 21, 2024
45448d0
Some minor text improvements
profvjreddi Dec 21, 2024
d5b5c34
Working on inference draft for NN primer
profvjreddi Dec 21, 2024
36f0362
Working on NN inference primer
profvjreddi Dec 21, 2024
eb04d5a
Update banner message
profvjreddi Dec 22, 2024
3ca7665
Adding in Colabs and fixing learning objectives
profvjreddi Dec 22, 2024
b35a856
Add in 3Blue1Brown videos
profvjreddi Dec 22, 2024
4595874
Update banner
profvjreddi Dec 22, 2024
e1e2f31
Build fix
profvjreddi Dec 22, 2024
2155158
wording tweaks
profvjreddi Dec 22, 2024
9b37421
Updated cover image since the contents have changed
profvjreddi Dec 22, 2024
d603683
Updated cover image
profvjreddi Dec 22, 2024
1b9d09f
updated purpose of chapter
profvjreddi Dec 23, 2024
2187233
Renamed to ANN primer
profvjreddi Dec 23, 2024
ca82680
Remove learning objectives fix comment
profvjreddi Dec 23, 2024
b2c0f22
Enable all file build
profvjreddi Dec 23, 2024
44851a0
Merge branch 'dev' into 554-rewrite-chapter-3-with-a-ml-systems-focus
profvjreddi Dec 23, 2024
17b7bb4
Merge branch 'dev' into 554-rewrite-chapter-3-with-a-ml-systems-focus
profvjreddi Dec 23, 2024
3e23076
fixing cross referencing links
profvjreddi Dec 23, 2024
f0d19a1
Going back to dl_primer folder structure/name
profvjreddi Dec 23, 2024
9 changes: 5 additions & 4 deletions _quarto.yml
@@ -10,9 +10,9 @@ website:
icon: star-half
dismissable: true
content: |
⭐ [Oct 18] <b>We Hit 1,000 GitHub Stars</b> 🎉 Thanks to you, Arduino and SEEED donated AI hardware kits for <a href="https://tinyml.seas.harvard.edu/4D/pastEvents">TinyML workshops</a> in developing nations! </br>
🎓 [Nov 15] The [EDGE AI Foundation](https://www.edgeaifoundation.org/) is **matching academic scholarship funds** for every new GitHub ⭐ (up to 10,000 stars). <a href="https://github.com/harvard-edge/cs249r_book">Click here to show support!</a> 🙏 </br>
🚀 <b>Our mission. 1 ⭐ = 1 👩‍🎓 Learner</b>. Every star tells a story: learners gaining knowledge and supporters driving the mission. Together, we're making a difference.
🚀 <b>Our mission. 1 ⭐ = 1 👩‍🎓 Learner</b>. Every star tells a story: learners gaining knowledge and supporters fueling our mission. Together, we're making a difference. Thank you for your support and happy holidays!
🎓 [Nov 15] The <a href="https://www.edgeaifoundation.org/">EDGE AI Foundation</a> is <b>matching academic scholarship funds</b> for every new GitHub ⭐ (up to 10,000 stars). <a href="https://github.com/harvard-edge/cs249r_book">Click here to show support!</a> 🙏
📘 [Dec 22] <b>Chapter 3 updated!</b> New revisions include expanded content and improved explanations. Check it out <a href="https://mlsysbook.ai/contents/core/dl_primer/dl_primer.html">here</a>. 🌟

position: below-navbar

@@ -117,6 +117,7 @@ book:
- contents/core/introduction/introduction.qmd
- contents/core/ml_systems/ml_systems.qmd
- contents/core/dl_primer/dl_primer.qmd
# - contents/core/dl_architectures/dl_architectures.qmd
- contents/core/workflow/workflow.qmd
- contents/core/data_engineering/data_engineering.qmd
- contents/core/frameworks/frameworks.qmd
@@ -242,7 +243,7 @@ format:
- style-dark.scss

code-block-bg: true
code-block-border-left: "#A51C30"
#code-block-border-left: "#A51C30"

table:
classes: [table-striped, table-hover]
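The net effect of the format change above is to keep the code-block background while disabling the crimson left border. A minimal sketch of the resulting options, assuming they sit under Quarto's html format block as the hunk header (`@@ -242,7 +243,7 @@ format:`) suggests:

```yaml
format:
  html:
    # Shaded background retained for code blocks
    code-block-bg: true
    # Left border disabled in this PR
    # code-block-border-left: "#A51C30"
```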
Empty file.
748 changes: 748 additions & 0 deletions contents/core/dl_architectures/dl_architectures.qmd

Large diffs are not rendered by default.

21 changes: 21 additions & 0 deletions contents/core/dl_primer/dl_primer.bib
@@ -26,6 +26,27 @@ @article{goodfellow2020generative
month = oct,
}

@article{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, {\L}ukasz and Polosukhin, Illia},
journal={Advances in Neural Information Processing Systems},
volume={30},
year={2017}
}

@book{reagen2017deep,
title={Deep learning for computer architects},
author={Reagen, Brandon and Adolf, Robert and Whatmough, Paul and Wei, Gu-Yeon and Brooks, David and Martonosi, Margaret},
year={2017},
publisher={Springer}
}

@article{bahdanau2014neural,
title={Neural machine translation by jointly learning to align and translate},
author={Bahdanau, Dzmitry and Cho, Kyunghyun and Bengio, Yoshua},
journal={arXiv preprint arXiv:1409.0473},
year={2014}
}

@inproceedings{jouppi2017datacenter,
author = {Jouppi, Norman P. and Young, Cliff and Patil, Nishant and Patterson, David and Agrawal, Gaurav and Bajwa, Raminder and Bates, Sarah and Bhatia, Suresh and Boden, Nan and Borchers, Al and Boyle, Rick and Cantin, Pierre-luc and Chao, Clifford and Clark, Chris and Coriell, Jeremy and Daley, Mike and Dau, Matt and Dean, Jeffrey and Gelb, Ben and Ghaemmaghami, Tara Vazir and Gottipati, Rajendra and Gulland, William and Hagmann, Robert and Ho, C. Richard and Hogberg, Doug and Hu, John and Hundt, Robert and Hurt, Dan and Ibarz, Julian and Jaffey, Aaron and Jaworski, Alek and Kaplan, Alexander and Khaitan, Harshit and Killebrew, Daniel and Koch, Andy and Kumar, Naveen and Lacy, Steve and Laudon, James and Law, James and Le, Diemthu and Leary, Chris and Liu, Zhuyuan and Lucke, Kyle and Lundin, Alan and MacKean, Gordon and Maggiore, Adriana and Mahony, Maire and Miller, Kieran and Nagarajan, Rahul and Narayanaswami, Ravi and Ni, Ray and Nix, Kathy and Norrie, Thomas and Omernick, Mark and Penukonda, Narayana and Phelps, Andy and Ross, Jonathan and Ross, Matt and Salek, Amir and Samadiani, Emad and Severn, Chris and Sizikov, Gregory and Snelham, Matthew and Souter, Jed and Steinberg, Dan and Swing, Andy and Tan, Mercedes and Thorson, Gregory and Tian, Bo and Toma, Horia and Tuttle, Erick and Vasudevan, Vijay and Walter, Richard and Wang, Walter and Wilcox, Eric and Yoon, Doe Hyun},
abstract = {Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC{\textemdash}called a Tensor Processing Unit (TPU) {\textemdash} deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95\% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X {\textendash} 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X {\textendash} 80X higher. Moreover, using the CPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.},
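For context, Quarto resolves these BibTeX keys through Pandoc's citation syntax in the chapter source. Below is a minimal sketch of how the chapter might cite the new entries; the YAML header and the surrounding sentences are illustrative assumptions, not text from the chapter itself (only the .bib filename matches this diff):

```markdown
---
title: "DL Primer"
bibliography: dl_primer.bib
---

<!-- Hypothetical prose, shown only to demonstrate the citation keys. -->
Transformers replace recurrence with self-attention [@vaswani2017attention],
building on earlier attention mechanisms for neural machine translation
[@bahdanau2014neural]. Datacenter inference accelerators such as Google's TPU
[@jouppi2017datacenter] motivate the chapter's systems perspective.
```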