added overview paragraph text for the chapter
profvjreddi committed Nov 9, 2023
1 parent dc1bb18 commit c698b22
Showing 1 changed file with 17 additions and 1 deletion.
18 changes: 17 additions & 1 deletion hw_acceleration.qmd
@@ -2,10 +2,26 @@

![_DALL·E 3 Prompt: Create an intricate and colorful representation of a System on Chip (SoC) design in a rectangular format. Showcase a variety of specialized machine learning accelerators and chiplets, all integrated into the processor. Provide a detailed view inside the chip, highlighting the rapid movement of electrons. Each accelerator and chiplet should be designed to interact with neural network neurons, layers, and activations, emphasizing their processing speed. Depict the neural networks as a network of interconnected nodes, with vibrant data streams flowing between the accelerator pieces, showcasing the enhanced computation speed._](./images/cover_ai_hardware.png)

Machine learning has emerged as a transformative technology across many industries. However, deploying ML on real-world edge devices is challenging because their computing resources are limited. Specialized hardware acceleration has become essential for high-performance machine learning under these constraints. Hardware accelerators execute compute-intensive operations such as inference on custom silicon designed for matrix multiplication. This can provide dramatic speedups over general-purpose CPUs, enabling real-time execution of advanced models on size-, weight-, and power-constrained devices.

This chapter provides essential background on hardware acceleration techniques for embedded machine learning and their tradeoffs. The goal is to equip readers to make informed hardware selections and software optimizations to develop performant on-device ML capabilities.

::: {.callout-tip}
## Learning Objectives

* coming soon.
* Understand why hardware acceleration is needed for AI workloads

* Survey key accelerator options like GPUs, TPUs, FPGAs, and ASICs and their tradeoffs

* Learn about programming models, frameworks, and compilers for AI accelerators

* Appreciate the importance of benchmarking and metrics for hardware evaluation

* Recognize the role of hardware-software co-design in building efficient systems

* Gain exposure to cutting-edge research directions like neuromorphic and quantum computing

* Understand how ML is beginning to augment and enhance hardware design

:::
