Addressing Aditi's feedback.
profvjreddi committed Nov 9, 2023
1 parent c698b22 commit 7cbb7bf
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions hw_acceleration.qmd
@@ -660,9 +660,11 @@ Simulation software is important in hardware-software co-design. It enables join

- **Co-simulation:** Unified platforms like SCALE-Sim [@samajdar2018scale] integrate hardware and software simulation into a single tool. This enables what-if analysis to quantify the system-level impacts of cross-layer optimizations early in the design cycle.

For example, an FPGA-based AI accelerator design could be simulated using Verilog hardware description language and synthesized into a Gem5 model. The accelerator could have ML workloads simulated using TVM compiled onto it within the Gem5 environment for unified modeling.
For example, an FPGA-based AI accelerator design could be described in the Verilog hardware description language and translated into a model for the Gem5 simulator. Verilog is well suited to describing the digital logic and interconnects that make up the accelerator architecture: it lets the designer specify the datapaths, control logic, on-chip memories, and other components that will be implemented in the FPGA fabric. Once the Verilog design is complete, it can be converted into a model that simulates the hardware's behavior inside a simulator such as Gem5. Gem5 is useful for this task because it models full systems, including processors, caches, buses, and custom accelerators. Verilog models of hardware can be interfaced with a Gem5 simulation, enabling unified system modeling.

Co-simulation provides estimations of overall metrics like throughput, latency, and power to guide co-design before expensive physical prototyping. They also assist with partitioning optimizations between hardware and software to guide design tradeoffs.
ML workloads compiled with TVM could then be run on the FPGA accelerator model within the Gem5 environment for unified modeling. TVM compiles and optimizes ML models for heterogeneous hardware targets such as FPGAs. Running TVM-compiled workloads on the simulated accelerator provides an integrated way to validate and refine the hardware design, software stack, and system integration before the accelerator is physically realized on an FPGA.

This type of co-simulation provides estimates of system-level metrics such as throughput, latency, and power to guide co-design before expensive physical prototyping. It also helps partition optimizations between hardware and software, informing design tradeoffs.

However, simulators are limited in how accurately they can model subtle low-level interactions between components. Simulation results are estimates and cannot wholly replace physical prototypes and testing. Still, unified simulation and modeling provide invaluable early insights into system-level optimization opportunities during the co-design process.
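
The kind of what-if analysis described above can be illustrated with a very small analytical model. The sketch below is not SCALE-Sim, Gem5, or TVM; it is a roofline-style estimate written for illustration only. The names `AcceleratorConfig`, `estimate_matmul_latency`, and `sweep_array_sizes`, as well as the clock rate and memory bandwidth values, are hypothetical assumptions and do not describe any real device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorConfig:
    """Hypothetical accelerator parameters (illustrative, not a real device)."""
    pe_rows: int                      # rows in the processing-element array
    pe_cols: int                      # columns in the processing-element array
    clock_hz: float                   # clock frequency in Hz
    mem_bandwidth_bytes_per_s: float  # off-chip memory bandwidth


def estimate_matmul_latency(cfg, m, k, n, bytes_per_element=1):
    """Roofline-style latency estimate (seconds) for an (m x k) @ (k x n) matmul.

    Assumes one multiply-accumulate per PE per cycle and that each operand
    and the result cross the memory interface exactly once.
    """
    macs = m * k * n
    peak_macs_per_s = cfg.pe_rows * cfg.pe_cols * cfg.clock_hz
    compute_time = macs / peak_macs_per_s

    bytes_moved = (m * k + k * n + m * n) * bytes_per_element
    memory_time = bytes_moved / cfg.mem_bandwidth_bytes_per_s

    # The slower of the two bounds determines the estimated latency.
    return max(compute_time, memory_time)


def sweep_array_sizes(sizes, m, k, n, clock_hz=1e9, bandwidth=100e9):
    """What-if sweep: estimated latency for several square PE array sizes."""
    return {
        size: estimate_matmul_latency(
            AcceleratorConfig(size, size, clock_hz, bandwidth), m, k, n
        )
        for size in sizes
    }


if __name__ == "__main__":
    for size, latency in sweep_array_sizes([16, 32, 128, 512], 1024, 1024, 1024).items():
        print(f"{size}x{size} array: {latency * 1e6:.2f} us")
```

Under these assumptions, growing the array reduces the estimated latency only until the memory-bound term takes over, which is the sort of system-level tradeoff a real co-simulation would quantify with far more detail and accuracy.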

@@ -782,7 +784,7 @@ As AI workloads have grown, there is increasing demand for tighter integration b

In response, new manufacturing techniques like wafer-scale fabrication and advanced packaging now allow much higher levels of integration. The goal is to create unified, specialized AI compute complexes tailored for deep learning and other AI algorithms. Tighter integration is key to delivering the performance and efficiency needed for the next generation of AI.

#### Wafter-scale AI
#### Wafer-scale AI

Wafer-scale AI takes an extremely integrated approach, manufacturing an entire silicon wafer as one gigantic chip. This differs drastically from conventional CPUs and GPUs, for which each wafer is cut into many smaller individual chips. While some GPUs may contain billions of transistors, they still pale in comparison to a wafer-scale chip with over a trillion transistors.
