diff --git a/docs/contents/frameworks/frameworks.html b/docs/contents/frameworks/frameworks.html index b46ffc93..359210f9 100644 --- a/docs/contents/frameworks/frameworks.html +++ b/docs/contents/frameworks/frameworks.html @@ -1647,7 +1647,7 @@

6.9 Choosing the Right Framework

Choosing the right machine learning framework for a given application requires carefully evaluating the model, the target hardware, and the surrounding software stack. By weighing these three aspects, ML engineers can select the most suitable framework and customize it as needed for efficient, performant on-device ML applications. The goal is to balance model complexity, hardware limitations, and software integration to design a tailored ML pipeline for embedded and edge devices.
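The three-way trade-off above can be sketched as a simple weighted scoring exercise. Everything below is illustrative: the candidate frameworks, criteria scores, and weights are assumptions chosen for the example, not values from the text.

```python
# Hypothetical framework-selection sketch: score candidate frameworks
# on the three evaluation axes (model, hardware, software).
# All scores (1-3) and weights are illustrative assumptions.

CANDIDATES = {
    "TensorFlow Lite": {"model_support": 3, "hardware_fit": 3, "software_integration": 2},
    "PyTorch Mobile":  {"model_support": 3, "hardware_fit": 2, "software_integration": 3},
    "TVM":             {"model_support": 2, "hardware_fit": 3, "software_integration": 2},
}

WEIGHTS = {"model_support": 0.4, "hardware_fit": 0.35, "software_integration": 0.25}

def score(profile):
    # Weighted sum over the three evaluation axes.
    return sum(WEIGHTS[k] * v for k, v in profile.items())

best = max(CANDIDATES, key=lambda name: score(CANDIDATES[name]))
print(best, round(score(CANDIDATES[best]), 2))  # → TensorFlow Lite 2.75
```

In practice the weights encode project priorities (e.g. a severely memory-constrained MCU would weight `hardware_fit` much higher), which is the balancing act the section describes.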

-
+
@@ -1663,7 +1663,7 @@

6.9.2 Software

-
+
@@ -1677,7 +1677,7 @@

6.9.3 Hardware

-
+
@@ -1723,7 +1723,7 @@

6.10.1 Decomposition

Currently, the ML system stack consists of four abstractions, as shown in Figure 6.11: (1) computational graphs, (2) tensor programs, (3) libraries and runtimes, and (4) hardware primitives.
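The relationship between the top two abstractions can be illustrated with a minimal sketch. This is not a real compiler IR; the class and function below are hypothetical stand-ins showing how a computational-graph node (level 1) could be lowered to a tensor program, i.e. an explicit loop nest (level 2), which libraries, runtimes, and hardware primitives (levels 3 and 4) would ultimately execute.

```python
# Illustrative sketch of two adjacent abstraction levels.
from dataclasses import dataclass, field

@dataclass
class GraphNode:
    # Computational-graph level: an op and its symbolic inputs,
    # with no commitment to how the op is computed.
    op: str
    inputs: list = field(default_factory=list)

def matmul_tensor_program(A, B):
    # Tensor-program level: the same op made concrete as an
    # explicit loop nest over output elements.
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

node = GraphNode(op="matmul", inputs=["A", "B"])
print(node.op, matmul_tensor_program([[1, 2]], [[3], [4]]))  # → matmul [[11]]
```

The value of the decomposition is that each level can be optimized independently: graph-level rewrites (e.g. operator fusion) happen on `GraphNode`-like structures, while loop tiling or vectorization happens on the tensor program.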

-
+
diff --git a/docs/contents/hw_acceleration/hw_acceleration.html b/docs/contents/hw_acceleration/hw_acceleration.html index 3c51ce4b..6f6d8bc7 100644 --- a/docs/contents/hw_acceleration/hw_acceleration.html +++ b/docs/contents/hw_acceleration/hw_acceleration.html @@ -2006,8 +2006,8 @@

(Kao and Krishna 2020), Bayesian optimization (Reagen et al. 2017; Bhardwaj et al. 2020), and reinforcement learning (Kao, Jeong, and Krishna 2020; Krishnan et al. 2022) can automatically generate novel hardware architectures by mutating and mixing design attributes such as cache size, number of parallel units, and memory bandwidth. This allows efficient navigation of large design spaces. -
  • Predictive modeling for optimization: - ML models can be trained to predict hardware performance, power, and efficiency metrics for a given architecture. These become “surrogate models” (Krishnan et al. 2023) for fast optimization and space exploration by substituting lengthy simulations.
  • -
  • Specialized accelerator optimization: - For specialized chips like tensor processing units for AI, automated architecture search techniques based on ML algorithms (D. Zhang et al. 2022) show promise for finding fast, efficient designs.
  • +
  • Predictive modeling for optimization: ML models can be trained to predict hardware performance, power, and efficiency metrics for a given architecture. These become “surrogate models” (Krishnan et al. 2023) for fast optimization and space exploration by substituting lengthy simulations.
  • +
  • Specialized accelerator optimization: For specialized chips like tensor processing units for AI, automated architecture search techniques based on ML algorithms (D. Zhang et al. 2022) show promise for finding fast, efficient designs.
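The surrogate-modeling idea in the bullets above can be sketched briefly. Everything here is a synthetic assumption: the "simulator" is a toy analytic cost function standing in for a slow cycle-accurate simulation, and the surrogate is a deliberately trivial nearest-neighbor lookup rather than the learned models cited in the text.

```python
import random

random.seed(0)

def simulate(cache_kb, units):
    # Stand-in for an expensive simulation: latency falls with cache
    # size and parallel units, with diminishing returns (synthetic model).
    return 1000.0 / (cache_kb ** 0.5 * units ** 0.7)

# "Train" a trivial surrogate: sample a few design points, record their
# simulated latency, and predict new points by nearest neighbor.
samples = [(random.choice([32, 64, 128, 256]), random.randint(1, 16)) for _ in range(20)]
table = {s: simulate(*s) for s in samples}

def surrogate(cache_kb, units):
    # Predict latency from the closest previously simulated design.
    nearest = min(table, key=lambda s: (s[0] - cache_kb) ** 2 + (s[1] - units) ** 2)
    return table[nearest]

# Rank the full design space with the cheap surrogate, then verify only
# the top candidate with the "expensive" simulator.
space = [(c, u) for c in (32, 64, 128, 256) for u in range(1, 17)]
best = min(space, key=lambda s: surrogate(*s))
print(best, round(simulate(*best), 2))
```

The payoff is the call count: the slow simulator runs only for the training samples and the final check, while the full design space is ranked by the cheap surrogate, which is the substitution the cited works exploit at much larger scale.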
  • Kao, Sheng-Chun, and Tushar Krishna. 2020. “Gamma: Automating the HW Mapping of DNN Models on Accelerators via Genetic Algorithm.” In Proceedings of the 39th International Conference on Computer-Aided Design, 1–9. ACM. https://doi.org/10.1145/3400302.3415639. @@ -2031,7 +2031,7 @@
