Updated ml_systems #525

Closed
wants to merge 1 commit into from
27 changes: 15 additions & 12 deletions contents/core/ml_systems/ml_systems.qmd
@@ -8,11 +8,11 @@ bibliography: ml_systems.bib
Resources: [Slides](#sec-ml-systems-resource), [Videos](#sec-ml-systems-resource), [Exercises](#sec-ml-systems-resource)
:::

![_DALL·E 3 Prompt: Illustration in a rectangular format depicting the merger of embedded systems with Embedded AI. The left half of the image portrays traditional embedded systems, including microcontrollers and processors, detailed and precise. The right half showcases the world of artificial intelligence, with abstract representations of machine learning models, neurons, and data flow. The two halves are distinctly separated, emphasizing the individual significance of embedded tech and AI, but they come together in harmony at the center._](images/png/cover_ml_systems.png)
![_DALL·E 3 Prompt: Illustration in a rectangular format depicting the merger of embedded systems with Embedded AI. The left half of the image portrays traditional embedded systems, including microcontrollers and processors, in detail and precision. The right half showcases the world of artificial intelligence, with abstract representations of machine learning models, neurons, and data flow. The two halves are distinctly separated, emphasizing the individual significance of embedded tech and AI, but they come together in harmony at the center._](images/png/cover_ml_systems.png)

Machine learning (ML) systems, built on the foundation of computing systems, hold the potential to transform our world. These systems, with their specialized roles and real-time computational capabilities, represent a critical junction where data and computation meet on a micro-scale. They are specifically tailored to optimize performance, energy usage, and spatial efficiency—key factors essential for the successful implementation of ML systems.
Machine learning (ML) systems have emerged as a transformative force, revolutionizing industries and pushing the boundaries of technological innovation. Built upon advanced algorithms and sophisticated computational architectures, these systems are designed to extract actionable insights from complex datasets and enable intelligent, data-driven decision-making. ML systems drive advancements across diverse domains, including healthcare, finance, autonomous systems, and the Internet of Things (IoT), optimizing performance, enhancing energy efficiency, and maximizing spatial utilization. Their versatility and adaptability underscore their pivotal role in addressing the demands of modern applications, positioning ML systems as essential tools for shaping the future of technology.

As this chapter progresses, we will explore ML systems' complex and fascinating world. We'll gain insights into their structural design and operational features and understand their key role in powering ML applications. Starting with the basics of microcontroller units, we will examine the interfaces and peripherals that improve their functionalities. This chapter is designed to be a comprehensive guide that explains the nuanced aspects of different ML systems.
This chapter delves into the intricate and multifaceted world of ML systems, offering readers a comprehensive exploration of their design principles, operational frameworks, and practical applications. It examines their structural configurations, functional enhancements, and integration techniques to demonstrate how ML systems facilitate real-time computation and intelligent automation. Beginning with the fundamentals of microcontroller units (MCUs), the chapter explores their interfaces, peripherals, and optimization strategies to enhance performance and scalability. Through this detailed analysis, the chapter aims to provide readers with the technical insights necessary to develop, optimize, and deploy ML systems effectively across a wide range of real-world applications, from healthcare to finance to autonomous systems.

::: {.callout-tip}

@@ -32,31 +32,34 @@ As this chapter progresses, we will explore ML systems' complex and fascinating

## Overview

ML is rapidly evolving, with new paradigms reshaping how models are developed, trained, and deployed. The field is experiencing significant innovation driven by advancements in hardware, software, and algorithmic techniques. These developments are enabling machine learning to be applied in diverse settings, from large-scale cloud infrastructures to edge devices and even tiny, resource-constrained environments.
Machine learning (ML), a core subset of artificial intelligence (AI), is rapidly evolving and revolutionizing how we develop, train, and deploy models to solve complex problems. Significant innovations in hardware, software, and algorithmic techniques drive this evolution, enabling ML systems to function in diverse environments—from expansive cloud infrastructures to edge devices and ultra-low-power microcontrollers. These advancements are reshaping industries by making ML more accessible, efficient, and capable of addressing real-world challenges.

Modern machine learning systems span a spectrum of deployment options, each with its own set of characteristics and use cases. At one end, we have cloud-based ML, which leverages powerful centralized computing resources for complex, data-intensive tasks. Moving along the spectrum, we encounter edge ML, which brings computation closer to the data source for reduced latency and improved privacy. At the far end, we find TinyML, which enables machine learning on extremely low-power devices with severe memory and processing constraints.
Modern ML systems are deployed across a continuum of options, each characterized by unique strengths, limitations, and trade-offs. At one end of the spectrum, Cloud ML is defined as using powerful, centralized computing resources to handle computationally demanding, data-intensive tasks. As the spectrum progresses, Edge ML is distinguished by its proximity to the data source, which reduces latency and enhances privacy. Finally, TinyML is recognized as the most resource-constrained paradigm, enabling ML models to function on devices with minimal power, memory, and processing capacity.

This chapter explores the landscape of contemporary machine learning systems, covering three key approaches: Cloud ML, Edge ML, and TinyML. @fig-cloud-edge-tinyml-comparison illustrates the spectrum of distributed intelligence across these approaches, providing a visual comparison of their characteristics. We will examine the unique characteristics, advantages, and challenges of each approach, as depicted in the figure. Additionally, we will discuss the emerging trends and technologies that are shaping the future of machine learning deployment, considering how they might influence the balance between these three paradigms.
This chapter is dedicated to an in-depth exploration of the intricate landscape of ML systems, focusing on three paradigms: Cloud ML, Edge ML, and TinyML. As illustrated in @fig-cloud-edge-tinyml-comparison, these paradigms are positioned along a spectrum of distributed intelligence, balancing computational requirements, connectivity, energy efficiency, and cost. Their defining characteristics, benefits, and challenges are examined in detail, alongside a discussion of emerging trends and technologies shaping ML deployment's future.

![Cloud vs. Edge vs. TinyML: The Spectrum of Distributed Intelligence. Source: ABI Research -- TinyML.](images/png/cloud-edge-tiny.png){#fig-cloud-edge-tinyml-comparison}

The evolution of machine learning systems can be seen as a progression from centralized to distributed computing paradigms:

1. **Cloud ML:** Initially, ML was predominantly cloud-based. Powerful servers in data centers were used to train and run large ML models. This approach leverages vast computational resources and storage capacities, enabling the development of complex models trained on massive datasets. Cloud ML excels at tasks requiring extensive processing power and is ideal for applications where real-time responsiveness isn't critical.
1. **Cloud ML:** Initially, ML relied heavily on centralized servers in large data centers to train and execute complex models. Cloud ML leverages powerful GPUs, TPUs, and extensive storage infrastructure to process massive datasets and computationally demanding tasks. This paradigm is well-suited for applications requiring substantial processing power but not immediate responsiveness, such as large-scale image recognition, natural language processing, and recommendation systems used by platforms like Amazon and Netflix.

2. **Edge ML:** As the need for real-time, low-latency processing grew, Edge ML emerged. This paradigm brings inference capabilities closer to the data source, typically on edge devices such as smartphones, smart cameras, or IoT gateways. Edge ML reduces latency, enhances privacy by keeping data local, and can operate with intermittent cloud connectivity. It's particularly useful for applications requiring quick responses or handling sensitive data.

3. **TinyML:** The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud or edge, opening up new possibilities for intelligent, battery-operated devices. This approach is crucial for applications where size, power consumption, and cost are critical factors.
2. **Edge ML:** As the demand for real-time processing and reduced latency increased, Edge ML emerged as a significant approach in machine learning. This method brings computation closer to the data source, such as smartphones, IoT devices, and industrial sensors. By reducing reliance on cloud connectivity, Edge ML offers enhanced privacy, optimized bandwidth usage, and robust support for latency-sensitive applications. Notable examples include real-time object detection in autonomous vehicles, predictive maintenance systems in smart factories, and personalized health monitoring enabled by wearable devices.

Each of these paradigms has its own strengths and is suited to different use cases:

3. **TinyML:** The latest advancement in this spectrum, TinyML, brings machine learning to ultra-low-power devices like microcontrollers. TinyML enables on-device inference without constant connectivity to cloud or edge resources, making it ideal for intelligent, battery-operated devices in resource-constrained environments. This paradigm is critical for applications prioritizing size, power efficiency, and cost, such as wearable health monitors, precision agriculture sensors, and smart home devices.

Each of these paradigms has its strengths and is suited to different use cases:

- Cloud ML remains essential for tasks requiring massive computational power or large-scale data analysis.
- Edge ML is ideal for applications needing low-latency responses or local data processing.
- TinyML enables AI capabilities in small, power-efficient devices, expanding the reach of ML to new domains.

The progression from Cloud to Edge to TinyML reflects a broader trend in computing towards more distributed, localized processing. This evolution is driven by the need for faster response times, improved privacy, reduced bandwidth usage, and the ability to operate in environments with limited or no connectivity.
The shift from Cloud ML to Edge ML and TinyML underscores a growing trend toward distributed and localized computing. The need for faster response times, enhanced privacy, reduced bandwidth usage, and resilience in connectivity-limited environments drives this progression. It also highlights the increasing importance of specialized hardware, such as GPUs and TPUs for Cloud ML, edge-specific AI processors for Edge ML, and highly optimized microcontrollers for TinyML.

As illustrated in @fig-vMLsizes, this shift comes with significant trade-offs in terms of hardware capabilities, latency, connectivity, power consumption, and model complexity. For instance, deploying deep learning models on microcontrollers—the backbone of TinyML—requires innovative techniques to address memory and processing capacity constraints. Common methods include model compression strategies, such as pruning and quantization, which reduce model size while preserving performance. Knowledge distillation and sparse representations can enhance efficiency by simplifying complex models for resource-constrained environments. Hardware-specific optimizations, such as leveraging microcontroller architectures, further enable feasible deployment under stringent conditions.
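
To make the quantization idea above concrete, the following is a minimal sketch of symmetric, per-tensor 8-bit quantization in plain Python. It is an illustration under stated assumptions, not the method used by any particular TinyML toolchain: the function names `quantize_int8` and `dequantize` and the six example weights are hypothetical, and real frameworks typically add per-channel scales, zero points, and calibration data.

```python
import struct


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.42, -1.27, 0.03, 0.88, -0.56, 1.10]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))

# Rounding to the nearest code bounds the per-weight error by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"int8 codes: {q}")
print(f"size: {float32_bytes} B as float32, {int8_bytes} B as int8")
print(f"max reconstruction error: {max_error:.4f} (step size {scale:.4f})")
```

The sketch shows the trade-off the paragraph describes: storage shrinks by a factor of four, while each weight moves by at most half of one quantization step. Whether that error is acceptable depends on the model and task, which is why quantized models are normally re-evaluated before deployment.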

@fig-vMLsizes illustrates the key differences between Cloud ML, Edge ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud to Edge to TinyML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models. This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for TinyML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. We will learn to put these things into perspective in this chapter.

![From cloud GPUs to microcontrollers: Navigating the memory and storage landscape across computing devices. Source: [@lin2023tiny]](./images/jpg/cloud_mobile_tiny_sizes.jpg){#fig-vMLsizes}
