From d17e8ef1679ccb808d241ca13fabaa1d62d92c94 Mon Sep 17 00:00:00 2001
From: Marcelo Rovai
Date: Fri, 6 Oct 2023 15:07:35 -0300
Subject: [PATCH 1/2] Add files via upload

---
 embedded_ml_exercise.qmd | 742 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 742 insertions(+)
 create mode 100644 embedded_ml_exercise.qmd

diff --git a/embedded_ml_exercise.qmd b/embedded_ml_exercise.qmd
new file mode 100644
index 00000000..8b9a0f8a
--- /dev/null
+++ b/embedded_ml_exercise.qmd
@@ -0,0 +1,742 @@
---
title: "4 Embedded AI - Exercise: Image Classification"
format:
  html:
    code-fold: false
execute:
  eval: false
jupyter: python3
---

### **Introduction**

As we begin our studies of embedded machine learning, or tinyML, it is impossible to overlook the transformative impact of Computer Vision (CV) and Artificial Intelligence (AI) in our lives. These two intertwined disciplines redefine what machines can perceive and accomplish, from autonomous vehicles and robotics to healthcare and surveillance.

More and more, we are facing an artificial intelligence (AI) revolution in which, as Gartner states, **Edge AI** has very high impact potential, **and the time for it is now**!

![](images_4/media/image2.jpg){width="4.729166666666667in" height="4.895833333333333in"}

At the center (the "bull's-eye") of Gartner's emerging-technologies radar is *Edge Computer Vision*, and when we talk about Machine Learning (ML) applied to vision, the first thing that comes to mind is **Image Classification**, a kind of ML "Hello World"!

This exercise explores a computer vision project that uses Convolutional Neural Networks (CNNs) for real-time image classification. Leveraging TensorFlow's robust ecosystem, we'll use a pre-trained MobileNet model and adapt it for edge deployment. The focus will be on optimizing the model to run efficiently on resource-constrained hardware without sacrificing accuracy.

We'll employ techniques like quantization and pruning to reduce the computational load. By the end of this tutorial, you'll have a working prototype capable of classifying images in real time, all running on a low-power embedded system based on the Arduino Nicla Vision board.

### **Computer Vision**

At its core, computer vision aims to enable machines to interpret and make decisions based on visual data from the world, essentially mimicking the capability of the human optical system. AI, in turn, is a broader field encompassing machine learning, natural language processing, and robotics, among other technologies. When you bring AI algorithms into computer vision projects, you supercharge the system's ability to understand, interpret, and react to visual stimuli.

When discussing Computer Vision projects applied to embedded devices, the most common applications that come to mind are *Image Classification* and *Object Detection*.

![](images_4/media/image15.jpg){width="6.5in" height="2.8333333333333335in"}

Both models can be implemented on tiny devices like the Arduino Nicla Vision and used in real projects. Let's start with the first one.

### **Image Classification Project**

The first step in any ML project is to define our goal. In this case, it is to detect and classify two specific objects present in an image. For this project, we will use two small toys: a *robot* and a small Brazilian parrot (named *Periquito*). We will also collect images of a *background* where those two objects are absent.
![](images_4/media/image36.jpg){width="6.5in" height="3.638888888888889in"}

### **Data Collection**

Once you have defined your Machine Learning project goal, the next and most crucial step is dataset collection. You can use the Edge Impulse Studio, the OpenMV IDE we installed, or even your phone for image capture. Here, we will use the OpenMV IDE.

**Collecting the Dataset with the OpenMV IDE**

First, create a folder on your computer where your data will be saved, for example, "data." Next, in the OpenMV IDE, go to Tools > Dataset Editor and select New Dataset to start the dataset collection:

![](images_4/media/image29.png){width="6.291666666666667in" height="4.010416666666667in"}

The IDE will ask you to open the folder where your data will be saved; choose the "data" folder you just created. Note that new icons will appear in the left panel.

![](images_4/media/image46.png){width="0.9583333333333334in" height="1.5208333333333333in"}

Using the upper icon (1), enter the first class name, for example, "periquito":

![](images_4/media/image22.png){width="3.25in" height="2.65625in"}

Run dataset_capture_script.py; clicking the bottom icon (2) will start capturing images:

![](images_4/media/image43.png){width="6.5in" height="4.041666666666667in"}

Repeat the same procedure for the other classes:

![](images_4/media/image6.jpg){width="6.5in" height="3.0972222222222223in"}

> *We suggest around 60 images from each category. Try to capture different angles, backgrounds, and light conditions.*

The stored images use a QVGA frame size (320x240) and the RGB565 color pixel format.

After capturing your dataset, close the Dataset Editor tool (Tools > Dataset Editor).

On your computer, you will end up with a dataset containing three classes: periquito, robot, and background.

![](images_4/media/image20.png){width="6.5in" height="2.2083333333333335in"}

You should then return to Edge Impulse Studio and upload the dataset to your project.
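Before you do, it can be worth a quick sanity check of the captured files on your computer. The sketch below is only an illustration, assuming one sub-folder per class inside the "data" folder with JPEG images at QVGA resolution; adjust the paths and extensions to whatever the Dataset Editor actually created on your machine.

```{python}
# Hypothetical sanity check of the images captured with the OpenMV IDE.
# Assumes data/<class_name>/*.jpg at QVGA (320x240); adjust as needed.
from pathlib import Path
from PIL import Image

DATA_DIR = Path("data")        # folder chosen in the OpenMV Dataset Editor
EXPECTED_SIZE = (320, 240)     # QVGA frames saved by dataset_capture_script.py

for class_dir in sorted(p for p in DATA_DIR.iterdir() if p.is_dir()):
    images = list(class_dir.glob("*.jpg"))
    wrong_size = [f for f in images if Image.open(f).size != EXPECTED_SIZE]
    print(f"{class_dir.name}: {len(images)} images, "
          f"{len(wrong_size)} with an unexpected size")
```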
### **Training the model with Edge Impulse Studio**

We will use the Edge Impulse Studio to train our model. Enter your account credentials at Edge Impulse and create a new project:

![](images_4/media/image45.png){width="6.5in" height="4.263888888888889in"}

> *Here, you can clone a similar project: [NICLA-Vision_Image_Classification](https://studio.edgeimpulse.com/public/273858/latest).*

### **Dataset**

Using the EI Studio (or *Studio*), we will go through four main steps to have our model ready for use on the Nicla Vision board: Dataset, Impulse, Tests, and Deploy (on the edge device, in this case, the NiclaV).

![](images_4/media/image41.jpg){width="6.5in" height="4.194444444444445in"}

Regarding the Dataset, it is essential to point out that our original dataset, captured with the OpenMV IDE, will be split into three parts: Training, Validation, and Test. The Test set will be separated from the beginning and set aside, to be used only in the Test phase after training. The Validation set will be used during training.

![](images_4/media/image7.jpg){width="6.5in" height="4.763888888888889in"}

In the Studio, go to the Data acquisition tab, and in the UPLOAD DATA section, upload the files of each chosen category from your computer:

![](images_4/media/image39.png){width="6.5in" height="4.263888888888889in"}

Leave it to the Studio to automatically split the original dataset into training and test sets, and choose the label corresponding to that specific data:

![](images_4/media/image30.png){width="6.5in" height="4.263888888888889in"}

Repeat the procedure for all three classes. At the end, you should see your "raw data" in the Studio:

![](images_4/media/image11.png){width="6.5in" height="4.263888888888889in"}

The Studio allows you to explore your data, showing a complete view of all the data in your project. You can clear, inspect, or change labels by clicking on individual data items. In our case, a simple project, the data seems OK.

![](images_4/media/image44.png){width="6.5in" height="4.263888888888889in"}

### **The Impulse Design**

In this phase, we should define how to:

- Pre-process our data, which consists of resizing the individual images and determining the color depth to use (RGB or Grayscale), and

- Design a model, which in this case will be "Transfer Learning (Images)", fine-tuning a pre-trained MobileNet V2 image classification model on our data. This method performs well even with relatively small image datasets (around 150 images in our case).

![](images_4/media/image23.jpg){width="6.5in" height="4.0in"}

Transfer Learning with MobileNet offers a streamlined approach to model training, which is especially beneficial for resource-constrained environments and projects with limited labeled data. MobileNet, known for its lightweight architecture, is a pre-trained model that has already learned valuable features from a large dataset (ImageNet).

![](images_4/media/image9.jpg){width="6.5in" height="1.9305555555555556in"}

By leveraging these learned features, you can train a new model for your specific task with less data and fewer computational resources and still achieve competitive accuracy.

![](images_4/media/image32.jpg){width="6.5in" height="2.3055555555555554in"}

This approach significantly reduces training time and computational cost, making it ideal for quick prototyping and deployment on embedded devices where efficiency is paramount.

Go to the Impulse Design tab and create the *impulse*, defining an image size of 96x96 and squashing the images (square shape, without cropping). Select the Image and Transfer Learning blocks. Save the Impulse.

![](images_4/media/image16.png){width="6.5in" height="4.263888888888889in"}

### **Image Pre-Processing**

All input QVGA/RGB565 images will be converted to 27,648 features (96x96x3).

![](images_4/media/image17.png){width="6.5in" height="4.319444444444445in"}

Press [Save parameters] and then Generate all features:

![](images_4/media/image5.png){width="6.5in" height="4.263888888888889in"}
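The Studio performs this conversion for you; purely as an illustration, the equivalent pre-processing in plain Python would look like the sketch below (the file name is hypothetical).

```{python}
# Illustration only: "squash" a QVGA capture to the 96x96 RGB model input.
# 96 * 96 * 3 = 27,648 features per image.
import numpy as np
from PIL import Image

img = Image.open("data/periquito/example.jpg")   # hypothetical file name
img = img.convert("RGB").resize((96, 96))        # squash: aspect ratio not kept
features = np.asarray(img, dtype=np.float32).flatten()
print(features.shape)                            # (27648,)
```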
### **Model Design**

In 2017, Google introduced [[MobileNetV1]{.underline}](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html), a family of general-purpose computer vision neural networks designed with mobile devices in mind to support classification, detection, and more. MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of various use cases. In 2018, Google launched [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381).

MobileNet V1 and MobileNet V2 both aim at mobile efficiency and embedded vision applications but differ in architectural complexity and performance. While both use depthwise separable convolutions to reduce the computational cost, MobileNet V2 introduces Inverted Residual Blocks and Linear Bottlenecks to enhance performance. These new features allow V2 to capture more complex features using fewer parameters, making it computationally more efficient and generally more accurate than its predecessor. Additionally, V2 employs a non-linear activation in the intermediate expansion layer but a linear activation in the bottleneck layer, a design choice found to better preserve important information through the network. MobileNet V2 offers a more optimized architecture for higher accuracy and efficiency and will be used in this project.

Although the base MobileNet architecture is already small and has low latency, a specific use case or application may often require the model to be even smaller and faster. To construct these smaller, less computationally expensive models, MobileNet introduces a straightforward parameter, α (alpha), called the width multiplier. The role of the width multiplier α is to thin the network uniformly at each layer.

Edge Impulse Studio offers MobileNetV1 (96x96 images) and V2 (96x96 and 160x160 images), with several different **α** values (from 0.05 to 1.0). For example, you will get the highest accuracy with V2, 160x160 images, and α=1.0. Of course, there is a trade-off: the higher the accuracy, the more memory (around 1.3M RAM and 2.6M ROM) will be needed to run the model, implying more latency. The smallest footprint is obtained at the other extreme, with MobileNetV1 and α=0.10 (around 53.2K RAM and 101K ROM).

![](images_4/media/image27.jpg){width="6.5in" height="3.5277777777777777in"}

For this project, we will use **MobileNetV2 96x96 0.1**, which has an estimated memory cost of 265.3 KB of RAM. This model should be fine for the Nicla Vision, which has 1MB of SRAM. On the Transfer Learning tab, select this model:

![](images_4/media/image24.png){width="6.5in" height="4.263888888888889in"}
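To get a feel for what the width multiplier does, you can build MobileNetV2 backbones with different α values in Keras and compare their parameter counts. This is only an illustration (it is not the Studio's training code), and `weights=None` is used because pre-trained ImageNet weights are not published for every α/input-size combination.

```{python}
# Illustration of the width multiplier: smaller alpha => thinner, cheaper model.
import tensorflow as tf

for alpha in (0.1, 0.35, 1.0):
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=(96, 96, 3), alpha=alpha,
        include_top=False, weights=None)
    print(f"alpha={alpha}: {backbone.count_params():,} parameters")
```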
Another helpful technique when working with Deep Learning is **Data Augmentation**. Data augmentation is a method that can help improve the accuracy of machine learning models by creating additional artificial data. A data augmentation system makes small, random changes to your training data during the training process (such as flipping, cropping, or rotating the images).

Under the hood, here is how Edge Impulse implements a data augmentation policy on your data:

```{python}
# Implements the data augmentation policy
# (excerpt from the training code generated by Edge Impulse;
# INPUT_SHAPE is the model input shape, e.g., (96, 96, 3))
import math
import random

import tensorflow as tf


def augment_image(image, label):
    # Flips the image randomly
    image = tf.image.random_flip_left_right(image)

    # Increase the image size, then randomly crop it down to
    # the original dimensions
    resize_factor = random.uniform(1, 1.2)
    new_height = math.floor(resize_factor * INPUT_SHAPE[0])
    new_width = math.floor(resize_factor * INPUT_SHAPE[1])
    image = tf.image.resize_with_crop_or_pad(image, new_height, new_width)
    image = tf.image.random_crop(image, size=INPUT_SHAPE)

    # Vary the brightness of the image
    image = tf.image.random_brightness(image, max_delta=0.2)

    return image, label
```

Exposure to these variations during training can help prevent your model from taking shortcuts by "memorizing" superficial clues in your training data, meaning it may better reflect the deep underlying patterns in your dataset.

The final dense layer of our model will have 12 neurons with a 15% dropout for overfitting prevention. Here is the training result:

![](images_4/media/image31.jpg){width="6.5in" height="3.5in"}

The result is excellent, with 77 ms of latency, which should result in about 13 fps (frames per second) during inference (1/0.077 s ≈ 13).

### **Model Testing**

![](images_4/media/image10.jpg){width="6.5in" height="3.8472222222222223in"}

Now you should take the data set aside at the start of the project and run the trained model with it as input:

![](images_4/media/image34.png){width="3.1041666666666665in" height="1.7083333333333333in"}

The result was, again, excellent.

![](images_4/media/image12.png){width="6.5in" height="4.263888888888889in"}

### **Deploying the model**

At this point, we can deploy the trained model as a .tflite file and use the OpenMV IDE to run it with MicroPython, or we can deploy it as a C/C++ library or an Arduino library.

![](images_4/media/image28.jpg){width="6.5in" height="3.763888888888889in"}
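Edge Impulse takes care of generating the deployable (typically int8-quantized) .tflite file for you. Purely as a reference for what "quantization" means here, full-integer post-training quantization with the TensorFlow Lite converter looks roughly like the sketch below; `model` and `representative_images` are placeholders for a trained Keras model and a small batch of pre-processed 96x96x3 training images.

```{python}
# Hedged sketch of full-integer post-training quantization (not the Studio's
# internal code). `model` and `representative_images` are placeholders.
import tensorflow as tf

def representative_dataset():
    for image in representative_images[:100]:
        yield [tf.expand_dims(tf.cast(image, tf.float32), 0)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("trained_int8.tflite", "wb") as f:
    f.write(converter.convert())
```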
**Arduino Library**

First, let's deploy the model as an Arduino library:

![](images_4/media/image48.png){width="6.5in" height="4.263888888888889in"}

You should install the library as a .zip file in the Arduino IDE and run the sketch nicla_vision_camera.ino, available in Examples under your library's name.

> *Note that the Arduino Nicla Vision has, by default, 512KB of RAM allocated for the M7 core and an additional 244KB on the M4 address space. In the code, this allocation was changed to 288 kB to guarantee that the model will run on the device (malloc_addblock((void\*)0x30000000, 288 \* 1024);).*

The result was good, with 86 ms of measured latency.

![](images_4/media/image25.jpg){width="6.5in" height="3.4444444444444446in"}

Here is a short video showing the inference results: [[https://youtu.be/bZPZZJblU-o]{.underline}](https://youtu.be/bZPZZJblU-o)

**OpenMV**

It is possible to deploy the trained model to be used with OpenMV in two ways: as a library or as firmware.

When deployed as a library, three files are generated: the .tflite model, a list with the labels, and a simple MicroPython script that can run inference with the model.

![](images_4/media/image26.png){width="6.5in" height="1.0in"}

Running this model as a .tflite file directly on the Nicla was not possible. So, we can either sacrifice some accuracy by using a smaller model or deploy the model as OpenMV Firmware (FW). As firmware, Edge Impulse Studio generates the optimized model, libraries, and frameworks needed to run the inference. Let's explore this last option.

Select OpenMV Firmware on the Deploy tab and press [Build].

![](images_4/media/image3.png){width="6.5in" height="4.263888888888889in"}

On your computer, you will find a ZIP file. Open it:

![](images_4/media/image33.png){width="6.5in" height="2.625in"}

Use the Bootloader tool on the OpenMV IDE to load the FW onto your board:

![](images_4/media/image35.jpg){width="6.5in" height="3.625in"}

Select the appropriate file (.bin for the Nicla Vision):

![](images_4/media/image8.png){width="6.5in" height="1.9722222222222223in"}

After the download is finished, press OK:

![](images_4/media/image40.png){width="3.875in" height="5.708333333333333in"}

If a message says that the FW is outdated, DO NOT UPGRADE. Select [NO].

![](images_4/media/image42.png){width="4.572916666666667in" height="2.875in"}

Now, open the script **ei_image_classification.py** that was downloaded from the Studio together with the .bin file for the Nicla.

![](images_4/media/image14.png){width="6.5in" height="4.0in"}

Run it. Pointing the camera at the objects we want to classify, the inference results will be displayed on the Serial Terminal.

![](images_4/media/image37.png){width="6.5in" height="3.736111111111111in"}

**Changing the code to add labels**

The code provided by Edge Impulse can be modified so that, for testing purposes, we can see the inference result directly on the image displayed in the OpenMV IDE.

[[Upload the code from GitHub,]{.underline}](https://github.com/Mjrovai/Arduino_Nicla_Vision/blob/main/Micropython/nicla_image_classification.py) or modify it as below:

```{python}
# Marcelo Rovai - NICLA Vision - Image Classification
# Adapted from Edge Impulse - OpenMV Image Classification Example
# @24Aug23

import sensor, image, time, os, tf, uos, gc

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = None
labels = None

try:
    # Load the built-in model
    labels, net = tf.load_builtin_model('trained')
except Exception as e:
    raise Exception(e)

clock = time.clock()
while(True):
    clock.tick()  # Starts tracking elapsed time.

    img = sensor.snapshot()

    # Default settings just do one detection
    for obj in net.classify(img,
                            min_scale=1.0,
                            scale_mul=0.8,
                            x_overlap=0.5,
                            y_overlap=0.5):
        fps = clock.fps()
        lat = clock.avg()

        print("**********\nPrediction:")
        img.draw_rectangle(obj.rect())
        # Combine the labels and confidence values into a list of tuples
        predictions_list = list(zip(labels, obj.output()))

        max_val = predictions_list[0][1]
        max_lbl = 'background'
        for i in range(len(predictions_list)):
            val = predictions_list[i][1]
            lbl = predictions_list[i][0]

            if val > max_val:
                max_val = val
                max_lbl = lbl

        # Print the label with the highest probability
        if max_val < 0.5:
            max_lbl = 'uncertain'
        print("{} with a prob of {:.2f}".format(max_lbl, max_val))
        print("FPS: {:.2f} fps ==> latency: {:.0f} ms".format(fps, lat))

        # Draw the label with the highest probability on the image viewer
        img.draw_string(
            10, 10,
            max_lbl + "\n{:.2f}".format(max_val),
            mono_space = False,
            scale=2
        )
```
Here you can see the result:

![](images_4/media/image47.jpg){width="6.5in" height="2.9444444444444446in"}

Note that the latency (136 ms) is almost double what we got directly with the Arduino sketch. This is because we are using the IDE as an interface and because the clock also counts the time spent waiting for the camera to be ready. If we start the clock just before the inference:

![](images_4/media/image13.jpg){width="6.5in" height="2.0972222222222223in"}

The latency drops to only 71 ms.

![](images_4/media/image1.jpg){width="3.5520833333333335in" height="1.53125in"}

> *The NiclaV runs about half as fast when connected to the IDE. The FPS should increase once disconnected.*
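If you want to time only the classify call yourself, a minimal sketch using MicroPython's millisecond tick helpers could look like the one below; it assumes the same `net` and `img` objects created in the script above.

```{python}
# Minimal sketch: measure only the inference time, ignoring capture and
# IDE overhead. Assumes `net` and `img` exist as in the script above.
import time

start = time.ticks_ms()                              # millisecond counter
objs = net.classify(img)                             # single inference
elapsed = time.ticks_diff(time.ticks_ms(), start)    # wrap-safe difference
print("inference only: {} ms".format(elapsed))
```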
### **Post-Processing with LEDs**

When working with embedded machine learning, we are looking for devices that can continuously run inference and act on the results directly in the physical world, rather than displaying them on a connected computer. To simulate this, we will define one LED to light up for each of the possible inference results.

![](images_4/media/image38.jpg){width="6.5in" height="3.236111111111111in"}

For that, we should [[upload the code from GitHub]{.underline}](https://github.com/Mjrovai/Arduino_Nicla_Vision/blob/main/Micropython/nicla_image_classification_LED.py) or modify the previous code to include the LEDs:

```{python}
# Marcelo Rovai - NICLA Vision - Image Classification with LEDs
# Adapted from Edge Impulse - OpenMV Image Classification Example
# @24Aug23

import sensor, image, time, os, tf, uos, gc, pyb

ledRed = pyb.LED(1)
ledGre = pyb.LED(2)
ledBlu = pyb.LED(3)

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = None
labels = None

ledRed.off()
ledGre.off()
ledBlu.off()

try:
    # Load the built-in model
    labels, net = tf.load_builtin_model('trained')
except Exception as e:
    raise Exception(e)

clock = time.clock()


def setLEDs(max_lbl):

    if max_lbl == 'uncertain':
        ledRed.on()
        ledGre.off()
        ledBlu.off()

    if max_lbl == 'periquito':
        ledRed.off()
        ledGre.on()
        ledBlu.off()

    if max_lbl == 'robot':
        ledRed.off()
        ledGre.off()
        ledBlu.on()

    if max_lbl == 'background':
        ledRed.off()
        ledGre.off()
        ledBlu.off()


while(True):
    img = sensor.snapshot()
    clock.tick()  # Starts tracking elapsed time.

    # Default settings just do one detection.
    for obj in net.classify(img,
                            min_scale=1.0,
                            scale_mul=0.8,
                            x_overlap=0.5,
                            y_overlap=0.5):
        fps = clock.fps()
        lat = clock.avg()

        print("**********\nPrediction:")
        img.draw_rectangle(obj.rect())
        # Combine the labels and confidence values into a list of tuples
        predictions_list = list(zip(labels, obj.output()))

        max_val = predictions_list[0][1]
        max_lbl = 'background'
        for i in range(len(predictions_list)):
            val = predictions_list[i][1]
            lbl = predictions_list[i][0]

            if val > max_val:
                max_val = val
                max_lbl = lbl

        # Print the label and turn on the LED with the highest probability
        if max_val < 0.8:
            max_lbl = 'uncertain'

        setLEDs(max_lbl)

        print("{} with a prob of {:.2f}".format(max_lbl, max_val))
        print("FPS: {:.2f} fps ==> latency: {:.0f} ms".format(fps, lat))

        # Draw the label with the highest probability on the image viewer
        img.draw_string(
            10, 10,
            max_lbl + "\n{:.2f}".format(max_val),
            mono_space = False,
            scale=2
        )
```

Now, each time a class scores above 0.8, the corresponding LED will light up as follows:

- Red LED on: Uncertain (no class is over 0.8)

- Green LED on: Periquito > 0.8

- Blue LED on: Robot > 0.8

- All LEDs off: Background > 0.8

Here is the result:

![](images_4/media/image18.jpg){width="6.5in" height="3.6527777777777777in"}

In more detail:

![](images_4/media/image21.jpg){width="6.5in" height="2.0972222222222223in"}

### **Image Classification (non-official) Benchmark**

Several development boards can be used for embedded machine learning (tinyML). The most common ones for low-power Computer Vision applications are the ESP32-CAM, the Seeed XIAO ESP32S3 Sense, and the Arduino Nicla Vision and Portenta.

![](images_4/media/image19.jpg){width="6.5in" height="4.194444444444445in"}

Taking the opportunity, the same trained model was deployed on the ESP32-CAM, the XIAO, and the Portenta (for the latter, the model was retrained using grayscale images to be compatible with its camera). Here is the result, deploying the models as Arduino libraries:

![](images_4/media/image4.jpg){width="6.5in" height="3.4444444444444446in"}

### **Conclusion**

Before we finish, consider that Computer Vision is more than just image classification. For example, you can develop Edge Machine Learning projects around vision in several areas, such as:

- **Autonomous Vehicles**: Use sensor fusion, lidar data, and computer vision algorithms to navigate and make decisions.

- **Healthcare**: Automated diagnosis of diseases through MRI, X-ray, and CT scan image analysis.

- **Retail**: Automated checkout systems that identify products as they pass through a scanner.

- **Security and Surveillance**: Facial recognition, anomaly detection, and object tracking in real-time video feeds.
- **Augmented Reality**: Object detection and classification to overlay digital information in the real world.

- **Industrial Automation**: Visual inspection of products, predictive maintenance, and robot and drone guidance.

- **Agriculture**: Drone-based crop monitoring and automated harvesting.

- **Natural Language Processing**: Image captioning and visual question answering.

- **Gesture Recognition**: For gaming, sign language translation, and human-machine interaction.

- **Content Recommendation**: Image-based recommendation systems in e-commerce.
\ No newline at end of file

From d9d07f294c539cd8ee2089ab192d75509a2c519f Mon Sep 17 00:00:00 2001
From: Marcelo Rovai
Date: Fri, 6 Oct 2023 15:09:21 -0300
Subject: [PATCH 2/2] Delete embedded_ml_exercise.qmd

---
 embedded_ml_exercise.qmd | 745 --------------------------------------
 1 file changed, 745 deletions(-)
 delete mode 100644 embedded_ml_exercise.qmd

diff --git a/ embedded_ml_exercise.qmd b/ embedded_ml_exercise.qmd
deleted file mode 100644
index d07ef188..00000000
--- a/ embedded_ml_exercise.qmd
+++ /dev/null