
Commit

add sidenotes to frameworks and cleaned up bug fixes for sidenotes in previous chapters
elizakimball committed Nov 14, 2024
1 parent 33b8362 commit e054337
Showing 3 changed files with 52 additions and 21 deletions.
3 changes: 3 additions & 0 deletions contents/core/data_engineering/data_engineering.qmd
@@ -115,6 +115,7 @@ In this context, using KWS as an example, we can break each of the steps out as
* Environmental Challenges: Devices might be deployed in various environments, from quiet bedrooms to noisy industrial settings. The KWS system must be robust enough to function effectively across these scenarios.

[^island]: The always-on island of the SoC refers to a subsystem that is specialized to handle low-power, always-on tasks within the embedded device such as wake-up commands. It continuously monitors specific sensors and controls the power management functions to wake up various components of the device when necessary. By allowing different power states for various components, the always-on island ensures efficient energy usage and quick response time.

6. **Data Collection and Analysis:**
For a KWS system, the quality and diversity of data are paramount. Considerations might include:
* Variety of Accents: Collect data from speakers with various accents to ensure wide-ranging recognition.
@@ -125,6 +126,7 @@ In this context, using KWS as an example, we can break each of the steps out as
Once a prototype KWS system is developed, it's crucial to test it in real-world scenarios[^user-input], gather feedback, and iteratively refine the model. This ensures that the system remains aligned with the defined problem and objectives, which matters because deployment scenarios change over time.

[^user-input]: When refining a model based on user input, it is essential to ensure privacy laws and regulations are followed. Additionally, the real-world environment may not be representative of the broader population, which can introduce biases into the system.

:::{#exr-kws .callout-caution collapse="true"}

### Keyword Spotting with TensorFlow Lite Micro
@@ -234,6 +236,7 @@ Many embedded use cases deal with unique situations, such as manufacturing plant
While synthetic data offers numerous advantages, it is essential to use it judiciously[^synethic-balance]. Care must be taken to ensure that the generated data accurately represents the underlying real-world distributions and does not introduce unintended biases.

[^synethic-balance]: Synthetic data should be balanced with real-world data to ensure models remain reliable. If ML models are overly trained on synthetic data, the outputs may become nonsensical and the model may collapse.

:::{#exr-sd .callout-caution collapse="true"}

### Synthetic Data