This master's degree project develops a system for recognizing Russian Sign Language (RSL) gestures from EMG and inertial sensor data. The dataset covers the RSL dactyl (fingerspelling) alphabet and basic phrases such as "Привет" (Hello), "Спасибо" (Thank you), and "До свидания" (Goodbye).
- Project Overview
- System Architecture
- Hardware Components
- Data Collection
- Data Processing
- Machine Learning Models
- Results
- Conclusion
- Scripts Overview
- Configuration
- Running the Project
- STM32 Submodule
- Contributions
The goal of this project is to develop a system that accurately recognizes Russian Sign Language gestures from EMG and inertial sensor signals. The system processes these signals to extract meaningful features and classifies them with machine learning models.
The system is divided into three main modules:
- Sensor Module: Consists of the STM32F103RB Nucleo board and the sensors.
- Signal Processing Module: Includes preprocessing and feature extraction from time and frequency domains.
- Classification Module: Involves training and using machine learning models for gesture recognition.
- STM32F103RB Nucleo Board (firmware source: see the STM32 Submodule section)
- Grove-EMG Sensors: 4 pcs.
- MPU6050 Sensor: Accelerometer + Gyroscope.
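The firmware acquires the sensor data and streams it to the host, presumably over the Nucleo board's virtual COM port. As a rough illustration (the port name, baud rate, and frame layout below are assumptions, not taken from the firmware), reading such a stream with pyserial could look like this:

```python
import serial  # pyserial

# Hypothetical connection settings; the real ones depend on the STM32 firmware.
PORT, BAUD = "/dev/ttyACM0", 115200

with serial.Serial(PORT, BAUD, timeout=1) as link:
    while True:
        raw = link.readline().decode(errors="ignore").strip()
        if not raw:
            continue
        try:
            # Assumed frame: 4 EMG channels + 6 IMU values, comma-separated.
            values = [float(v) for v in raw.split(",")]
        except ValueError:
            continue  # skip malformed frames
        emg, imu = values[:4], values[4:10]
        print(emg, imu)
```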
- Normalization: Min-max normalization against static (fixed) levels
- Filtering: Band-pass and low-pass Butterworth filters
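A minimal preprocessing sketch with SciPy; the sampling rate, cutoff frequencies, and normalization levels below are illustrative placeholders, not the values from `emg_parameters.json`:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1000.0  # sampling rate in Hz (placeholder)

def bandpass(x, low=20.0, high=450.0, order=4):
    # Band-pass Butterworth for the EMG channels (typical EMG cutoffs).
    b, a = butter(order, [low / (FS / 2), high / (FS / 2)], btype="band")
    return filtfilt(b, a, x)

def lowpass(x, cutoff=5.0, order=4):
    # Low-pass Butterworth, e.g. for smoothing the inertial channels.
    b, a = butter(order, cutoff / (FS / 2), btype="low")
    return filtfilt(b, a, x)

def minmax_static(x, lo, hi):
    # Min-max normalization against fixed ("static") calibration levels.
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)
```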
Features are extracted from both the time and frequency domains.

Time-domain features:
- Energy (EN)
- Mean Absolute Value (MAV)
- Mean Absolute Deviation (MAD)
- Waveform Length (WL)
- Standard Deviation (STD)
- Slope Sign Change (SSC)
- Zero Crossing (ZC)
- Root Mean Square (RMS)
- Number of Peaks (NP)
- Skewness (SKEW)
- Kurtosis (KURT)
- Variance (VAR)
- Wilson Amplitude (WA)
- Percentile (PERC)
- Integral Absolute Value (IAV)

Frequency-domain features:
- Mean Frequency (MNF)
- Median Frequency (MDF)
- Mean Power (MNP)
- Total Power (TTP)
- Peak Frequency (PKF)
- Spectral Entropy (SE)
- Frequency Ratio (FR)
- Power Spectrum Ratio (PSR)
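For illustration, a few of the listed features computed with NumPy over a single window of one channel (standard textbook definitions; windowing and parameters are simplified):

```python
import numpy as np

def time_features(x):
    x = np.asarray(x, dtype=float)
    return {
        "MAV": np.mean(np.abs(x)),              # Mean Absolute Value
        "RMS": np.sqrt(np.mean(x ** 2)),        # Root Mean Square
        "WL": np.sum(np.abs(np.diff(x))),       # Waveform Length
        "ZC": int(np.sum(x[:-1] * x[1:] < 0)),  # Zero Crossings
    }

def freq_features(x, fs):
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mnf = np.sum(freqs * power) / np.sum(power)     # Mean Frequency
    cum = np.cumsum(power)
    mdf = freqs[np.searchsorted(cum, cum[-1] / 2)]  # Median Frequency
    return {"MNF": mnf, "MDF": mdf}
```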
Features are selected using a correlation matrix and the mutual information method.
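A sketch of one possible selection pipeline with pandas and scikit-learn; the correlation threshold and the number of kept features are illustrative, not the project's actual settings:

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

def select_features(X: pd.DataFrame, y, corr_threshold=0.95, top_k=30):
    # 1) Drop one feature of every highly correlated pair.
    corr = X.corr().abs()
    drop = set()
    cols = list(X.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a not in drop and b not in drop and corr.loc[a, b] > corr_threshold:
                drop.add(b)
    X = X.drop(columns=sorted(drop))
    # 2) Keep the top_k features by mutual information with the gesture label.
    mi = mutual_info_classif(X, y)
    keep = X.columns[np.argsort(mi)[::-1][:top_k]]
    return X[keep]
```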
Multiple machine learning models were trained using the extracted features:
- K-Nearest Neighbors (KNN)
- Support Vector Machine (SVM)
- Linear Discriminant Analysis (LDA)
- Decision Tree (DT)
- Gradient Boosting (GB)
- Random Forest (RF)
- Naive Bayes (NB)
The dataset was recorded over three sessions, each separated by a day. For each gesture, 125 files were recorded and split 80%/20% into training and test sets (a training sketch follows the sensor-placement list below).
- EMG Sensors (muscles):
  - Flexor Digitorum Superficialis
  - Flexor Pollicis Longus
  - Extensor Digitorum
  - Extensor Pollicis Brevis
- Inertial Sensor:
  - Outer side of the hand
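A minimal training sketch with scikit-learn showing the 80/20 split and the LDA classifier; the feature matrix and labels below are synthetic stand-ins, and the output file name is hypothetical:

```python
import joblib
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the real feature matrix and gesture labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))
y = rng.integers(0, 10, size=1000)

# 80% training / 20% testing, stratified per gesture.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = LinearDiscriminantAnalysis().fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

joblib.dump(model, "model.pkl")  # the training script likewise saves a .pkl file
```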
All models were trained on the training set and evaluated on the held-out test set (20% of the dataset). Test accuracies:
| Model | Accuracy |
| --- | --- |
| Decision Tree | 0.983 |
| Gradient Boosting | 0.994 |
| KNN | 0.965 |
| LDA | 0.996 |
| Naive Bayes | 0.989 |
| Random Forest | 0.994 |
| SVM | 0.991 |
The LDA model, having the highest accuracy, was selected for real-time testing.
For the real-time test, 30 repetitions of each gesture were recorded, and a confusion matrix was built from the predictions. Real-time metric values:
| Metric | Value |
| --- | --- |
| Accuracy | 0.994 |
| Recall | 0.889 |
| Precision | 0.910 |
| F1 Score | 0.892 |
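For reference, such a report can be computed from the recorded predictions with scikit-learn; macro averaging over the gesture classes is an assumption:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

def realtime_report(y_true, y_pred):
    # y_true: gestures shown during the live test; y_pred: model outputs.
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="macro"),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "f1": f1_score(y_true, y_pred, average="macro"),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }
```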
The results indicate a high gesture-recognition accuracy of 0.994. Notably, the inertial sensor had the greatest impact on classification performance, while the EMG sensors were less significant. This suggests that, although the EMG data contributes to the overall system, it may not provide the robust information needed for reliable classification on its own.
Records the dataset.
- Opens a matplotlib window for sensor data visualization
- Allows for gesture selection, file saving, and deletion
Views recorded dataset files.
- Matplotlib visualization with gesture and file navigation
Trains machine learning models on the recorded dataset.
- Saves model info file
- Saves .pkl model file
- Saves .csv file with confusion matrix and performance metrics
Real-time gesture recognition.
- Displays sensor data using matplotlib.
- Outputs recognized gesture and confidence score.
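A minimal sketch of the per-window classification step, assuming the trained scikit-learn model was saved with joblib (the file name is hypothetical; the real one is set in `emg_parameters.json`):

```python
import joblib
import numpy as np

model = joblib.load("model.pkl")  # hypothetical path to the trained .pkl model

def classify(features: np.ndarray):
    # Returns the predicted gesture and its confidence for one feature vector.
    probs = model.predict_proba(features.reshape(1, -1))[0]
    best = int(np.argmax(probs))
    return model.classes_[best], float(probs[best])
```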
System parameters, including filter settings, feature parameters, the gesture set, sampling rate, data channels, and the model file, are defined in `emg_parameters.json`.
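For example, a script might read the configuration like this; the key names below are illustrative, not the file's actual schema:

```python
import json

with open("emg_parameters.json") as f:
    params = json.load(f)

# Hypothetical keys -- check emg_parameters.json for the real ones.
fs = params.get("sampling_rate")
gestures = params.get("gestures")
model_file = params.get("model_file")
```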
This project is configured to run in Visual Studio Code (VSCode). Each script has a corresponding task in `tasks.json`, which runs the script through `main.sh`. The `main.sh` script sets up a virtual environment with all the required modules listed in `requirements.txt`.
- Install [email protected]: It doesn't work with earlier versions.
- Open VSCode: Open the project folder in VSCode.
- Configure System: Edit `emg_parameters.json` as needed.
- Run Task: From the VSCode menu, go to `Terminal > Run Task` and choose the desired task.
The tasks configured in VSCode correspond to the scripts described above.
The project includes a submodule for the STM32 Firmware source, which handles sensor data acquisition and transmission. You can find the repo here.