Motion gesture recognition using a Convolutional Neural Network
For this project I used the EMNIST (Extended MNIST) dataset, which is freely available on the Internet. You can read about it in the paper: https://arxiv.org/pdf/1702.05373.pdf. The original dataset can be downloaded from the EMNIST page on the NIST website.
For this project I am using only the 'letters' split of the dataset.

To run this project you will need:
- Python 3.x
- Keras with TensorFlow as the backend
- OpenCV 3.4
- h5py
- Jupyter Notebook (to view and run train_model.ipynb and store_images.ipynb)
- pyautogui
- A good CPU
- A good GPU (not compulsory but recommended)
- Patience. A lot of it...
Here is how it works:

- If you watch the videos, you will see that I am wearing a purple thingy on my finger. It is just a piece of white paper painted purple. I am using color segmentation to separate the purple paper from everything else in the frame. Indirectly, this means I am tracking the movement of my finger.
- The movement is recorded. The parts where the finger moved are represented by white lines on the black part of the 'img' window, which I like to call the 'blackboard'.
- So the movement of the finger is now converted into a black and white image.
- This black and white image is then processed so that it can be fed to the neural network (which I trained myself) for prediction. The neural network has an accuracy of ~94%.
- Based on the prediction, a specific action is taken (a rough sketch of the tracking and preprocessing steps follows this list).
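To make this concrete, here is a minimal sketch of the tracking and preprocessing steps, assuming HSV bounds found with range-detector.py. The bounds, function names, and morphology settings below are illustrative assumptions, not the exact code from this repo; the 28x28 resize target matches the EMNIST image size.

```python
import cv2
import numpy as np

# Example HSV bounds for the purple paper -- find your own with range-detector.py
LOWER_HSV = np.array([130, 80, 80])
UPPER_HSV = np.array([170, 255, 255])

def track_and_draw(frame, blackboard, prev_point):
    """Segment the colored paper, find its center, and extend the white trail."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)   # white wherever the paper is
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # OpenCV 3.x returns (image, contours, hierarchy)
    _, contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        c = max(contours, key=cv2.contourArea)      # largest blob = the paper
        (x, y), _ = cv2.minEnclosingCircle(c)
        point = (int(x), int(y))
        if prev_point is not None:
            cv2.line(blackboard, prev_point, point, 255, 8)  # draw the white trail
        return point
    return None

def preprocess_for_cnn(blackboard):
    """Shrink the blackboard to the 28x28 grayscale input the CNN expects."""
    small = cv2.resize(blackboard, (28, 28))
    return small.reshape(1, 28, 28, 1).astype(np.float32) / 255.0
```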
Description of the files:

- train_model.ipynb - This is an IPython notebook, so you need Jupyter Notebook installed to open it. Use this file if you want to retrain the model.
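Not the exact notebook contents, but a rough sketch of what training a letters classifier in Keras might look like; the layer sizes and hyperparameters here are assumptions:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Illustrative architecture only -- the real notebook may differ
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(26, activation='softmax'),   # the 'letters' split has 26 classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, batch_size=128)
# model.save('cnn_model_keras21.h5')
```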
- store_images.ipynb - This is an IPython notebook, so you need Jupyter Notebook installed to open it. Use this file if you want to see what the pictures in the training dataset look like. The files will be stored in a newly created folder called 'emnist_dataset'.
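If you just want a feel for what the notebook does, here is a sketch of dumping EMNIST samples to disk. The .mat filename and the nested indexing into the MATLAB structure are assumptions based on the standard EMNIST download:

```python
import os
import cv2
from scipy.io import loadmat  # the original EMNIST download ships as MATLAB files

os.makedirs('emnist_dataset', exist_ok=True)
data = loadmat('emnist-letters.mat')
images = data['dataset'][0][0][0][0][0][0]   # training images, flattened to 784
labels = data['dataset'][0][0][0][0][0][1]

for i in range(100):                         # save the first 100 samples
    img = images[i].reshape(28, 28).T        # EMNIST stores images transposed
    cv2.imwrite('emnist_dataset/%d_label%d.png' % (i, labels[i][0]), img)
```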
- webcam_video_stream.py - This file contains a class called WebcamVideoStream. Its job is to send frames from the webcam to the program that calls it. It uses a thread to minimize latency. It is mainly used by the gesture_action_cnn.py file.
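This is a well-known pattern: a background thread keeps grabbing frames while the main loop always reads the most recent one. A sketch of what such a class typically looks like (the method names are assumptions):

```python
import cv2
from threading import Thread

class WebcamVideoStream:
    """Grab frames on a background thread so reads never block the main loop."""

    def __init__(self, src=0):
        self.stream = cv2.VideoCapture(src)
        self.grabbed, self.frame = self.stream.read()
        self.stopped = False

    def start(self):
        Thread(target=self.update, daemon=True).start()
        return self

    def update(self):
        while not self.stopped:
            self.grabbed, self.frame = self.stream.read()

    def read(self):
        return self.frame            # always the most recent frame

    def stop(self):
        self.stopped = True
        self.stream.release()
```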
- action.py - This file stores the actions that need to be taken for a specific gesture.
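A hedged sketch of what do_action() might look like; the signature and mode names are assumptions, and only a subset of the shortcut table (listed at the end of this README) is shown:

```python
import pyautogui

# Subset of the letter -> shortcut table (full list at the bottom of this README)
SHORTCUTS = {
    'A': ('ctrl', 'a'),   # select all
    'C': ('ctrl', 'c'),   # copy
    'V': ('ctrl', 'v'),   # paste
    'Z': ('ctrl', 'z'),   # undo
}

def do_action(letter, mode):
    if mode == 'typing':
        pyautogui.typewrite(letter.lower())    # type into the focused window
    elif mode == 'keyboard_shortcut' and letter in SHORTCUTS:
        pyautogui.hotkey(*SHORTCUTS[letter])   # emulate the mapped shortcut
```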
- range-detector.py - This file is used to set the HSV color range. The easiest way to use it is to put the colored paper in front of the camera, slowly increase the lower parameters (H_MIN, S_MIN, V_MIN) one by one, and then slowly decrease the upper parameters (H_MAX, S_MAX, V_MAX). Once the adjusting is done, only the paper should have a corresponding white patch while the rest of the image stays dark. Run it with:

python range-detector.py -f HSV -w
- cnn_model_keras21.h5 - This is the trained model.
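Loading it back in Keras is a one-liner (this needs h5py, which is why it is in the requirements):

```python
from keras.models import load_model

model = load_model('cnn_model_keras21.h5')
```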
- gesture_action_cnn.py - This is the file that you need to run to use this project. The trained neural network is loaded and then used for prediction. It calls do_action() from action.py to take a specific action for a gesture. It has 3 modes of usage:
  - Doodle / None - This is the simplest mode. Nothing really happens here; you can just draw doodles. This is the default mode.
  - typing - This mode is meant to be used with a text editor. Make sure the text editor is the currently focused window; the letters you draw with your finger are typed directly into it. Press 't' to switch to this mode.
  - keyboard_shortcut - This is my favourite mode. Here you draw a letter of the English alphabet, and a keyboard shortcut corresponding to that letter is emulated, if one exists. There are 15 keyboard shortcuts programmed (listed at the end of this README). Press 's' to switch to this mode.

Run it with:

python gesture_action_cnn.py
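To show how the pieces fit together, a minimal sketch of the main loop. The window name 'img' and the mode keys 't' and 's' come from this README; the frame size, the quit key 'q', and the stroke-finished logic are assumptions:

```python
import cv2
import numpy as np
from keras.models import load_model
from webcam_video_stream import WebcamVideoStream  # the file described above

model = load_model('cnn_model_keras21.h5')
stream = WebcamVideoStream(src=0).start()
blackboard = np.zeros((480, 640), dtype=np.uint8)
mode = 'doodle'                                    # default mode
prev_point = None

while True:
    frame = stream.read()
    # prev_point = track_and_draw(frame, blackboard, prev_point)  # earlier sketch
    cv2.imshow('img', blackboard)

    key = cv2.waitKey(1) & 0xFF
    if key == ord('t'):
        mode = 'typing'
    elif key == ord('s'):
        mode = 'keyboard_shortcut'
    elif key == ord('q'):
        break

    # Once a stroke is judged finished (implementation-specific), classify it:
    # probs = model.predict(preprocess_for_cnn(blackboard))[0]
    # letter = chr(ord('A') + int(np.argmax(probs)))
    # do_action(letter, mode)

stream.stop()
cv2.destroyAllWindows()
```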
To use the project, first set the HSV masking range for the paper that you are wearing on your finger. To do that, run:
python range-detector.py -f HSV -w
Adjust the trackbars as described above under range-detector.py: slowly raise the lower parameters and lower the upper ones until only the paper shows up as a white patch and the rest of the image stays dark.
Then run the gesture_action_cnn.py file:
python gesture_action_cnn.py
These are the 15 programmed shortcuts for the keyboard_shortcut mode:

- A = Ctrl + A (Select all)
- C = Ctrl + C (Copy)
- E = Open Explorer
- F = Open Facebook
- L = Win + L (Lock the computer)
- M = Win (Start Menu)
- N = Ctrl + N (New File)
- O = Take a screenshot
- P = Take a photo with a 5 sec delay
- R = Win + R (Open Run dialog)
- S = Ctrl + S (Save file)
- T = Ctrl + Shift + T (Open Task Manager)
- V = Ctrl + V (Paste)
- X = Ctrl + X (Cut)
- Z = Ctrl + Z (Undo)