Speech Emotion Recognition using Keras, TensorFlow and sklearn

This repository contains the code for a speech emotion recognition model built with Keras, TensorFlow, and sklearn. The model is trained on the RAVDESS audio dataset.

Tested on Python 3.8.10

Installation

To install the dependencies run:

pip install -r requirements.txt

Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

The RAVDESS database contains recordings of 24 professional actors (12 female, 12 male) vocalizing two lexically matched statements in a neutral North American accent. Speech includes calm, happy, sad, angry, fearful, surprised, and disgusted expressions; song includes calm, happy, sad, angry, and fearful expressions. Each expression is produced at two levels of emotional intensity (normal and strong), and there is an additional neutral expression. This project uses only the audio files.

The audio files of RAVDESS are selected and placed inside the data folder.
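Each RAVDESS filename encodes its labels as seven hyphen-separated two-digit fields (modality, vocal channel, emotion, intensity, statement, repetition, actor). The sketch below shows one way those fields could be decoded into training labels. The function name parse_ravdess_name and the example path under data/ are illustrative assumptions, not names taken from this repository.

```python
from pathlib import Path

# Emotion codes from the RAVDESS filename convention (third field).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
# Intensity codes (fourth field).
INTENSITIES = {"01": "normal", "02": "strong"}


def parse_ravdess_name(path):
    """Decode the seven hyphen-separated fields of a RAVDESS filename.

    Hypothetical helper for illustration; not part of this repository.
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS filename: {path}")
    modality, channel, emotion, intensity, statement, repetition, actor = fields
    return {
        "vocal_channel": "speech" if channel == "01" else "song",
        "emotion": EMOTIONS[emotion],
        "intensity": INTENSITIES[intensity],
        "actor": int(actor),
        # Odd-numbered actors are male, even-numbered actors are female.
        "actor_sex": "male" if int(actor) % 2 else "female",
    }


# The directory layout in this path is an assumption made for the example.
info = parse_ravdess_name("data/Actor_12/03-01-06-01-02-01-12.wav")
print(info)
```

Reading labels from the filename avoids maintaining a separate annotation file, at the cost of depending on the files keeping their original names.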

Extracting features from audio files

To extract features from the audio files, run:

python utils.py
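This README does not describe which features utils.py computes. As a rough illustration of what turning a waveform into a fixed-length feature vector can look like, the sketch below computes two simple frame-level descriptors (RMS energy and zero-crossing rate) with the Python standard library and averages them over the clip. These descriptors, the function names, and the frame sizes are all assumptions chosen for the example; they are not the repository's method, and speech emotion models more commonly use richer spectral features.

```python
import math
import struct
import wave


def read_mono_samples(path):
    """Read a 16-bit PCM WAV file and return its first channel scaled to [-1, 1)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        n_channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Samples are interleaved by channel; keep only the first channel.
    return [v / 32768.0 for v in values[::n_channels]]


def frame_features(samples, frame_len=1024, hop=512):
    """Return one (rms, zero_crossing_rate) pair per overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((rms, crossings / (frame_len - 1)))
    return feats


def summarize(feats):
    """Average the frame-level features into one fixed-length vector."""
    if not feats:
        raise ValueError("signal is shorter than one frame")
    n = len(feats)
    return [sum(f[0] for f in feats) / n, sum(f[1] for f in feats) / n]


# Synthetic signal that flips sign on every sample, used instead of a real file.
alternating = [0.5 if i % 2 == 0 else -0.5 for i in range(2048)]
vector = summarize(frame_features(alternating))
print(vector)
```

Averaging over frames gives every clip a vector of the same length regardless of its duration, which is what a classifier expects as input.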

Train the model

To train the model, run:

python model.py

Use the model to make predictions

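To get predictions, run:

cd src

python engine.py --framework=(keras/sklearn) --infer --infer-file-path="filepath of the sample to make predictions"

Set --framework to either keras or sklearn, and replace the --infer-file-path value with the path to the audio sample you want to classify.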
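To make the shape of the command line concrete, here is a minimal sketch of how the three flags above could be declared with argparse. How engine.py actually parses its arguments is not shown in this README, so the build_parser function, the help strings, and the sample.wav path are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(prog="engine.py")
    parser.add_argument(
        "--framework",
        choices=["keras", "sklearn"],
        required=True,
        help="which trained model to use",
    )
    parser.add_argument(
        "--infer",
        action="store_true",
        help="run inference on a single file",
    )
    parser.add_argument(
        "--infer-file-path",
        help="path to the audio sample to classify",
    )
    return parser


# Parse an example command line instead of sys.argv so the sketch runs on its own.
args = build_parser().parse_args(
    ["--framework=keras", "--infer", "--infer-file-path=sample.wav"]
)
print(args.framework, args.infer, args.infer_file_path)
```

With a parser like this, a value other than keras or sklearn for --framework is rejected before any model is loaded.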
