TheoLisin/Emotion_Recognition_with_Wav2Vec

Emotion recognition in English speech

Motivation

I built this project to learn how to work with audio, transformers, and pretrained models. Emotion is not a strictly defined scientific concept, so you should not take this model seriously. Most likely, it works no better than a polygraph or physiognomy (which is pseudoscience).

Keep this in mind and have fun.

Installation

To install the package on Windows, use pip install . -f https://download.pytorch.org/whl/torch_stable.html

To install the package on Linux, use pip install .

To run the bot, set the following environment variables:

  • BOT_TOKEN - Telegram bot token (from @BotFather)

  • COLLECT_PATH - path to collect voice

  • LOGS_FILE - path to log errors

  • TEMP_PATH -

  • RES_FILE - path to the log file to save the name and emotion

  • MODEL_PATH - saved pre-trained model with classification head (any of these checkpoints)

  • HF_MODEL - pre-trained model from Huggingface for processor ("jonatasgrosman/wav2vec2-large-xlsr-53-english")
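The variables above could be read and validated at startup roughly as follows. This is a minimal sketch, not the bot's actual code: `load_settings` and `REQUIRED_VARS` are hypothetical names, and treating all seven variables as required is an assumption.

```python
import os

# Names taken from the list above; treating every one as required is an assumption.
REQUIRED_VARS = (
    "BOT_TOKEN",
    "COLLECT_PATH",
    "LOGS_FILE",
    "TEMP_PATH",
    "RES_FILE",
    "MODEL_PATH",
    "HF_MODEL",
)


def load_settings(env=None):
    """Return a dict of settings, failing early if any variable is unset or empty.

    Hypothetical helper: `env` defaults to os.environ but can be any mapping,
    which makes the function easy to check without touching the real environment.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the full list of missing names is usually easier to debug than a `KeyError` raised later, when the first voice message arrives.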

Model

(Training process based on this notebook)

The model consists of a pre-trained XLSR-Wav2Vec body and a classification head. First, the classifier was trained on the clean RAVDESS dataset with the wav2vec weights frozen; then the entire model was trained on the same data with added random noise.
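The README does not say what kind of noise was added or at what level. As an illustration only, the sketch below assumes white Gaussian noise mixed in at a chosen signal-to-noise ratio; `add_gaussian_noise` and its parameters are hypothetical, and plain Python lists stand in for the audio tensors a real training pipeline would use.

```python
import math
import random


def add_gaussian_noise(samples, snr_db, rng=None):
    """Return a copy of `samples` with white Gaussian noise at roughly `snr_db` dB SNR.

    Assumption: the noise type and SNR are illustrative, not taken from the project.
    """
    rng = rng or random.Random()
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Toy input: one second of a 440 Hz tone at 16 kHz (not RAVDESS data).
sample_rate = 16_000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noisy = add_gaussian_noise(clean, snr_db=10, rng=random.Random(0))
```

Scaling the noise relative to the signal power keeps the perturbation comparable across quiet and loud recordings, which a fixed noise amplitude would not.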

(Model training notebook will be added soon)

Emotion recognition bot

The bot was made to test the model on Russian speech (or English with a Russian accent :) ). It collected some test data from my friends, with their permission.

TOP-2 confusion matrix

TOP-1 accuracy: 47%

TOP-2 accuracy: 67%
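For reference, TOP-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The sketch below shows one way to compute it; `top_k_accuracy` is a hypothetical helper, and the scores and labels are made-up toy values, not the bot's collected data.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring classes.

    `scores` is a list of per-class score lists; `labels` holds the true class indices.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with three classes (values invented for illustration).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

TOP-2 is never lower than TOP-1, since every TOP-1 hit is also a TOP-2 hit.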
