ared is a multimodal emotion detection library. You can use it to detect the emotion of a speaker using text, audio, and visual information.
The weights for the models used in this library can be downloaded from here.
The architecture of the model can be seen below.
This is the implementation of the paper.
The weighted F1 score of the model on the MELD dataset is 65%. Check the confusion matrix below.
The model is not capable of distinguishing the fear, sadness, and disgust emotions because the MELD dataset is imbalanced and only a small portion of the training dataset contains these classes.
Download the weights from the link below and place them in the weights folder
Install the library with pip install git+https://github.com/eliird/ared
The weights folder contains the weights for the following models (an example layout is sketched after this list):
vision - scene and face models, both finetuned on the MELD dataset; use the one that matches your input (face crops or full scene images)
audio - wavernn model trained for sentiment analysis on the MELD dataset
text - weights for GPT-2 trained for Ekman emotion class classification on the MELD dataset
mmer - the weights of the cross fusion model from the paper.
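Assuming the default paths used in the code below, the weights folder would look roughly like this; the mmer location is an assumption, since it is not referenced in the snippets:

weights/
    vision/MELDSceneNet_best.pt       # scene model; a face-model checkpoint can sit alongside it
    audio/model_best_sentiment.pth    # wavernn sentiment model
    text/                             # GPT-2 checkpoint files (the directory itself is passed as the weight path)
    mmer/                             # cross-fusion model weights (assumed location)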
You can run example.py, which generates a plot showing the relation between inference time and audio duration. The model was trained on 3-second audio clips: clips longer than 3 seconds were truncated to 3 seconds, and clips shorter than 3 seconds were padded to 3 seconds. The sampling rate of the audio was 44100 Hz.
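As a rough sketch of that preprocessing (not the library's internal code), truncating or padding a waveform to a 3-second window at 44100 Hz could look like this, assuming the audio is a 1-D NumPy array:

import numpy as np

SAMPLE_RATE = 44100               # sampling rate used during training
TARGET_LEN = 3 * SAMPLE_RATE      # 3-second window the model expects

def fit_to_three_seconds(waveform):
    # Truncate clips longer than 3 seconds ...
    if len(waveform) > TARGET_LEN:
        return waveform[:TARGET_LEN]
    # ... and zero-pad clips shorter than 3 seconds.
    return np.pad(waveform, (0, TARGET_LEN - len(waveform)))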
A link to the notebook that trains all these models may be added in the future.
You can check out demo.py to see how to use this library. You will need to download an ASR model to transcribe the speech from the audio. The demo downloads the Qwen model, but if you want to use another one you can replace it with the model of your choice.
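The detector only needs the transcript string, so any speech-to-text model works. A minimal sketch that swaps in a Whisper checkpoint through the Hugging Face transformers pipeline (the model name and library choice are assumptions, not part of ared; ffmpeg must be installed to decode the clip):

from transformers import pipeline

# Hypothetical replacement for the bundled ASR model.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
utterance = asr("./dia0_utt0.mp4")["text"]   # ffmpeg extracts the audio track from the clip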
Loading the libraries
from ared import EmotionDetector
from ared import ASR
from ared.utils import (
    id2emotion, emotion2id, load_first_50_images, load_audio_from_file
)
import random
random.seed(20)
# paths containing the weights of the model
vis_weights = './weights/vision/MELDSceneNet_best.pt'
audio_weights = './weights/audio/model_best_sentiment.pth'
text_weights = './weights/text/'
device = 'cuda'
# load the emotion detection model
detector = EmotionDetector(vis_model_weights=vis_weights,
                           text_model_weights=text_weights,
                           audio_model_weights=audio_weights,
                           device=device)
Detecting emotion from a video file
# load the ASR model
asr_model = ASR(device)
video_path = './dia0_utt0.mp4'
utterance = asr_model.convert_speech_to_text(video_path)
emotion, probabilities = detector.detect_emotion(video=video_path,
                                                 audio=video_path,
                                                 text=utterance)
print(emotion, probabilities)
Detecting emotion from loaded data
images = load_first_50_images(video_path)
audio = load_audio_from_file(video_path)
utterance = asr_model.convert_speech_to_text(video_path)
emotion, probab = detector.detect_emotion(video=images, audio=audio, text=utterance)
print(emotion, probab)
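To see the full distribution rather than just the predicted label, the id2emotion mapping imported above can be used to name each class. This sketch assumes id2emotion maps class ids to emotion names and that the returned probabilities can be indexed by class id, which is not verified here:

# Assumes `probab` is indexable by class id (e.g. a list, tensor, or 1-D array).
for class_id, emotion_name in id2emotion.items():
    print(f'{emotion_name}: {float(probab[class_id]):.3f}')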