Deep Learning-Based Speech Emotion Recognition
- DATASET COLLECTION
The dataset is collected from the Kaggle website as audio files of the CREMA dataset. The recordings are categorized into six types of emotion:
SAD - sadness; ANG - angry; DIS - disgust; FEA - fear; HAP - happy; NEU - neutral.
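
The mapping above can be turned into labels in code. The following is a minimal sketch, assuming the file names follow the pattern `ActorID_Sentence_Emotion_Level.wav` (for example `1001_DFA_ANG_XX.wav`), with the emotion code as the third underscore-separated field. That pattern and the helper name `emotion_from_filename` are assumptions, not something stated in this README.

```python
from pathlib import Path

# Emotion codes as listed in this README.
EMOTION_CODES = {
    "SAD": "sadness",
    "ANG": "angry",
    "DIS": "disgust",
    "FEA": "fear",
    "HAP": "happy",
    "NEU": "neutral",
}


def emotion_from_filename(path):
    """Return the emotion name encoded in a file name.

    Assumes the emotion code is the third underscore-separated field,
    e.g. '1001_DFA_ANG_XX.wav' (hypothetical example name).
    """
    parts = Path(path).stem.split("_")
    if len(parts) < 3 or parts[2] not in EMOTION_CODES:
        raise ValueError(f"unrecognized file name: {path}")
    return EMOTION_CODES[parts[2]]
```

Files whose names do not match the assumed pattern raise a `ValueError`, so they can be skipped or inspected instead of being silently mislabeled.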
- DATA PREPROCESSING
Data preprocessing methods are used to normalize the data and convert it into the format the system expects.
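
This README does not say which preprocessing steps are applied. As one possible illustration, the sketch below peak-normalizes a mono signal and pads or trims it to a fixed number of samples so every clip has the same shape. The function name `preprocess` and the choice of these two steps are assumptions.

```python
import numpy as np


def preprocess(signal, target_len):
    """Peak-normalize a mono signal and pad or trim it to target_len samples.

    Illustrative only: the actual preprocessing used in this project
    is not specified in the README.
    """
    signal = np.asarray(signal, dtype=np.float32)
    # Scale so the largest absolute sample is 1.0 (skip silent or empty clips).
    peak = np.max(np.abs(signal)) if signal.size else 0.0
    if peak > 0:
        signal = signal / peak
    # Trim long clips, zero-pad short ones at the end.
    if signal.size >= target_len:
        return signal[:target_len]
    return np.pad(signal, (0, target_len - signal.size))
```

Fixing the length up front keeps the later feature arrays the same size for every recording, which a CNN input layer generally requires.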
- DATA VISUALIZATION
We use the librosa waveshow plot to visualize the audio data for each of the six emotion categories.
- FEATURE EXTRACTION
We use librosa to extract the chroma_stft, MFCC, mel spectrogram, and tonnetz features for every emotion in the speech dataset.
- MODEL IMPLEMENTATION
We use a deep learning method to train on the speech audio data.
A CNN (convolutional neural network) is used to produce the results of the system.
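
The README does not describe the network architecture. As a rough sketch only, a small 1D CNN over the per-clip feature vector could be defined in Keras as follows. The layer sizes, dropout rate, optimizer, and the helper name `build_cnn` are all assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(n_features, n_classes=6):
    """Build a small 1D CNN classifier (hypothetical architecture)."""
    model = keras.Sequential([
        # Each clip is a feature vector treated as a 1-channel sequence.
        keras.Input(shape=(n_features, 1)),
        layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=5, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        # One output probability per emotion class.
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels 0..5
        metrics=["accuracy"],
    )
    return model
```

Training would then look like `model.fit(X_train[..., None], y_train, validation_data=(X_val[..., None], y_val), epochs=...)`, where the array names are placeholders and the trailing `None` index adds the channel axis the `Conv1D` layers expect.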
- FINAL PREDICTION
The final prediction is evaluated using the accuracy, the classification report, and the loss and accuracy graphs of the model.
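
A minimal sketch of this evaluation step, assuming integer class labels and a training history dictionary with the keys `loss`, `val_loss`, `accuracy`, and `val_accuracy` (as produced by Keras when a model is compiled with `metrics=["accuracy"]` and trained with validation data). The helper names and the label order in `EMOTIONS` are assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, classification_report

# Assumed label order: class index i corresponds to EMOTIONS[i].
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sadness"]


def evaluate(y_true, y_pred):
    """Return overall accuracy and a per-class text report."""
    acc = accuracy_score(y_true, y_pred)
    report = classification_report(
        y_true,
        y_pred,
        labels=list(range(len(EMOTIONS))),
        target_names=EMOTIONS,
        zero_division=0,
    )
    return acc, report


def plot_history(history):
    """Plot loss and accuracy curves from a Keras-style history dict."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(history["loss"], label="train")
    ax_loss.plot(history["val_loss"], label="validation")
    ax_loss.set_title("Loss")
    ax_loss.set_xlabel("Epoch")
    ax_loss.legend()
    ax_acc.plot(history["accuracy"], label="train")
    ax_acc.plot(history["val_accuracy"], label="validation")
    ax_acc.set_title("Accuracy")
    ax_acc.set_xlabel("Epoch")
    ax_acc.legend()
    return fig
```

With a trained Keras model, `y_pred` would typically be `model.predict(X_test).argmax(axis=1)` and the history dictionary would be the `history` attribute of the object returned by `model.fit`.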