Add mesd dataset #8

Merged 2 commits on Jun 12, 2022
5 changes: 4 additions & 1 deletion README.md
@@ -1,8 +1,9 @@
-***Spoken Emotion Recognition Datasets:*** *A collection of datasets (count=39) for the purpose of emotion recognition/detection in speech.
+***Spoken Emotion Recognition Datasets:*** *A collection of datasets (count=40) for the purpose of emotion recognition/detection in speech.
The table is chronologically ordered and includes a description of the content of each dataset along with the emotions included.*

| <sub>Dataset</sub> | <sub>Year</sub> | <sub>Content</sub> | <sub>Emotions</sub> | <sub>Format</sub> | <sub>Size</sub> | <sub>Language</sub> | <sub>Paper</sub> | <sub>Access</sub> | <sub>License</sub> |
|---------------------------------------------------------------------------------------------------|-----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------|---------------------|-------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------|----------------------------------------------------------------------------------------------|
+| <sub>[MESD]</sub> | <sub>2022</sub> | <sub>864 audio files of single-word emotional utterances with Mexican cultural shaping.</sub> | <sub>6 emotions: anger, disgust, fear, happiness, neutral, and sadness.</sub> | <sub>Audio</sub> | <sub>86 MB</sub> | <sub>Spanish (Mexican)</sub> | <sub>[The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning]</sub> | <sub>Open</sub> | <sub>[CC BY 4.0]</sub> |
| <sub>[ASVP-ESD]</sub> | <sub>2021</sub> | <sub>~13285 audio files collected from movies, tv shows and youtube containing speech and non-speech.</sub> | <sub>12 different natural emotions (boredom, neutral, happiness, sadness, anger, fear, surprise, disgust, excitement, pleasure, pain, disappointment) with 2 levels of intensity.</sub> | <sub>Audio</sub> | <sub>2 GB</sub> | <sub>Chinese, English, French, Russian and others</sub> | <sub>--</sub> | <sub>Open access</sub> | <sub>Unknown</sub> |
| <sub>[ESD]</sub> | <sub>2021</sub> | <sub>29 hours, 3500 sentences, by 10 native English speakers and 10 native Chinese speakers.</sub> | <sub>5 emotions: angry, happy, neutral, sad, and surprise.</sub> | <sub>Audio, Text</sub> | <sub> 2.4 GB (zip) </sub> | <sub> English, Chinese </sub> | <sub>[Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset]</sub> | <sub>Open access</sub> | <sub>Available under an Academic License </sub> |
| <sub>[MuSe-CAR]</sub> | <sub>2021</sub> | <sub>40 hours, 6,000+ recordings of 25,000+ sentences by 70+ English speakers (see db link for details).</sub> | <sub>continuous emotion dimensions characterized using valence, arousal, and trustworthiness.</sub> | <sub>Audio, Video, Text</sub> | <sub> 15 GB </sub> | <sub> English </sub> | <sub>[The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements]</sub> | <sub>Restricted access</sub> | <sub>Available under an Academic License & Commercial License </sub> |
@@ -71,6 +72,7 @@ The table is chronologically ordered and includes a description of the content o

[//]: # (datasets)

+[MESD]: https://data.mendeley.com/datasets/cy34mh68j9/5
[ASVP-ESD]: https://www.kaggle.com/datasets/dejolilandry/asvpesdspeech-nonspeech-emotional-utterances
[ESD]: https://hltsingapore.github.io/ESD/
[MuSe-CAR]: https://zenodo.org/record/4134758
@@ -128,6 +130,7 @@ The table is chronologically ordered and includes a description of the content o

[//]: # (papers)

+[The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning]: https://pubmed.ncbi.nlm.nih.gov/34891601/
[Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset]: https://arxiv.org/pdf/2010.14794.pdf
[The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements]: https://arxiv.org/pdf/2101.06053.pdf
[The MSP-Conversation Corpus]: http://www.interspeech2020.org/index.php?m=content&c=index&a=show&catid=290&id=684
2 changes: 2 additions & 0 deletions src/index.rst
@@ -45,6 +45,7 @@ However, we cannot guarantee that all listed links are up-to-date. Read more in

.. datasets

+.. _`MESD`: https://data.mendeley.com/datasets/cy34mh68j9/5
.. _`ASVP-ESD`: https://www.kaggle.com/datasets/dejolilandry/asvpesdspeech-nonspeech-emotional-utterances
.. _`ESD`: https://hltsingapore.github.io/ESD/
.. _`MuSe-CAR`: https://zenodo.org/record/4134758
@@ -102,6 +103,7 @@ However, we cannot guarantee that all listed links are up-to-date. Read more in

.. papers

+.. _`The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning`: https://pubmed.ncbi.nlm.nih.gov/34891601/
.. _`Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset`: https://arxiv.org/pdf/2010.14794.pdf
.. _`The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements`: https://arxiv.org/pdf/2101.06053.pdf
.. _`The MSP-Conversation Corpus`: http://www.interspeech2020.org/index.php?m=content&c=index&a=show&catid=290&id=684
1 change: 1 addition & 0 deletions src/ser-datasets.csv
@@ -1,4 +1,5 @@
"Dataset","Year","Content","Emotions","Format","Size","Language","Paper","Access","License"
+"`MESD`_","2022","864 audio files of single-word emotional utterances with Mexican cultural shaping.","6 emotions: anger, disgust, fear, happiness, neutral, and sadness.","Audio","86 MB","Spanish (Mexican)","`The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning`_","Open","`CC BY 4.0`_"
"`ASVP-ESD`_","2021","~13285 audio files collected from movies, tv shows and youtube containing speech and non-speech.","12 different natural emotions (boredom, neutral, happiness, sadness, anger, fear, surprise, disgust, excitement, pleasure, pain, disappointment) with 2 levels of intensity.","Audio","2 GB","Chinese, English, French, Russian and others","--","Open","Unknown"
"`ESD`_","2021","29 hours, 3500 sentences, by 10 native English speakers and 10 native Chinese speakers.","5 emotions: angry, happy, neutral, sad, and surprise.","Audio, Text","2.4 GB (zip)","English, Chinese","`Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset`_","Open","Academic License"
"`MuSe-CAR`_","2021","40 hours, 6,000+ recordings of 25,000+ sentences by 70+ English speakers (see db link for details).","continuous emotion dimensions characterized using valence, arousal, and trustworthiness.","Audio, Video, Text","15 GB","English","`The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements`_","Restricted","Academic License & Commercial License"
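
Review note: the CSV cells use reStructuredText named references (`` `MESD`_ ``) that only resolve if a matching `` .. _`MESD`: `` target exists in `src/index.rst`. The following is a minimal sketch of how that pairing could be checked. It is not part of this PR: the function name `find_unresolved` and the inline excerpts (with a reduced set of columns) are hypothetical, and a real check would read the two files from disk. The excerpts contain only what this diff shows, so the `CC BY 4.0` target, which may already be defined in a collapsed part of `index.rst`, is reported as missing here.

```python
import csv
import io
import re

# Hypothetical excerpts mirroring lines added in this PR (columns reduced).
CSV_TEXT = '''"Dataset","Year","Paper","License"
"`MESD`_","2022","`The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning`_","`CC BY 4.0`_"
'''

RST_TEXT = '''
.. _`MESD`: https://data.mendeley.com/datasets/cy34mh68j9/5
.. _`The Mexican Emotional Speech Database (MESD): elaboration and assessment based on machine learning`: https://pubmed.ncbi.nlm.nih.gov/34891601/
'''

# `name`_ as it appears inside a CSV cell
REFERENCE = re.compile(r"`([^`]+)`_")
# .. _`name`: url as it appears in index.rst
TARGET = re.compile(r"^\.\. _`([^`]+)`:\s*(\S+)", re.MULTILINE)


def find_unresolved(csv_text, rst_text):
    """Return reference names used in the CSV that have no target in the RST."""
    targets = {name for name, _url in TARGET.findall(rst_text)}
    used = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            used.update(REFERENCE.findall(cell))
    return sorted(used - targets)


print(find_unresolved(CSV_TEXT, RST_TEXT))  # → ['CC BY 4.0']
```

An empty list would mean every reference in the CSV has a target in the RST excerpt.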