OpenSMILE "world-class audio analysis toolkit"
This Python script processes every .wav file in a folder and creates a CSV file whose columns hold all the features defined in the configuration file.
Note: The script relies on the SMILExtract executable, which was compiled on Ubuntu 18.04.4 LTS.
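As a rough illustration of the loop described above, the sketch below calls SMILExtract once per .wav file. It is not the script shipped in this repo: the function names `build_smilextract_cmd` and `extract_folder` are hypothetical, and it assumes the configuration file exposes the usual `-I` (input) and `-O` (output) options next to `-C` (config). Whether rows from successive files are appended to the same CSV depends on the configuration file.

```python
import subprocess
from pathlib import Path


def build_smilextract_cmd(smilextract, config, wav_path, csv_path):
    """Build the argument list for one SMILExtract call (hypothetical helper)."""
    return [
        str(smilextract),
        "-C", str(config),    # configuration file
        "-I", str(wav_path),  # input .wav (assumes the config defines -I)
        "-O", str(csv_path),  # output CSV (assumes the config defines -O)
    ]


def extract_folder(smilextract, config, wav_dir, csv_path):
    """Run SMILExtract on every .wav file in wav_dir, in sorted order."""
    processed = []
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        cmd = build_smilextract_cmd(smilextract, config, wav_path, csv_path)
        # check=True raises CalledProcessError if SMILExtract exits with an error
        subprocess.run(cmd, check=True)
        processed.append(wav_path)
    return processed
```

A call might look like `extract_folder("./SMILExtract", "emobase_full_frame_2_csv.conf", "example-audio", "features.csv")`, where the executable path and output name are placeholders.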
SMILExtract does not recognize WAV headers written by librosa. To work around this problem, all files are re-written using the pydub library.
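A minimal sketch of such a re-write step, assuming pydub is installed. The function name `rewrite_wav` is hypothetical and the actual script may handle paths differently (for example by overwriting the file in place).

```python
from pathlib import Path


def rewrite_wav(src, dst):
    """Load a WAV file with pydub and export it again as WAV (hypothetical helper)."""
    # Imported here so this module can be loaded even when pydub is missing.
    from pydub import AudioSegment

    audio = AudioSegment.from_wav(str(src))
    # Exporting writes a fresh WAV header around the same audio samples.
    audio.export(str(dst), format="wav")
    return Path(dst)
```

For example, `rewrite_wav("example-audio/opensmile.wav", "fixed/opensmile.wav")` would write a copy with a regenerated header; the `fixed/` folder is a placeholder and has to exist beforehand.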
Using the .conf files in this repo produces a CSV that contains:
- The wav file name
- Frame index
- Frame time
- All the features defined in the configuration file
You can create your own configuration file by following the openSMILE-latest-book from the official site.
More useful information on writing configuration files can be found at:
https://stackoverflow.com/questions/43555779/how-to-create-custom-config-files-in-opensmile
When using emobase_full_frame_2_csv.conf, the head of the output file looks like this:
name | pcm_intensity_sma_variance | pcm_intensity_sma_stdd |
---|---|---|
'example-audio/media-interpretation.wav' | 5.729106e-12 | 2.393555e-06 |
'example-audio/opensmile.wav' | 1.634973e-11 | 4.043480e-06 |
When using emobase_25ms_frames_2_csv.conf, the head of the output file looks like this:
name | frameIndex | frameTime | pcm_intensity_sma_variance |
---|---|---|---|
'example-audio/media-interpretation.wav' | 0 | 0.012500 | 2.828065e-21 |
'example-audio/media-interpretation.wav' | 1 | 0.012500 | 9.548435e-21 |
'example-audio/media-interpretation.wav' | 2 | 0.020006 | 5.578462e-21 |
'example-audio/media-interpretation.wav' | 3 | 0.030006 | 1.493910e-21 |
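To work with the frame-level output, the rows can be grouped by file name. The sketch below uses only the standard library; the function name `frames_per_file` is hypothetical, and the `;` delimiter is an assumption about the raw CSV (the tables above are rendered for the README), so adjust it to match your file.

```python
import csv
import io
from collections import defaultdict


def frames_per_file(csv_text, feature, delimiter=";"):
    """Group (frameIndex, frameTime, feature value) tuples by wav file name."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    grouped = defaultdict(list)
    for row in reader:
        name = row["name"].strip("'")  # names are wrapped in single quotes
        grouped[name].append(
            (int(row["frameIndex"]), float(row["frameTime"]), float(row[feature]))
        )
    return dict(grouped)


# Sample rows copied from the table above, written with the assumed delimiter.
sample = (
    "name;frameIndex;frameTime;pcm_intensity_sma_variance\n"
    "'example-audio/media-interpretation.wav';0;0.012500;2.828065e-21\n"
    "'example-audio/media-interpretation.wav';1;0.012500;9.548435e-21\n"
)
frames = frames_per_file(sample, "pcm_intensity_sma_variance")
```

Here `frames` maps each file name to its list of frames, which makes it straightforward to plot or aggregate one feature per recording.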