This repository provides a modular workflow for getting familiar with classical, non-ML/AI classifiers on user-provided data. It accepts audio recordings of speech and videos of human faces.
- Feature extraction: choose 1 of the 2 available data modalities and provide input files to its characterization (feature extraction) script.
- Optional dimensionality reduction: apply PCA (principal component analysis) and/or LDA (linear discriminant analysis) to the extracted features.
- Classification: select reduced or original features and perform classification with cross-validation (CV) and grid search, with various fusion strategies available.
- audio: DisVoice; extracts multiple types of speech features from audio input. (details & examples)
- video: Face Mesh & HOG descriptors; locates facial landmarks and defines the facial region. (details & examples)
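As a rough illustration of what a HOG descriptor looks like, here is a minimal sketch using `skimage.feature.hog` from scikit-image (listed in the dependencies below). The synthetic image, the crop size, and every HOG parameter are assumptions made for the example; they are not taken from this repository's extraction script.

```python
import numpy as np
from skimage.feature import hog

# Synthetic 64x64 grayscale array standing in for a cropped facial region.
rng = np.random.default_rng(0)
face_crop = rng.random((64, 64))

# Illustrative parameters only; the repository's script may use different ones.
descriptor = hog(
    face_crop,
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm="L2-Hys",
)

# One flat feature vector per image, ready to be stacked into a feature matrix.
print(descriptor.shape)
```

The length of the vector depends on the crop size and the cell and block settings, so all crops need the same dimensions before their descriptors can be combined.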
Handled by `dim_reduce.py`. For a description of each argument, run `python dim_reduce.py --help`.
- Input: manually specify the path to the features output by 1 of the modalities.
- Configurations: choose either PCA or LDA; specify which feature types to preserve from the original extraction; optionally apply early fusion.
- Output: the selected and reduced features. Whether or not any configurations are set, the output matches the default input format of the classification step.
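The reduction step can be sketched with scikit-learn as follows. This is not the code in `dim_reduce.py`: the synthetic feature sets, their sizes, the component counts, and the reading of "early fusion" as column-wise concatenation of feature types before reduction are all assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two hypothetical feature types extracted from the same 60 samples, 3 classes.
rng = np.random.default_rng(0)
n_samples = 60
labels = np.repeat([0, 1, 2], n_samples // 3)
feature_set_a = rng.normal(size=(n_samples, 20)) + labels[:, None]
feature_set_b = rng.normal(size=(n_samples, 10)) - labels[:, None]

# Early fusion (assumed meaning): concatenate feature types before reducing.
fused = np.hstack([feature_set_a, feature_set_b])

# PCA is unsupervised: it keeps the directions of largest variance.
pca_features = PCA(n_components=5).fit_transform(fused)

# LDA is supervised: it needs labels and yields at most (n_classes - 1) components.
lda = LinearDiscriminantAnalysis(n_components=2)
lda_features = lda.fit_transform(fused, labels)

print(pca_features.shape, lda_features.shape)
```

Note that LDA caps the number of output dimensions at one less than the number of classes, whereas PCA is limited only by the number of samples and features.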
Run `python classification.py --help` for details.
- Default input: features file formatted by `dim_reduce.py`
- Custom input: manually select original features extracted by 1 modality (omit step 2)
- Configurations: option of classifier late fusion, feature late fusion, and probability fusion
- Output (stored in the `classification_output` directory, overwriting previous results):
  - classification scores, as both mean & stdev and raw scores across CV folds
  - graphic of class probabilities for each classifier
  - graphic of the train & test split and prediction correctness within feature space, if the features are 2-dimensional
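A minimal sketch of grid search with cross-validation, followed by one possible form of probability fusion, is shown below. It uses scikit-learn directly and does not reproduce `classification.py`: the synthetic data, the two classifiers, the parameter grids, and the choice to fuse by averaging class probabilities are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import (
    GridSearchCV, StratifiedKFold, cross_val_score, train_test_split,
)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for a features file.
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Grid search picks hyperparameters by cross-validated accuracy on the training set.
svm_search = GridSearchCV(SVC(probability=True, random_state=0),
                          {"C": [0.1, 1, 10]}, cv=cv)
knn_search = GridSearchCV(KNeighborsClassifier(),
                          {"n_neighbors": [3, 5, 7]}, cv=cv)
svm_search.fit(X_train, y_train)
knn_search.fit(X_train, y_train)

# Raw per-fold scores plus mean and standard deviation.
# (Scoring the tuned model on the same folds is optimistic; this is only a sketch.)
fold_scores = cross_val_score(svm_search.best_estimator_, X_train, y_train, cv=cv)
print(f"SVM accuracy: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")

# Probability fusion (assumed form): average class probabilities across classifiers.
fused_proba = (svm_search.predict_proba(X_test)
               + knn_search.predict_proba(X_test)) / 2
fused_pred = svm_search.classes_[fused_proba.argmax(axis=1)]
print("Fused test accuracy:", (fused_pred == y_test).mean())
```

Averaging is only one way to combine probabilities; weighted averages or products are common alternatives, and the repository's fusion options may work differently.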
Both the audio and video features require specific manually installed dependencies. Please refer to their details linked above.
- disvoice (which requires specific versions of the following)
  - python=3.10
  - numpy=1.22.4
  - scipy=1.11.4
- opencv-python
- matplotlib
- mediapipe
- pandas
- scikit-image
- scikit-learn
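Because disvoice depends on pinned versions, it can help to check the active environment before running the feature extraction. The snippet below is a small sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of this repository.

```python
import sys
from importlib import metadata

# Version pins taken from the dependency list above.
PINNED = {"numpy": "1.22.4", "scipy": "1.11.4"}


def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for package, wanted in pins.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (installed, installed == wanted)
    return report


print("Python is 3.10:", sys.version_info[:2] == (3, 10))
for package, (installed, ok) in check_pins(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{package}: installed={installed}, pinned={PINNED[package]} [{status}]")
```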