The Effects of Fine-Tuning on the ASR Performance of Dialectal Arabic

This repository contains the code for training the models evaluated in the thesis, together with the results and the plotting code.

Setting up environment

To get started, install the requirements in a Python 3.10 environment. An example using conda:

conda create -n venv python=3.10
conda activate venv
pip install -r requirements.txt

Instructions for usage

Experiments

The training/experiment_*.py scripts expect the datasets to be available. Read through each script before running it to see which data it expects. The command-line usage of each script is shown below.

usage: experiment_dialect.py [-h] -d DIALECT

options:
  -h, --help            show this help message and exit
  -d DIALECT, --dialect DIALECT
                        all, egyptian, gulf, iraqi, levantine, maghrebi

usage: experiment_finetune.py [-h] -d DIALECT

options:
  -h, --help            show this help message and exit
  -d DIALECT, --dialect DIALECT
                        all, egyptian, gulf, iraqi, levantine, maghrebi

usage: experiment_msa.py [-h] -t TRAIN_SIZE

options:
  -h, --help            show this help message and exit
  -t TRAIN_SIZE, --train_size TRAIN_SIZE
                        Train size between 0 and 1
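
The help text above is consistent with a standard argparse interface. The following is a minimal sketch of how such parsers could be defined, not the code from the repository's scripts. The function names, the use of choices for the dialect flag, and the exact bounds check on the train size (greater than 0, at most 1) are assumptions made for illustration.

```python
import argparse

# Dialect names copied from the help text above.
DIALECTS = ["all", "egyptian", "gulf", "iraqi", "levantine", "maghrebi"]


def train_size(value: str) -> float:
    """Parse a train size; the (0, 1] range is an assumed reading of 'between 0 and 1'."""
    size = float(value)
    if not 0.0 < size <= 1.0:
        raise argparse.ArgumentTypeError("train size must be between 0 and 1")
    return size


def build_dialect_parser(prog: str) -> argparse.ArgumentParser:
    """Hypothetical parser matching the -d/--dialect usage shown above."""
    parser = argparse.ArgumentParser(prog=prog)
    parser.add_argument(
        "-d",
        "--dialect",
        required=True,
        choices=DIALECTS,  # assumption: unknown dialect names are rejected
        metavar="DIALECT",
        help=", ".join(DIALECTS),
    )
    return parser


def build_msa_parser() -> argparse.ArgumentParser:
    """Hypothetical parser matching the -t/--train_size usage shown above."""
    parser = argparse.ArgumentParser(prog="experiment_msa.py")
    parser.add_argument(
        "-t",
        "--train_size",
        required=True,
        type=train_size,
        metavar="TRAIN_SIZE",
        help="Train size between 0 and 1",
    )
    return parser
```

With this sketch, build_dialect_parser("experiment_dialect.py").parse_args(["-d", "egyptian"]) yields a namespace whose dialect attribute is "egyptian", and an out-of-range train size is reported as a usage error instead of reaching the training code.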

Evaluation

Evaluation can be run with either training/evaluate_all.py or the training/evaluate_whisper*.py scripts. The latter take the model checkpoint as manual input and evaluate only on MSA. training/evaluate_all.py evaluates on all test sets:

usage: evaluate_all.py [-h] -c CHECKPOINT

options:
  -h, --help            show this help message and exit
  -c CHECKPOINT, --checkpoint CHECKPOINT
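
When several checkpoints need to be evaluated, the call above can be scripted. The following is a rough sketch, assuming it is run from the repository root and that evaluate_all.py accepts the checkpoint in whatever form the script expects (the README does not say whether that is a local path or a model identifier). The helper names and the example checkpoint value are hypothetical.

```python
import subprocess
import sys
from typing import List, Sequence


def evaluation_command(checkpoint: str) -> List[str]:
    """Build the command line for one evaluate_all.py run."""
    # sys.executable keeps the run inside the currently active environment.
    return [sys.executable, "training/evaluate_all.py", "-c", checkpoint]


def evaluate_checkpoints(checkpoints: Sequence[str]) -> None:
    """Evaluate each checkpoint in turn, stopping at the first failure."""
    for checkpoint in checkpoints:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(evaluation_command(checkpoint), check=True)
```

Calling evaluate_checkpoints(["checkpoints/example"]) would run the evaluation once for that placeholder value; replace it with real checkpoints.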

Results

The results are in results/, along with the Jupyter notebooks needed to recreate the plots in the thesis. results/training_plots.ipynb plots the training processes, while results/results.ipynb plots the final results. The plots themselves are also available in results/plots/.
