pip install -r requirements.txt
- Model choice: ResNet18 trained on train.part1 with binary cross-entropy (BCE) loss
cd <directory/with/train.py>
-
python train.py --train_dataset_path <path/to/train1/train> --val_dataset_path <path/to/val/val>
-
python test.py --dataset_path <path/to/val/val>
-
- Metrics achieved on dataset train.part2: Accuracy = 0.99, Precision = 0.99, Recall = 0.99
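
For reference, accuracy, precision, and recall can be computed from binary labels and predictions roughly as in the sketch below. The function name `binary_metrics` and the toy label lists are illustrative assumptions and are not taken from `test.py`; the sketch also assumes the positive class is labeled 1.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels for illustration only (not results from this project).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
print(f"Accuracy = {accuracy:.2f}, Precision = {precision:.2f}, Recall = {recall:.2f}")
```

Precision and recall are worth reporting alongside accuracy because accuracy alone can look high on an imbalanced dataset even when one class is rarely predicted correctly.
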
- Model choice: custom autoencoder trained on train.part1 with L1 loss
cd <directory/with/train.py>
-
python train.py --train_dataset_path <path/to/train1/train> --val_dataset_path <path/to/val/val>
-
python test.py --dataset_path <path/to/val/val>
-
- Metrics achieved on dataset train.part2: MSE = 0.236
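
Note that the autoencoder is trained with L1 loss while the reported metric is MSE, so the two values are on different scales. The sketch below contrasts them on flat lists of numbers; the helper names `l1_loss` and `mse` and the sample values are hypothetical and do not come from the project code.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equally sized sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred, target):
    """Mean squared error between two equally sized sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Made-up values for illustration: one element carries a large error.
pred = [0.5, 1.0, 2.0, 4.0]
target = [0.0, 1.0, 2.0, 2.0]
print(f"L1 = {l1_loss(pred, target)}, MSE = {mse(pred, target)}")
```

Because MSE squares each difference, a single large error dominates it, whereas L1 weights all errors linearly. This is one common reason to train with L1 when occasional outliers should not drive the optimization.
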
- SwinUNet transformer for image denoising
- Swin Transformer for image restoration
- Convert the mel-spectrograms back to audio waveforms using the known construction parameters (such as the sampling rate), then apply a waveform-domain model such as Speech Denoising WaveNet to remove the noise