The Deep Noise Suppression (DNS) Challenge is a single-channel speech enhancement challenge organized by Microsoft, with a focus on real-time applications. More info can be found on the official page.
This recipe is made to make your life simpler and your research easier!
- Install `git-lfs` without root (required to download the data).
- Download the data from the official repo.
- Create the dataset with default parameters.
- Ready-to-use `DataLoader` to train your net with.
- Example scripts with all the ingredients for a successful system.
- Multi-GPU support / logging (+ TensorBoard) / LR scheduler (thanks, Lightning!)
- Some new architectures to outperform our model.
- Fancy loss functions to improve speech quality.
- All the research, all the fun!
- Need to install a Python environment? Check this out!
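The ready-to-use `DataLoader` yields pairs of noisy and clean utterances for training. As a rough, pure-Python sketch of what iterating over it looks like (the dataset class and batching below are toy stand-ins, not the recipe's actual API):

```python
import random

# Toy stand-in for the recipe's dataset: each item is a (noisy, clean)
# pair of equal-length sample lists. In the real recipe these would be
# tensors loaded from the generated wav files.
class ToyDNSDataset:
    def __init__(self, n_utterances=8, n_samples=16):
        random.seed(0)
        self.clean = [[random.gauss(0, 1) for _ in range(n_samples)]
                      for _ in range(n_utterances)]
        self.noise = [[random.gauss(0, 0.1) for _ in range(n_samples)]
                      for _ in range(n_utterances)]

    def __len__(self):
        return len(self.clean)

    def __getitem__(self, idx):
        clean = self.clean[idx]
        noisy = [c + n for c, n in zip(clean, self.noise[idx])]
        return noisy, clean

def batches(dataset, batch_size):
    """Group consecutive items into batches, like a non-shuffling DataLoader."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        items = [dataset[i] for i in range(start, stop)]
        noisy_batch = [noisy for noisy, _ in items]
        clean_batch = [clean for _, clean in items]
        yield noisy_batch, clean_batch

dataset = ToyDNSDataset()
for noisy_batch, clean_batch in batches(dataset, batch_size=4):
    print(len(noisy_batch), len(noisy_batch[0]))  # 4 utterances of 16 samples each
```

In the actual recipe, the batching, shuffling, and multi-worker loading are handled by the framework; the sketch only shows the shape of what a training loop consumes.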
- Open `run.sh` and change `storage_dir` to a path where you can afford storing 320 GB of data.
- Just run `./run.sh` and it's on.
- After the first execution, you can set `stage=4` in `run.sh` to avoid redoing all the steps every time.
- To use GPUs for training, run `run.sh --id 0,1`, where `0` and `1` are the GPUs you want to use; training will automatically take advantage of both GPUs.
- By default, a random id is generated for each run. You can also add a `tag` to name the experiments as you like. For example, `run.sh --tag with_cool_loss` will save all results to `exp/train_dns_with_cool_loss`, and you'll find the corresponding log file in `logs/train_dns_with_cool_loss.log`.
- If you want to change the data-generation config, go to `storage_dir`, change `noisyspeech_synthesizer.cfg` accordingly, and restart from stage 2. Be aware that this will overwrite the previous JSON files in `data/`.
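Since `noisyspeech_synthesizer.cfg` is an INI-style file, the config tweak can also be scripted instead of edited by hand. A minimal sketch using Python's `configparser` — the section and key names below are assumptions for illustration; check the actual file in `storage_dir` for the real ones:

```python
import configparser
import os
import tempfile

# Write an illustrative INI-style config to a temp location so the
# sketch is self-contained. The [noisy_speech] section and its keys
# are assumed names, not necessarily those of the real file.
cfg_text = """\
[noisy_speech]
sampling_rate = 16000
total_hours = 500
"""
path = os.path.join(tempfile.mkdtemp(), "noisyspeech_synthesizer.cfg")
with open(path, "w") as f:
    f.write(cfg_text)

# Read the config, bump the amount of generated data, write it back.
config = configparser.ConfigParser()
config.read(path)
config["noisy_speech"]["total_hours"] = "1000"
with open(path, "w") as f:
    config.write(f)

# Re-read to confirm the edit was persisted.
config2 = configparser.ConfigParser()
config2.read(path)
print(config2["noisy_speech"]["total_hours"])  # -> 1000
```

After a scripted edit like this, you would still restart from stage 2 so the synthesizer regenerates the data with the new settings.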
The data download, dataset creation, and preprocessing will take a while (around a day in my case). From stage 4 (training) onward, be sure you have enough compute power to train your DNN; before that, the pipeline is I/O-bound, so little compute power is needed.
- The challenge paper, here.
```
@misc{DNSChallenge2020,
  title={The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework},
  author={Chandan K. A. Reddy and Ebrahim Beyrami and Harishchandra Dubey and Vishak Gopal and Roger Cheng and Ross Cutler and Sergiy Matusevych and Robert Aichner and Ashkan Aazami and Sebastian Braun and Puneet Rana and Sriram Srinivasan and Johannes Gehrke},
  year={2020},
  eprint={2001.08662},
}
```
- The baseline paper, here.
```
@misc{xia2020weighted,
  title={Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement},
  author={Yangyang Xia and Sebastian Braun and Chandan K. A. Reddy and Harishchandra Dubey and Ross Cutler and Ivan Tashev},
  year={2020},
  eprint={2001.10601},
}
```