TinyWASE

PyTorch implementation of our paper [Compressing Speaker Extraction Model with Ultra-low Precision Quantization and Knowledge Distillation].

Getting Started

The training samples are generated by randomly selecting utterances from different speakers in the si_tr_s subset of WSJ0 and mixing them at various signal-to-noise ratios (SNRs). The evaluation samples are generated from the fixed list ./data/wsj/mix_2_spk_voiceP_tt_WSJ_dellnode.txt. Modify the dataset paths in ./data/preparedata.py to match your local setup.

data_config['speechWavFolderList'] = ['FOLDER_TO_SPEECH_FILES']
data_config['spk_test_voiceP_path'] = 'PATH_TO_SPK_TEST_VOICEP_FILE'
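The SNR-based mixing described above can be sketched as follows. This is a minimal illustration, not the code in ./data/preparedata.py: the function names `rms` and `mix_at_snr` are hypothetical, and plain Python lists stand in for the audio arrays to keep the example dependency-free.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def mix_at_snr(target, interferer, snr_db):
    """Mix two utterances so the target-to-interferer ratio equals snr_db.

    Hypothetical helper: both signals are truncated to the shorter length,
    the interferer is rescaled, and the two are summed sample by sample.
    """
    n = min(len(target), len(interferer))
    target, interferer = target[:n], interferer[:n]

    interferer_rms = rms(interferer)
    if interferer_rms == 0.0:
        raise ValueError("interferer is silent; SNR is undefined")

    # SNR(dB) = 20 * log10(rms(target) / rms(scaled interferer))
    gain = rms(target) / (interferer_rms * 10 ** (snr_db / 20))
    scaled = [gain * v for v in interferer]
    mixture = [t + s for t, s in zip(target, scaled)]
    return mixture, target, scaled
```

Calling `mix_at_snr(target, interferer, 2.5)` returns the mixture together with the truncated target and the rescaled interferer, so the achieved SNR can be checked with `20 * math.log10(rms(target) / rms(scaled))`.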

We recommend using the pickle files, which speed up experiments by saving the time otherwise spent on resampling the audio. The pickle paths can point to any location you like.

data_config['train_sample_pickle'] = 'PATH_TO_TRAIN_SAMPLE_PICKLE'
data_config['test_sample_pickle'] = 'PATH_TO_TEST_SAMPLE_PICKLE'
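The caching idea behind these pickle paths can be sketched as below. This is an assumption about the general pattern, not the repository's actual loader: `load_or_build` and `build_fn` are hypothetical names, and `build_fn` stands for whatever expensive step (reading and resampling the audio) produces the samples.

```python
import os
import pickle


def load_or_build(pickle_path, build_fn):
    """Return cached samples if the pickle exists, otherwise build and cache them.

    Hypothetical helper: build_fn is any zero-argument callable that returns
    a picklable object, e.g. a list of preprocessed samples.
    """
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as f:
            return pickle.load(f)

    samples = build_fn()

    # Create the parent directory on first use so any path can be chosen.
    os.makedirs(os.path.dirname(pickle_path) or ".", exist_ok=True)
    with open(pickle_path, "wb") as f:
        pickle.dump(samples, f, protocol=pickle.HIGHEST_PROTOCOL)
    return samples
```

The first call pays the full preprocessing cost; later calls with the same path only deserialize the file. If the source data or the preprocessing changes, the stale pickle has to be deleted by hand.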

Training

Specify the paths for --restore_teacher, --pretrained, and --log in train_tinywase.py, then run:

python train_tinywase.py
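For orientation, options like these are commonly declared with argparse, where the `default` values are what you would edit in the script. The sketch below is hypothetical and may not match train_tinywase.py: only the three flag names come from this README, while the placeholder defaults and the help texts (inferred from the flag names) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical declaration of the three path options named in this README."""
    parser = argparse.ArgumentParser(description="Sketch of TinyWASE training options")
    # Assumed meaning: checkpoint of the teacher model used for distillation.
    parser.add_argument("--restore_teacher", type=str,
                        default="PATH_TO_TEACHER_CHECKPOINT")
    # Assumed meaning: pretrained weights used to initialize the model being trained.
    parser.add_argument("--pretrained", type=str,
                        default="PATH_TO_PRETRAINED_CHECKPOINT")
    # Assumed meaning: location where training logs are written.
    parser.add_argument("--log", type=str, default="PATH_TO_LOG")
    return parser


# Passing an explicit list makes the example independent of the real command line.
args = build_parser().parse_args(["--log", "./logs/demo"])
print(args.restore_teacher, args.pretrained, args.log)
```

With a declaration of this kind, a path can be set either by editing the default in the script or by passing the flag on the command line.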

Evaluation

Specify the paths for --restore and --log in eval_tinywase.py, then run:

python eval_tinywase.py

Citations

If you find this repo helpful, please consider citing:

@article{huang2021compress,
  title={Compressing Speaker Extraction Model with Ultra-low Precision Quantization and Knowledge Distillation},
  author={Huang, Yating and Hao, Yunzhe and Xu, Jiaming and Xu, Bo}
}

@inproceedings{hao2021wase,
  title={Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Environments},
  author={Hao, Yunzhe and Xu, Jiaming and Zhang, Peng and Xu, Bo},
  booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={6104--6108},
  year={2021},
  organization={IEEE}
}

@article{hao2020unified,
  title={A Unified Framework for Low-Latency Speaker Extraction in Cocktail Party Environments},
  author={Hao, Yunzhe and Xu, Jiaming and Shi, Jing and Zhang, Peng and Qin, Lei and Xu, Bo},
  journal={Proc. Interspeech 2020},
  pages={1431--1435},
  year={2020}
}

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

