A PyTorch implementation of "Selective Listening by Synchronizing Speech with Lips".
- A new version of this code is scheduled to be released here (ClearVoice repo).
- The dataset can be found here.
/data
: Scripts to pre-process the VoxCeleb2 dataset.
/pretrained_slsyn
: The pre-trained SLSyn network, used to extract embeddings.
/src
: Training scripts for the SLSyn network and the reentry model.
If you need the pre-trained SLSyn network weights, please email me from your organizational email address. My email address is pan_zexu at u.nus.edu