
Text to Speech with PyTorch (English and Mongolian)


bitsoft-maax/pytorch-dc-tts

 
 


PyTorch implementation of "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention", based partially on the following projects:

Online Text-To-Speech Demo

The following notebooks can be run on Google Colab (https://colab.research.google.com):

For audio samples and pretrained models, see the notebook links above.

Training/Synthesizing English Text-To-Speech

The English TTS uses the LJ-Speech dataset.

  1. Download the dataset: python dl_and_preprop_dataset.py --dataset=ljspeech
  2. Train the Text2Mel model: python train-text2mel.py --dataset=ljspeech
  3. Train the SSRN model: python train-ssrn.py --dataset=ljspeech
  4. Synthesize sentences: python synthesize.py --dataset=ljspeech
    • The WAV files are saved in the samples folder.
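The four steps above can be chained from one script. The following is a minimal sketch, not part of the repository: the helper names `pipeline_commands` and `run_pipeline` are hypothetical, and only the script names and the `--dataset` flag come from the steps listed above. It assumes each script exits on its own; if a training script runs until interrupted, the chain would stall at that step, so the default is a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names and the --dataset flag are taken from the steps above.
STEPS = [
    "dl_and_preprop_dataset.py",  # 1. download and preprocess the dataset
    "train-text2mel.py",          # 2. train the Text2Mel model
    "train-ssrn.py",              # 3. train the SSRN model
    "synthesize.py",              # 4. synthesize sentences
]


def pipeline_commands(dataset, python=sys.executable):
    """Return one argv list per step (hypothetical helper)."""
    return [[python, script, f"--dataset={dataset}"] for script in STEPS]


def run_pipeline(dataset, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in pipeline_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain as soon as a step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline("ljspeech", dry_run=True)
```

Passing `"mbspeech"` instead of `"ljspeech"` produces the same four commands for the Mongolian dataset.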

Training/Synthesizing Mongolian Text-To-Speech

The Mongolian text-to-speech uses 5 hours of audio from the Mongolian Bible.

  1. Download the dataset: python dl_and_preprop_dataset.py --dataset=mbspeech
  2. Train the Text2Mel model: python train-text2mel.py --dataset=mbspeech
  3. Train the SSRN model: python train-ssrn.py --dataset=mbspeech
  4. Synthesize sentences: python synthesize.py --dataset=mbspeech
    • The WAV files are saved in the samples folder.
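After synthesis, a quick way to check the results is to list the generated files and their durations. This is a rough sketch using only the Python standard library. The helper `list_wavs` is hypothetical, and it assumes the files in the samples folder are PCM-encoded WAVs with a `.wav` extension, which is the only encoding the `wave` module reads.

```python
import wave
from pathlib import Path


def list_wavs(folder="samples"):
    """Return (file name, duration in seconds) for each WAV file in folder."""
    results = []
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration is the frame count divided by the sample rate.
            duration = wav.getnframes() / wav.getframerate()
        results.append((path.name, duration))
    return results


if __name__ == "__main__":
    for name, seconds in list_wavs():
        print(f"{name}: {seconds:.2f} s")
```

An empty result means that no `.wav` files were found in the folder, for example because synthesis has not been run yet.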
