efficient incremental inference algorithm for fastspeech2 in espnet2

Dan1chu/incremental_text_to_speech

Incremental text-to-speech

A Python implementation of incremental text-to-speech using FastSpeech2.

How is it different from previous work?

  1. It uses a non-autoregressive text-to-speech model (FastSpeech2) instead of the autoregressive models used in previous work (Tacotron, Transformer-TTS).
  2. It uses a simple context-discard algorithm to speed up inference.
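
The second point can be illustrated with a toy sketch. This is not the code in incremental_tts.py. It assumes that "context discard" means re-synthesizing each new chunk together with only a bounded number of preceding tokens, then throwing away the frames that belong to that context. The names `incremental_synthesis`, `toy_synthesize`, `chunk_size`, and `max_context` are hypothetical, and the toy synthesizer merely stands in for a non-autoregressive model that returns frames per input token.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical interface: one list of frames per input token.
Synthesizer = Callable[[Sequence[str]], List[List[float]]]


def incremental_synthesis(
    tokens: Sequence[str],
    synthesize: Synthesizer,
    chunk_size: int = 2,
    max_context: int = 3,
) -> Iterator[Tuple[List[str], List[float]]]:
    """Yield (chunk, frames) pairs, re-synthesizing only a bounded left context."""
    for start in range(0, len(tokens), chunk_size):
        chunk = list(tokens[start:start + chunk_size])
        # Assumed meaning of "context discard": keep at most `max_context`
        # previous tokens as context and drop everything older.
        context = list(tokens[max(0, start - max_context):start])
        per_token_frames = synthesize(context + chunk)
        # Frames for the context tokens were already emitted, so skip them.
        new_frames = [
            frame
            for token_frames in per_token_frames[len(context):]
            for frame in token_frames
        ]
        yield chunk, new_frames


def toy_synthesize(seq: Sequence[str]) -> List[List[float]]:
    """Stand-in model: each token produces len(token) identical frames."""
    return [[float(len(tok))] * len(tok) for tok in seq]


if __name__ == "__main__":
    words = ["hi", "there", "how", "are", "you"]
    for chunk, frames in incremental_synthesis(words, toy_synthesize):
        print(chunk, len(frames))
```

With a bounded context, the input passed to the model never exceeds `max_context + chunk_size` tokens, so the cost per chunk stays roughly constant as the sentence grows.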

How to use?

  1. Download the pretrained TTS model and vocoder from https://zenodo.org/record/5498896
  2. Unzip the downloaded file.
  3. Place the unzipped files like this:
incremental_tts
├── exp
│    ├── stats
│    │    ├── train
│    │    ├── energy_stats.npz
│    │    ├── energy_stats.npz
│    │    └── energy_stats.npz
│    └── tts
│         ├── config.yaml
│         └── train.total_count.ave_10best.pth
├── gan_tts.py
└── incremental_tts.py

  4. Install Anaconda.
  5. Create an Anaconda environment (recommended Python version: 3.7.4).
  6. Install the following Python requirements in the Anaconda environment:
  • torch (CUDA build; the CPU-only build is not supported)
  • numpy
  • espnet2 (shipped in the espnet package)
  • pyaudio
  7. Run the script: python incremental_tts.py
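
Before running the script, the setup steps above can be checked with a small preflight sketch. It is not part of the repository. The helper names (`missing_files`, `missing_modules`, `cuda_available`) are hypothetical, and the file list contains only the paths that the directory listing states unambiguously. The stats files are left out because their exact names and nesting are not clear from the listing.

```python
import importlib.util
from pathlib import Path

# Paths taken from the directory layout in this README, relative to the project root.
EXPECTED_FILES = [
    "exp/tts/config.yaml",
    "exp/tts/train.total_count.ave_10best.pth",
    "gan_tts.py",
    "incremental_tts.py",
]

# Import names of the requirements listed in this README.
REQUIRED_MODULES = ["torch", "numpy", "espnet2", "pyaudio"]


def missing_files(root):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def cuda_available():
    """Return True only if torch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("missing files:  ", missing_files("."))
    print("missing modules:", missing_modules())
    print("CUDA available: ", cuda_available())
```

Run it from the incremental_tts directory inside the activated Anaconda environment. Empty lists and a CUDA result of True suggest the environment matches the steps above.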
