Incremental text-to-speech

A Python implementation of incremental text-to-speech using FastSpeech 2.

How is it different from previous work?

It uses a non-autoregressive text-to-speech model (FastSpeech 2 instead of Tacotron or Transformer-TTS).
It uses a simple context-discard algorithm for speed-up.
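
The README does not spell out the context-discard algorithm, so the following is only a minimal sketch of one plausible reading: each new chunk of words is synthesized together with a bounded window of the most recent words, older words are discarded so the model input stays short, and the audio belonging to the context words is dropped before playback. The names `incremental_tts`, `synthesize`, `max_context`, and `fake_synthesize` are hypothetical and not taken from the repository; `synthesize` stands in for a model call that returns one list of audio samples per input word.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List

# Hypothetical model interface: takes a list of words and returns one
# audio segment (a list of samples) per word, in the same order.
Synthesizer = Callable[[List[str]], List[List[float]]]


def incremental_tts(
    chunks: Iterable[List[str]],
    synthesize: Synthesizer,
    max_context: int = 2,
) -> Iterator[List[float]]:
    """Yield audio for each new chunk, conditioning on a bounded context."""
    # deque(maxlen=...) silently discards the oldest words once it is full.
    context: Deque[str] = deque(maxlen=max_context)
    for chunk in chunks:
        if not chunk:
            continue
        segments = synthesize(list(context) + chunk)
        # Keep only the audio of the new words; the context audio was
        # already emitted in an earlier step.
        new_segments = segments[len(context):]
        yield [sample for segment in new_segments for sample in segment]
        context.extend(chunk)


def fake_synthesize(words: List[str]) -> List[List[float]]:
    """Stand-in model: one sample per character of each word."""
    return [[float(len(word))] * len(word) for word in words]


if __name__ == "__main__":
    text_chunks = [["hello", "there"], ["how", "are"], ["you"]]
    for audio in incremental_tts(text_chunks, fake_synthesize):
        print(len(audio), "samples")
```

With `max_context=2`, the model never sees more than two old words plus the new chunk, so the cost per step stays roughly constant however long the full utterance becomes.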

How to use?

Download the pretrained TTS and vocoder models from https://zenodo.org/record/5498896
Unzip the file.
Place the unzipped files like this:

incremental_tts
├── exp
│    ├── stats
│    │    ├── train
│    │    ├── energy_stats.npz
│    │    ├── energy_stats.npz
│    │    └── energy_stats.npz
│    └── tts
│         ├── config.yaml
│         └── train.total_count.ave_10best.pth
├── gan_tts.py
└── incremental_tts.py
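
Before going further, it can help to confirm that the files landed where the tree above expects them. The following is a small sketch, not part of the repository: `missing_paths` and `EXPECTED_PATHS` are hypothetical names, and the individual files under `exp/stats` are deliberately not checked, only the folder itself.

```python
from pathlib import Path
from typing import List

# Paths taken from the tree above, relative to the incremental_tts folder.
# The individual files under exp/stats are not checked here.
EXPECTED_PATHS = [
    "exp/stats",
    "exp/tts/config.yaml",
    "exp/tts/train.total_count.ave_10best.pth",
    "gan_tts.py",
    "incremental_tts.py",
]


def missing_paths(root: str) -> List[str]:
    """Return the expected paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (base / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths("incremental_tts")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```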

Install Anaconda.
Create an Anaconda environment (recommended Python version: 3.7.4).
Install the Python requirements in that environment:

torch (CUDA build, not the CPU-only build)
numpy
espnet2
pyaudio
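
To check the environment before running anything, a short script along these lines can report which of the requirements above are missing and whether the installed torch build can see a GPU. This is a sketch under assumptions: `missing_modules`, `cuda_available`, and `REQUIRED_MODULES` are hypothetical helper names, and the import names are assumed to match the package names listed above.

```python
import importlib.util
from typing import Iterable, List

# Import names for the requirements listed above.
REQUIRED_MODULES = ["torch", "numpy", "espnet2", "pyaudio"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_available() -> bool:
    """Report whether the installed torch build can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also works without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
    print("CUDA available:", cuda_available())
```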

Then run: python incremental_tts.py
