Text to Speech with FastSpeech2

FastSpeech2 article and FastSpeech article.

Example

The inference result is audio, but GitHub supports only video+audio formats, so the example is provided as a .mov file.

0005-audio-l1-p1-e1.mov

You can also download a folder with tts-results from Google Drive. It contains 27 audio files with different length, pitch and energy settings for the first three inputs from test_model/input.txt.

Installation guide

Use Python 3.9

conda create -n fastspeech2 python=3.9 && conda activate fastspeech2

Install libraries

pip3 install -r requirements.txt

Download data

bash scripts/download_data.sh

Preprocess data: save pitch and energy

python3 scripts/preprocess_data.py
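For context, the FastSpeech2 paper defines energy as the L2 norm of the STFT magnitude of each frame. The following is a minimal sketch of that definition only; the function name `frame_energy` and the toy spectrogram are hypothetical and are not taken from scripts/preprocess_data.py, which may compute energy differently.

```python
import math


def frame_energy(magnitudes):
    """Return the L2 norm of every frame of a magnitude spectrogram.

    `magnitudes` is a list of frames; each frame is a list of
    non-negative STFT magnitudes, one value per frequency bin.
    """
    return [math.sqrt(sum(m * m for m in frame)) for frame in magnitudes]


# Toy spectrogram with three frames and two frequency bins (made-up values).
toy_spectrogram = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
energies = frame_energy(toy_spectrogram)
print(energies)
```

Each input frame yields one energy value, so the output has the same number of entries as the spectrogram has frames.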

Download my final FastSpeech2 checkpoint

python3 scripts/download_checkpoint.py

Train

To train the model, run:

python3 train.py -c configs/train.json

The final model was trained with the configs/train.json config.

Test

To test the model, run:

python3 test.py

test.py accepts the following arguments:

Config path: -c, --config, default="configs/test.json"
Create multiple audio variants with different length, pitch and energy: -t, --test, default=False
Increase or decrease audio speed: -l, --length-control, default=1
Increase or decrease audio pitch: -p, --pitch-control, default=1
Increase or decrease audio energy: -e, --energy-control, default=1
Checkpoint path: -cp, --checkpoint, default="test_model/tts-checkpoint.pth"
Input texts path: -i, --input, default="test_model/input.txt"
Waveglow weights path: -w, --waveglow, default="waveglow/pretrained_model/waveglow_256channels.pt"
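The argument list above can be sketched as an argparse parser. This is an illustration under assumptions, not the actual contents of test.py: the flag names and defaults come from the list, while the function name `build_parser`, the use of `store_true` for `--test`, the float type for the three control factors, and the help strings are guesses.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments documented for test.py."""
    parser = argparse.ArgumentParser(description="FastSpeech2 inference (sketch)")
    parser.add_argument("-c", "--config", default="configs/test.json",
                        help="config path")
    # Assumption: --test is a boolean switch that defaults to False.
    parser.add_argument("-t", "--test", action="store_true",
                        help="create multiple audio variants")
    # Assumption: the control factors are parsed as floats.
    parser.add_argument("-l", "--length-control", type=float, default=1.0,
                        help="audio speed factor")
    parser.add_argument("-p", "--pitch-control", type=float, default=1.0,
                        help="audio pitch factor")
    parser.add_argument("-e", "--energy-control", type=float, default=1.0,
                        help="audio energy factor")
    parser.add_argument("-cp", "--checkpoint",
                        default="test_model/tts-checkpoint.pth",
                        help="checkpoint path")
    parser.add_argument("-i", "--input", default="test_model/input.txt",
                        help="input texts path")
    parser.add_argument("-w", "--waveglow",
                        default="waveglow/pretrained_model/waveglow_256channels.pt",
                        help="WaveGlow weights path")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-l", "1.5", "-p", "0.5"])
print(args.length_control, args.pitch_control, args.energy_control)
```

With a parser like this, a call such as `python3 test.py -l 1.5 -p 0.5` would change length and pitch while leaving energy at its default of 1.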

Results are saved in test_model/results; that folder already contains an example.
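To illustrate what a length control factor can do, the sketch below follows the length regulator described in the FastSpeech paper, where each phoneme's predicted duration is multiplied by a factor before its hidden state is repeated that many times. Whether test.py applies `-l` in exactly this direction is an assumption that this README does not confirm; the function name `length_regulate` and the toy inputs are hypothetical.

```python
def length_regulate(hidden, durations, alpha=1.0):
    """Repeat each phoneme state round(duration * alpha) times.

    `hidden` holds one item per phoneme, `durations` the predicted number
    of frames per phoneme, and `alpha` the length control factor.
    """
    expanded = []
    for state, duration in zip(hidden, durations):
        repeats = max(0, int(round(duration * alpha)))
        expanded.extend([state] * repeats)
    return expanded


# Toy example: two phoneme states lasting 2 and 4 frames.
phonemes = ["a", "b"]
frames = [2, 4]
print(len(length_regulate(phonemes, frames, alpha=1.0)))
print(len(length_regulate(phonemes, frames, alpha=2.0)))
print(length_regulate(phonemes, frames, alpha=0.5))
```

Under this convention a larger factor produces more frames and therefore longer audio, while a smaller factor shortens it.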

Wandb Report

https://api.wandb.ai/links/tgritsaev/rkir8sp9 (English only)

Credits

This repository is based on a heavily modified fork of the pytorch-template repository. The FastSpeech2 implementation is based on the code from the HSE "Deep Learning in Audio" course seminar and the official FastSpeech2 repository.
