A tech demo of MXNet capabilities consisting of a Tacotron implementation. This is a work in progress.
This project was built over 8 weeks, from October to December 2017, at the PiCampus AI School in Rome.
- Multithreaded data iterator
- DSP tools
- CBHG module for spectrograms
- Basic seq2seq example for string reversal. It will be used as the Tacotron backbone
- Encoder with CBHG
- Attention model
- Custom decoder that processes r spectrogram frames (r * mel_bands values) at each time step during cell unrolling
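To make the last item concrete: in Tacotron the decoder emits r non-overlapping mel frames per step, so the training targets are usually regrouped from T frames of mel_bands values into T / r rows of r * mel_bands values. The snippet below is a minimal sketch of that regrouping in plain Python, not the code used in this repository. The function names `group_frames` and `ungroup_frames` are hypothetical, and zero-padding the tail so that T becomes a multiple of r is an assumption.

```python
def group_frames(frames, r):
    """Concatenate every r consecutive mel frames into one decoder-step target.

    frames: list of T frames, each a list of mel_bands values.
    Returns ceil(T / r) rows, each holding r * mel_bands values.
    Assumption: the tail is zero-padded so that T becomes a multiple of r.
    """
    if r < 1:
        raise ValueError("r must be >= 1")
    if not frames:
        return []
    mel_bands = len(frames[0])
    padded = list(frames)
    remainder = len(padded) % r
    if remainder:
        # Pad with silent (all-zero) frames up to the next multiple of r.
        padded += [[0.0] * mel_bands for _ in range(r - remainder)]
    grouped = []
    for start in range(0, len(padded), r):
        row = []
        for frame in padded[start:start + r]:
            row.extend(frame)
        grouped.append(row)
    return grouped


def ungroup_frames(grouped, r):
    """Inverse of group_frames: split each row back into r separate frames."""
    frames = []
    for row in grouped:
        mel_bands = len(row) // r
        for i in range(r):
            frames.append(row[i * mel_bands:(i + 1) * mel_bands])
    return frames
```

With r = 2 and three frames of two bands each, `group_frames` yields two rows of four values, the second one padded with zeros. `ungroup_frames` restores the frame-per-row layout, including the padding frame, which a caller would trim to the original length.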
- Switch to MXNet 1.0
- Switch to Gluon
- Clean up and organize code for better understanding
- install MXNet:
pip install -r requirements.txt
- run:
python tacotron.py
With the default settings, a simple dataset is used for training. Prediction samples are generated at the end of the training phase.
If you want to train on a larger dataset, Kyubyong has cut and formatted this English Bible. You can find his dataset here and the CSV text here.
This project has been developed on
- MXNet 0.12
- librosa
This project was developed by Alberto Massidda and Stefano Artuso during Pi School's AI programme in Fall 2017.
- Thanks to Roberto Barra Chicote for supporting us
- Thanks to Keith Ito (https://github.com/keithito) and Kyubyong Park (https://github.com/Kyubyong) for getting us started