text_to_wav_manual

Text to wav manual

[TOC]

Fetch language resources

Install git-lfs Git Large File Storage
Clone resources repository

git clone http://www.modelscope.cn/damo/speech_sambert-hifigan_tts_zhitian_emo_zh-cn_16k.git

.
├── configuration.json
├── description
├── README.md
├── resource
├── resource.zip      <----- dependency resources
└── voices.zip

The resource.zip will be used in the next step.

Text to wav

Make sure you have well trained AM&Voc models and configuration files, and the text input file like the following:

徐玠诡谲多智，善揣摩，知道徐知询不可辅佐，掌握着他的短处以归附徐知诰。
许乐夫生于山东省临朐县杨善镇大辛庄，毕业于抗大一分校。
宣统元年（1909年），顺德绅士冯国材在香山大黄圃成立安洲农务分会，管辖东海十六沙，冯国材任总理。
学生们大多住在校区宿舍，通过参加不同的体育文化俱乐部及社交活动，形成一个友谊长存的社会圈。
学校的“三节一会”（艺术节、社团节、科技节、运动会）是显示青春才华的盛大活动。
雪是先天自闭症患者，不懂与人沟通，却拥有灵敏听觉，而且对复杂动作过目不忘。
勋章通过一柱状螺孔和螺钉附着在衣物上。
雅恩雷根斯堡足球俱乐部（）是一家位于德国雷根斯堡的足球俱乐部，处于德国足球丙级联赛。
亚历山大·格罗滕迪克于1957年证明了一个深远的推广，现在叫做格罗滕迪克–黎曼–罗赫定理。
...
...

Run the command below to generate wav files:

CUDA_VISIBLE_DEVICES=0 python kantts/bin/text_to_wav.py --txt TEXT_FILE_PATH --output_dir OUTPUT_DIR --res_zip RESOURCE_ZIPFILE_PATH --am_ckpt AM_CKPT_PATH --voc_ckpt VOC_CKPT_PATH --speaker YOUR_SPEAKER_NAME

Then get the wav files in the OUTPUT_DIR directory.

.
├── 0_0_mel_gen.wav
├── 0_1_mel_gen.wav
├── 0_2_mel_gen.wav
├── 1_0_mel_gen.wav
├── 1_1_mel_gen.wav
├── 1_2_mel_gen.wav
├── 2_0_mel_gen.wav
├── 2_1_mel_gen.wav
├── 2_2_mel_gen.wav
├── 3_0_mel_gen.wav
├── 3_1_mel_gen.wav
├── ...
├── feat
├── stdout.log
└── symbols.lst

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

text_to_wav_manual

Text to wav manual

Fetch language resources

Text to wav

Clone this wiki locally