GPT‐SoVITS‐v2‐features (New Features)
1-v2 compared with v1
Version | Language Support (cross-language synthesis between all listed languages) | GPT Training Set Duration | SoVITS Training Set Duration | Inference Speed | Number of Parameters | Text Frontend | Features
---|---|---|---|---|---|---|---
v1 (released in January) | Chinese, Japanese, English | 2k hours | 2k hours | baseline | 200M | baseline | baseline
v2 | Chinese, Japanese, English, Korean, Cantonese | 2.5k hours | VQ encoder: 2k hours; all other parameters: 5k hours | doubled | unchanged | Enhanced logic for Chinese, Japanese, and English | Added speed control, a reference-text-free mode, and better segmentation of mixed-language text
2-v2 Model New Features
(1) SoVITS: Better synthesis quality from low-quality reference audio, especially audio taken from the internet that has severe high-frequency loss and sounds muffled.
(2) Larger training set: Expanded to 5k hours, which improves zero-shot performance and makes the synthesized timbre closer to the reference.
(3) Two added languages: v2 supports cross-language synthesis among five languages. Cross-language synthesis means that the language of the training set and reference audio can differ from the language being synthesized.
(4) Improved text frontend: Continuously iterated and updated. In v2, Chinese and English gained optimized handling of polyphonic characters and words (the same spelling with different pronunciations).
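To make the scope of item (3) concrete, the sketch below counts the ordered (reference language, target language) combinations in which the two languages differ. The language labels are plain illustrative strings, not identifiers taken from the GPT-SoVITS code, and the helper name `cross_language_pairs` is hypothetical.

```python
from itertools import permutations

# Illustrative labels only; these are not identifiers used by GPT-SoVITS.
V1_LANGUAGES = ["Chinese", "Japanese", "English"]
V2_LANGUAGES = V1_LANGUAGES + ["Korean", "Cantonese"]


def cross_language_pairs(languages):
    """Return every (reference_language, target_language) pair where the two differ."""
    return list(permutations(languages, 2))


if __name__ == "__main__":
    print(len(cross_language_pairs(V1_LANGUAGES)))  # 6
    print(len(cross_language_pairs(V2_LANGUAGES)))  # 20
```

Going from three to five languages therefore raises the number of cross-language directions from 6 to 20.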
3-How to use v2
(1) Download the 7z package directly, or get it from huggingface.
(2) Or migrate an existing v1 environment to v2:
1. Run pip install -r requirements.txt to update the environment's packages.
2. Clone the latest code from github.
3. Download the v2 pretrained model files from huggingface and put them into GPT_SoVITS\pretrained_models\gsv-v2final-pretrained.
Additional step for Chinese: download G2PWModel_1.1.zip (the G2PW models), unzip it, rename the extracted folder to G2PWModel, and place it in GPT_SoVITS\text.
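The file placement in the migration steps can be sketched with the Python standard library. This is a minimal sketch, not part of GPT-SoVITS: the helper names `install_g2pw` and `missing_v2_assets` are hypothetical, and the code assumes the zip contains either a single top-level folder or loose files, which has not been checked against the actual archive. It only verifies that the two directories named above exist and are non-empty; it does not validate the model files themselves.

```python
import shutil
import zipfile
from pathlib import Path

# Relative locations taken from the steps above.
PRETRAINED_DIR = Path("GPT_SoVITS") / "pretrained_models" / "gsv-v2final-pretrained"
G2PW_DIR = Path("GPT_SoVITS") / "text" / "G2PWModel"


def install_g2pw(zip_path, repo_root):
    """Unzip the G2PW archive and place its contents at GPT_SoVITS/text/G2PWModel."""
    target = Path(repo_root) / G2PW_DIR
    if target.exists():
        return target  # already present; leave it untouched
    staging = target.parent / "_g2pw_unzip_tmp"  # hypothetical scratch folder
    if staging.exists():
        shutil.rmtree(staging)
    staging.mkdir(parents=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(staging)
    entries = list(staging.iterdir())
    # Assumption: the archive holds either one top-level folder or loose files.
    if len(entries) == 1 and entries[0].is_dir():
        entries[0].rename(target)  # rename the extracted folder to G2PWModel
        staging.rmdir()
    else:
        staging.rename(target)
    return target


def missing_v2_assets(repo_root):
    """Return the expected v2 directories that are absent or empty."""
    missing = []
    for relative in (PRETRAINED_DIR, G2PW_DIR):
        folder = Path(repo_root) / relative
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(relative.as_posix())
    return missing
```

For example, after downloading the archive into the repository root, calling `install_g2pw("G2PWModel_1.1.zip", ".")` followed by `missing_v2_assets(".")` should return an empty list once the pretrained models are also in place.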