Releases: intunist/nnsvs-english-support
v0.10.0: Proper BrE support added to public HED files
Internal HED files have supported BrE correctly for over a year; this support is now available publicly.
- HED is now separated into two variants
- AR HED is for AR and Diffusion training. It is smaller and does not affect quality.
- Sinsy HED is for sinsy-esque models (FFConvLSTM, Conv1D, etc). We found it gives better quality for these models.
- Romaji removed from training dictionary to avoid conflicts.
- British (and Australian/International) English Support is now available in the HED file. Please refer to our new guide and phoneme docs for reference.
- New documentation is available regarding English labeling.
- Duration and Timelag training is improved thanks to the new HED formatting (this took many attempts)
v0.9.1: Minor Fix for Supplementary Japanese
Minor adjustment, changing [t][s] to [th] for Japanese.
Refer to v0.9.0 for rest of update notes.
v0.9.0: Supplementary Japanese
Due to demand, support for Japanese has been added to the dictionary. Follow the readme for the phoneme selection.
[sil] has been removed from the HED, please use [pau] exclusively.
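A quick way to check whether a dataset still uses [sil] is to scan its labels. This is a minimal sketch, assuming mono labels in the common HTS-style `start end phoneme` layout with one segment per line. The helper name `find_removed_phonemes` and the sample phonemes are hypothetical and are not part of NNSVS or this repository.

```python
# Assumption: only "sil" is listed here; extend the set with any other
# phoneme that your HED version no longer defines.
REMOVED_PHONEMES = {"sil"}


def find_removed_phonemes(lines, removed=REMOVED_PHONEMES):
    """Return (line_number, phoneme) pairs for labels using removed phonemes."""
    hits = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Only well-formed "start end phoneme" lines are inspected.
        if len(fields) == 3 and fields[2] in removed:
            hits.append((number, fields[2]))
    return hits


# Illustrative label content, not taken from a real dataset.
labels = [
    "0 3000000 sil",
    "3000000 5000000 k",
    "5000000 9000000 a",
    "9000000 12000000 pau",
]
hits = find_removed_phonemes(labels)
for number, phoneme in hits:
    print(f"line {number}: replace [{phoneme}] with [pau]")
```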
v0.8.1: optimized VUV values
Corrected based on client training.
This release corrects the HED for better handling of VUV, which also yields a clearer model after training.
v0.8.0: Now managed by Intunist
Intunist has taken over maintenance of English support.
Removed several useless tone flags. Prepare for further changes.
Full Changelog: v0.7.1...v0.8.0
v0.7.1: HED Update
Updated HED file. Please update your in_dim settings.
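In NNSVS recipes, in_dim generally follows from the number of questions in the HED file, which is why a HED update changes it. The sketch below counts question lines as a starting point. It assumes the usual HTS-style layout where binary questions start with `QS` and numeric ones with `CQS`. The helper `count_hed_questions` and the sample questions are hypothetical. Some models add extra input features on top of the question count, so confirm the final value against your own recipe.

```python
import re


def count_hed_questions(hed_text):
    """Count binary (QS) and numeric (CQS) question lines in HED text."""
    counts = {"QS": 0, "CQS": 0}
    for line in hed_text.splitlines():
        # A question line starts with the keyword followed by whitespace.
        match = re.match(r"\s*(QS|CQS)\s", line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative HED fragment, not copied from the released files.
sample_hed = r'''
QS "C-Phone_pau" {*-pau+*}
QS "C-Phone_Vowel" {*-aa+*,*-iy+*}
# comment lines and blank lines are ignored
CQS "e1" {/E:(\d+)]}
'''
counts = count_hed_questions(sample_hed)
total = counts["QS"] + counts["CQS"]
print(counts, "total questions:", total)
```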
v0.7.0: [br] Removal, "correct_vuv_by_phone" support.
v0.7.0 brings huge improvements to the quality of resulting models by using a new VUV correction feature of NNSVS and removing the flawed [br] phoneme.
You will need to update your dataset for this version.
- 'correct_vuv_by_phone' allows us to specify the desired VUV value for specific phonemes to prevent VUV errors.
It does not require any changes to your dataset. It will take effect automatically as long as "force_fix_vuv" is true and you use the latest HED file.
- Support for the [br] phoneme is removed in this version due to a DRASTIC reduction in quality and stability for models which utilized it.
It served no practical purpose, as NNSVS can handle breaths automatically as part of the already existing [pau] phoneme.
You will need to update your dataset as the phoneme is no longer provided. Remove [br] and replace it with [pau], using ONE [pau] phoneme for each section of silence.
While we would have gladly kept the [br] phoneme, its negative effects could not be ignored. After much testing it was decided to remove it. Thanks to the individuals who provided their datasets to confirm.
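The [br] migration described above can be scripted. This is a minimal sketch, assuming mono labels in the common HTS-style `start end phoneme` layout with one segment per line. The function name `replace_br_with_pau` and the sample phonemes are hypothetical and are not part of NNSVS or this repository. It renames [br] to [pau] and merges back-to-back [pau] segments so that each section of silence ends up as a single phoneme.

```python
def replace_br_with_pau(lines):
    """Rename 'br' to 'pau' and merge consecutive 'pau' segments.

    Each non-empty input line is assumed to be '<start> <end> <phoneme>'.
    """
    merged = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        start, end, phoneme = line.split()
        if phoneme == "br":
            phoneme = "pau"
        if merged and phoneme == "pau" and merged[-1][2] == "pau":
            # Extend the previous pause instead of adding a second one.
            merged[-1][1] = end
        else:
            merged.append([start, end, phoneme])
    return [" ".join(segment) for segment in merged]


# Illustrative label content, not taken from a real dataset.
example = [
    "0 5000000 pau",
    "5000000 7000000 br",
    "7000000 9000000 hh",
    "9000000 12000000 ay",
    "12000000 15000000 br",
]
converted = replace_br_with_pau(example)
print("\n".join(converted))
```

Run it on a copy of the dataset first and spot-check a few files, since merged pauses change the segment count.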
v0.6.0: ENUNU flag support, phoneme update, cleanup.
This update adds support for ENUNU's flags, allowing a model to swap timbre based on user input.
The only phonetic change is replacing [ol] with [trash] to bring it in line with DYVAUX's romance language support.
Beyond that, this update only adds some general cleanup and shouldn't require dataset changes in most cases.
v0.5.0: [cs] to [cl]. Different yet Same.
In the last release [cl] was changed to [ct]. Now, in this release, [cs] is being changed to [cl].
[cl] now works about the same as it does for Japanese labeling. This should resolve the confusion users were having.
[cl] - For when a stop/popped consonant is held for an abnormally long time, to sustain the silence/closure.
[ct] - For toggling the closure state of the consonant, to override context. (Refer to LAB_HOWTO.txt for an explanation.)
Both phonemes are technically optional and a proper dataset can be created without using either. They just provide more control for weird edge cases.
v0.4.1: closure toggle, to fix some confusion
This is a minor release to correct some documentation and a minor phoneme change.
[cl] was always intended as a "toggle" for closures and was not meant to be used for labeling every closure. It has been replaced with the [ct] phoneme to avoid confusion.
If you used [cl] before, you may need to remove and correct the usage.
Please refer to the LAB_HOWTO.txt file for correct usage.
Keep in mind that [ct] is an optional phoneme and you can go without using it in a dataset. You'll lose some control but the overall result will be about the same.
The [cl] phoneme is removed in this release. Thank you.