Voice to semantic #19
Comments
The dataset creation code is up at https://github.com/gitmylo/bark-data-gen. To get the semantics from a voice, you have to use a trained HuBERT quantizer model. See a problem? The quantizer cannot be improved for a specific voice, because all you could train it on is its own previous outputs. To understand why it works, you need to understand how bark works: https://github.com/gitmylo/audio-webui/wiki/how-bark-works
Dear gitmylo, I also want to know how to create semantic data from wav source files.
If you want to train, you'll need a text dataset in the language you want to train for. For example, you can modify the bark-data-gen code to load text files in another language. Then prepare the dataset and train, as explained in https://github.com/gitmylo/bark-voice-cloning-HuBERT-quantizer#how-do-i-train-it-myself, and follow the other steps there.
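As a rough illustration of the "load text files in another language" step, here is a minimal sketch of a prompt loader. It is not taken from bark-data-gen: the function name `load_prompts`, the folder layout (plain `.txt` files, one sentence per line), and the length limits are all assumptions made for this example, and you would still have to wire the result into wherever the generator picks its prompts.

```python
import random
from pathlib import Path


def load_prompts(text_dir, min_chars=10, max_chars=200, seed=0):
    """Hypothetical helper: collect usable lines from every .txt file in text_dir.

    Assumes one sentence per line; the length limits are arbitrary defaults.
    """
    prompts = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip empty, very short, or very long lines.
            if min_chars <= len(line) <= max_chars:
                prompts.append(line)
    # Shuffle deterministically so repeated runs see the same order.
    random.Random(seed).shuffle(prompts)
    return prompts
```

Each returned string would then serve as one text prompt for generating a semantic/wav pair.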
If I understood correctly, you used a custom semantic-voice dataset to train your HuBERT model. Can you tell me how to create this dataset? In particular, how do you get the semantics from a voice? Many thanks for this work.