Question about voice speaker Style #12

Open
MavisHoot opened this issue Apr 27, 2024 · 2 comments
Comments

@MavisHoot

I am trying to train a TTS model, and I have a question about speaker style. My dataset contains multiple speakers with different speaking styles. Does the model retain a style for each voice, does it use only one style, or does the style depend on the reference audio? For example, my dataset includes an Indian speaker who pauses nervously in conversation. If I train on the whole dataset and then run inference with one audio clip from that speaker as the reference, will the output inherit the nervous speaking style? I look forward to your response, and thanks for this great repo.

@MavisHoot
Author

To put the question better: can you train it on both a narrator voice and a conversational voice and get both speaking styles, or will I need to train separate models to achieve that?

@KdaiP
Owner

KdaiP commented Sep 13, 2024

Hi, in StableTTS, the Mel spectrogram (with a time length of t) is compressed into a global condition embedding with a time length of 1. By visualizing this embedding, you can observe that it clusters according to the speaker ID, meaning the model retains some speaker-specific characteristics. However, the embedding also contains other features, such as emotion, since these traits are not explicitly disentangled in the current setup.
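
To make the "time length t to time length 1" idea concrete, here is a minimal sketch in plain Python. It is not the StableTTS code: plain averaging over time stands in for the learned reference encoder, and the names `global_embedding` and `cosine_similarity` and the toy spectrograms are hypothetical, chosen only for illustration.

```python
import math


def global_embedding(mel):
    """Collapse a mel spectrogram given as [n_mels][t] into a vector of length n_mels.

    Assumption: simple temporal averaging replaces the learned encoder.
    The result has a time length of 1 regardless of the clip duration t.
    """
    return [sum(channel) / len(channel) for channel in mel]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy data (made up): 3 mel channels, clips of different durations.
# clip_a1 and clip_a2 imitate two clips from the same speaker, clip_b another speaker.
clip_a1 = [[1.0, 1.2, 0.8, 1.0], [0.1, 0.3, 0.1, 0.3], [0.5, 0.5, 0.5, 0.5]]
clip_a2 = [[0.9, 1.1, 1.0, 1.0, 0.9, 1.1], [0.2] * 6, [0.4, 0.6, 0.4, 0.6, 0.4, 0.6]]
clip_b = [[0.1] * 5, [1.0, 0.8, 1.2, 1.0, 1.0], [0.3] * 5]

emb_a1 = global_embedding(clip_a1)
emb_a2 = global_embedding(clip_a2)
emb_b = global_embedding(clip_b)

# Clips with similar spectral content land close together; a different one lands farther away.
sim_same = cosine_similarity(emb_a1, emb_a2)
sim_diff = cosine_similarity(emb_a1, emb_b)
print(len(emb_a1), sim_same > sim_diff)
```

Because the single vector is computed from whatever reference clip is supplied, anything that shapes that clip's spectrogram is summarized together in it, which is why speaker identity and traits such as emotion are not separated in this kind of setup.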

If you're looking for better control over the emotional aspect of the generated speech, I would recommend checking out these papers for more advanced approaches:

DC CoMixTTS
DiCLET-TTS

I hope this helps! Let me know if you have any further questions!
