Can I use my own language IPA for getting perfect model ? #590

Open
4 tasks done
RifatMamayusupov opened this issue Dec 5, 2024 · 3 comments
Labels
question Further information is requested

Comments

@RifatMamayusupov

Checks

  • This template is only for questions, not feature requests or bug reports.
  • I have thoroughly reviewed the project documentation and read the related paper(s).
  • I have searched for existing issues, including closed ones, and found no similar questions.
  • I confirm that I am using English to submit this report in order to facilitate communication.

Question details

I trained F5 for 35 epochs on a single-speaker dataset of about 40 hours. My model synthesizes short words well, but when synthesizing longer text, it produces speech with a different accent. For example, Uzbek is similar to Turkish, so my model sometimes synthesizes Uzbek text with a Turkish accent. Additionally, there is some noise at the end of the output audio.

How can I resolve this issue?

@RifatMamayusupov RifatMamayusupov added the question Further information is requested label Dec 5, 2024
@Mustaphajudi

Use a bigger dataset and longer audio samples.

@RifatMamayusupov
Author

@Mustaphajudi, can you tell me what audio length to use, such as 10 s or 15 s?
And is a dataset of 50 hours in total enough?

@RifatMamayusupov RifatMamayusupov changed the title Can I use my own language IP for getting perfect model ? Can I use my own language IPA for getting perfect model ? Dec 11, 2024
@ZhikangNiu
Collaborator

@Mustaphajudi, can you tell me what audio length to use, such as 10 s or 15 s? And is a dataset of 50 hours in total enough?

I think that setting is OK.
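
For anyone checking their own data against the numbers discussed above (clips of roughly 10 to 15 s, about 50 hours in total), here is a minimal sketch that sums clip durations in a folder. It is not part of F5 or its tooling: the function names `clip_seconds` and `summarize`, the folder layout, and the 10 to 15 s default range are assumptions made for illustration, and it only reads uncompressed PCM WAV files through Python's standard `wave` module.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Return the duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize(wav_dir, min_s=10.0, max_s=15.0):
    """Summarize clip durations for every .wav file under wav_dir.

    min_s and max_s are illustrative bounds taken from the lengths
    mentioned in this thread, not recommended values from the project.
    """
    durations = [clip_seconds(p) for p in sorted(Path(wav_dir).rglob("*.wav"))]
    total = sum(durations)
    return {
        "clips": len(durations),
        "total_hours": total / 3600.0,
        "mean_seconds": total / len(durations) if durations else 0.0,
        "clips_in_range": sum(1 for d in durations if min_s <= d <= max_s),
    }
```

Calling `summarize("path/to/wavs")` returns a small dictionary, so it is easy to see how much of the dataset falls in the intended length range before training. Files in other formats (MP3, FLAC) would need a different reader.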

3 participants