Adding support for F5-small models, sharing a Hindi checkpoint trained from scratch and adding NFE steps in inference scripts #534

Open
wants to merge 2 commits into
base: main

rumourscape

In this PR I am contributing three things as given in title.

@thesandi99

Oh, Hindi is nice. How many hours did you train the model for? I trained with 60H and my results are not like your model's. What settings did you use?

@rumourscape
Author

Oh, Hindi is nice. How many hours did you train the model for? I trained with 60H and my results are not like your model's. What settings did you use?

I have trained the small config model with about 80 hrs of data for 2.5 million steps.

@SWivid
Owner

SWivid commented Nov 27, 2024

Thanks a lot!
Will check after #518, which enables using YAML for better config.

SWivid requested review from SWivid and ZhikangNiu on November 27, 2024 20:38
@Aunali321

@rumourscape Hi, can you share the training code for this model?

@justinjohn0306
Contributor

I tested the model and it sounds trash

@thesandi99

thesandi99 commented Nov 28, 2024

I tested the model and it sounds trash

Actually, it is better compared to 80H of training.

@rumourscape
Author

I tested the model and it sounds trash

You clearly haven't used it properly then. The ref_audio must be in Hindi, and the ref_text and gen_text must be in Devanagari script. I have shown this model to many colleagues, and they all believe it is probably the current SOTA for Hindi TTS.
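
As a rough aid for the input requirement above, here is a minimal sketch of a pre-flight check that a ref_text or gen_text string contains only Devanagari letters. The helper name `is_devanagari_text` is hypothetical and not part of the F5-TTS repo; it relies only on the Devanagari Unicode block (U+0900 to U+097F) and Python's standard `unicodedata` module.

```python
import unicodedata

# The Unicode "Devanagari" block spans U+0900..U+097F.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F


def is_devanagari_text(text: str) -> bool:
    """Hypothetical helper: True if every letter or combining mark in
    `text` lies in the Devanagari block. Spaces, digits and punctuation
    are ignored; a string with no letters at all returns False."""
    letters = [
        ch for ch in text
        if unicodedata.category(ch)[0] in ("L", "M")  # letters and marks
    ]
    return bool(letters) and all(
        DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in letters
    )


if __name__ == "__main__":
    for sample in ["नमस्ते दुनिया", "namaste duniya", "नमस्ते world"]:
        print(repr(sample), is_devanagari_text(sample))
```

A check like this only catches romanized or mixed-script text; it says nothing about whether the reference audio is actually in Hindi.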

@rumourscape
Author

rumourscape commented Nov 28, 2024

@rumourscape Hi, can you share the training code for this model?

I have used the same training script as given in the F5 repo. The only change I made was to skip the preprocess step and directly plug in the Huggingface datasets using the HFDataset class.
PS: I also avoided the convert_char_to_pinyin function to prevent extra spaces in between characters.
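
For illustration only, a minimal sketch of plain character-level tokenization, where each Unicode code point becomes one token and no separators are inserted between characters. This is not the code from the PR or the F5 repo; `build_char_vocab` and `encode` are hypothetical helpers, and the unknown-token id of -1 is an arbitrary assumption.

```python
def build_char_vocab(texts):
    """Hypothetical helper: map every distinct code point to an integer id."""
    chars = sorted({ch for t in texts for ch in t})
    return {ch: i for i, ch in enumerate(chars)}


def encode(text, vocab, unk_id=-1):
    """Encode text one code point at a time, leaving spacing untouched.
    Code points missing from the vocab map to `unk_id` (an assumption)."""
    return [vocab.get(ch, unk_id) for ch in text]


if __name__ == "__main__":
    corpus = ["नमस्ते"]
    vocab = build_char_vocab(corpus)
    ids = encode("नमस्ते", vocab)
    # One id per code point, including the virama and vowel signs.
    print(len(ids), ids)
```

Note that Devanagari vowel signs and the virama are separate code points, so a word has more tokens than visible syllables.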

@Aunali321

Aunali321 commented Nov 28, 2024

@rumourscape Hi, can you share the training code for this model?

I have used the same training script as given in the F5 repo. The only change I made was to skip the preprocess step and directly plug in the Huggingface datasets using the HFDataset class. PS: I also avoided the convert_char_to_pinyin function to prevent extra spaces in between characters.

That's a good change to add upstream. Would make it much easier to train on other languages.

@rumourscape
Author

rumourscape commented Dec 9, 2024

@SWivid @ZhikangNiu Hi, are you looking into this?

@SWivid
Owner

SWivid commented Dec 9, 2024

@rumourscape Yes, but I'm busy currently. It will be modified as mentioned in #591.
I will come back and add this feature ASAP.

@sudeep333

It would be great if you add Hindi language support. Thank you in advance.
