Gradio All-in-One: Preprocess, Train, TensorBoard, and Interface #23

Open
lpscr opened this issue Sep 12, 2024 · 4 comments

lpscr commented Sep 12, 2024

Hi, @KdaiP

I’m not sure where to post this, so I’ll share it here.

After a lot of testing, I’m working on a quick Gradio all-in-one solution. It includes preprocessing, training, TensorBoard, and the inference interface.

I’m still working on it and haven’t finished yet. I need to fix some issues, but I hope you’ll like it once it’s ready. I’ll send it for testing as soon as it’s done. I’m new to GitHub, so how do I share code with your repo?

Here’s a quick preview of what it looks like so far. The video is muted by default, so please turn the sound on to hear the interface in action.

Let me know what you think!

QuickGradio.mp4
Owner

KdaiP commented Sep 13, 2024

Hi, this sounds really cool! Once you've finished your modifications, you can create a pull request to share the code.

Also, it seems like the issue of audio cutouts has improved in this new model.

Maybe in the future, we can add components for data annotation and data cleaning (which I'm currently organizing), similar to the one-click packages for TTS and GPT-SoVITS, to create an out-of-the-box workflow.

Looking forward to seeing your progress!

@juntaosun

@lpscr
You can create a project branch on GitHub, upload your updated code, and then push it to the main branch to synchronize the code.

Author

lpscr commented Sep 13, 2024

Thank you all!

I've finished a new version, and I hope you like it.
I plan to upload it tomorrow and make a pull request as you suggested.

I have a quick question:
Can I create a fork, add the files, and then submit a pull request?
I'm still new to this process, so I just want to make sure.

In this version, I added:

The ability to start, stop, and resume training, with a save and config option so you don't lose progress.
The ability to start and stop TensorBoard.
A random sample selector for training data to make it easier to use, as well as a reference to compare results.
A seed option, allowing you to use a random seed or fix it if needed.
Automatic model downloads on the first run if the models aren't found locally.
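
For illustration only, two of the features listed above (the seed option and starting/stopping TensorBoard) could be sketched roughly as follows. This is not the code from the Gradio app: `resolve_seed`, `tensorboard_command`, and `TensorBoardRunner` are hypothetical names, and the sketch assumes the `tensorboard` command is available on the PATH.

```python
import random
import subprocess


def resolve_seed(seed, use_random):
    """Return a fresh random seed when requested, otherwise the fixed one."""
    if use_random or seed is None:
        return random.randint(0, 2**32 - 1)
    return int(seed)


def tensorboard_command(logdir, port=6006):
    """Build the command line using TensorBoard's --logdir and --port flags."""
    return ["tensorboard", "--logdir", str(logdir), "--port", str(port)]


class TensorBoardRunner:
    """Start and stop a single TensorBoard child process (hypothetical helper)."""

    def __init__(self, logdir, port=6006):
        self.command = tensorboard_command(logdir, port)
        self.process = None

    def is_running(self):
        # poll() returns None while the child process is still alive.
        return self.process is not None and self.process.poll() is None

    def start(self):
        # Do nothing if TensorBoard is already running.
        if not self.is_running():
            self.process = subprocess.Popen(self.command)

    def stop(self, timeout=5.0):
        if self.is_running():
            self.process.terminate()
            try:
                self.process.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                # Fall back to a hard kill if it does not exit in time.
                self.process.kill()
                self.process.wait()
        self.process = None
```

In a Gradio interface, `start` and `stop` would typically be wired to two buttons, and the resolved seed would be passed to the framework's seeding call before inference.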

I have a question about the sample in the progress tab:
Do we need to resample the files?

Let me know what you think of the new version. I ran some tests, and it's working great! You can easily fine-tune voices, and everything is well-organized. I didn't touch the core code, so there shouldn't be any conflicts. I just created a separate Python file and imported code from your repo, or copied some parts. I also added comments, so it's easy to understand and follow what I did.

full.mp4

Owner

KdaiP commented Sep 14, 2024

Hi, thank you for your hard work on this! Your implementation is both elegant and well-organized!

Also, resampling the files is not required, because the training target of StableTTS (the mel spectrogram) is already extracted from resampled audio.
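
A minimal sketch of why this works, under stated assumptions: if preprocessing resamples every file to one target rate before computing the mel spectrogram, the rate of the file on disk no longer matters. The 44100 Hz target and hop length of 512 used below are illustrative values, not confirmed StableTTS settings; `resample_linear` is a crude linear-interpolation stand-in for a real resampler, and the frame count assumes a centered STFT.

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample by linear interpolation (a stand-in for a proper resampler)."""
    if orig_sr == target_sr:
        return list(samples)
    n_out = round(len(samples) * target_sr / orig_sr)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr  # fractional position in the source
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def num_mel_frames(n_samples, hop_length):
    """Frame count for a centered STFT: one frame per hop, plus one."""
    return 1 + n_samples // hop_length


def frames_after_preprocess(samples, file_sr, target_sr=44100, hop_length=512):
    """Resample to the target rate first, then count mel frames."""
    resampled = resample_linear(samples, file_sr, target_sr)
    return num_mel_frames(len(resampled), hop_length)
```

With this ordering, one second of audio stored at 48000 Hz and one second stored at 44100 Hz yield the same number of mel frames, so the dataset files themselves can stay at their original rates.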
