Gradio demo #110
Replies: 17 comments
-
I'm not familiar with Gradio. I tried it for StyleTTS but had no success. I'll take a look when I get time, but if anyone is interested in making a demo in the meantime, feel free to contribute!
-
Someone is already working on it: #53, and we are figuring out some details of it.
-
Hi @AK391. I’ve released a Gradio demo here with voice cloning, multi-speaker support, and LJSpeech support.
-
@fakerybakery I think for the default voices it would be great if you could find all the audio samples in the training data, compute the style of each sample, take the average, and save it as the speaker embedding. This is probably more efficient than computing the style every time the demo is run, and also a more accurate reflection of the speaker.
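A minimal sketch of the averaging idea above, in plain Python. The `compute_style` callable is a hypothetical stand-in for whatever function the demo uses to extract a style vector from one audio file. The JSON file format and all function names here are assumptions for illustration, not part of StyleTTS2.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List, Sequence


def average_style(
    audio_paths: Iterable[str],
    compute_style: Callable[[str], Sequence[float]],  # hypothetical style extractor
) -> List[float]:
    """Return the element-wise mean of the style vectors of all given samples."""
    total: List[float] = []
    count = 0
    for path in audio_paths:
        style = [float(v) for v in compute_style(path)]
        if count == 0:
            total = [0.0] * len(style)
        elif len(style) != len(total):
            raise ValueError(f"style vector for {path} has a different length")
        for i, value in enumerate(style):
            total[i] += value
        count += 1
    if count == 0:
        raise ValueError("no audio samples given")
    return [t / count for t in total]


def save_speaker_embedding(embedding: Sequence[float], out_path: str) -> None:
    """Store the averaged embedding once, so it is not recomputed at startup."""
    Path(out_path).write_text(json.dumps(list(embedding)))


def load_speaker_embedding(path: str) -> List[float]:
    """Load a previously saved speaker embedding."""
    return json.loads(Path(path).read_text())
```

With something like this, the demo could load each default voice from its saved file at startup instead of recomputing styles from reference audio on every launch.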
-
Yes, you’re probably right. No wonder starting the demo took so long each time! Thank you, I’ll push a fix tomorrow :)
-
Hi, someone asked here if I would release a local Gradio GUI to run (the comment was later deleted for some reason, but it was still in my inbox). I am planning to eventually release it and perhaps make a PR to the main repository, but the code quality is currently pretty... low. I'm going to clean it up a bit and then try to release it.
-
Thanks to @AK391 for posting this solution on X/Twitter! Just realized you can run any Hugging Face space on Docker:

    docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all \
        registry.hf.space/styletts2-styletts2:latest python app.py
-
A few more features that could be added:
Thanks again for your help in making the demo!
-
Hi, I can try to implement this. 1 and 2 seem doable, but 3 seems a bit harder. I'll look into this later today! Thanks for the suggestions!
-
Can you please remove the "Access code" in the "Long Text"?
-
OK, I'll either remove the long text feature in a couple of minutes or add a character limit.
-
Hi @yl4579, a couple things:
-
@fakerybakery thanks a lot for your reply. I'm looking forward to the local version. I tested the Hugging Face demo and it looks awesome!
-
Would you like to start by making a local copy of the current HF demo and then iterating on it to improve it? @fakerybakery
-
Yeah, I'll start doing that. However, I'm using macOS and can't figure out how to install espeak-ng for phonemizer (I tried MacPorts but it didn't work; maybe I'll develop it on a VM).
-
@fakerybakery Thank you for this! Just came across it yesterday, and it looks fantastic. Have you considered (and if not, definitely no worries) building out a handler.py for StyleTTS2 on HuggingFace, to allow people to use it via the hosted Inference Endpoints?
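For anyone curious what such a handler.py might look like, here is a rough sketch. It assumes the usual custom-handler shape for Inference Endpoints: an `EndpointHandler` class whose `__init__` receives the repository path and whose `__call__` receives the request payload as a dict. The `synthesize` function, the `synthesize_fn` parameter, and the `audio_base64` response key are hypothetical placeholders, not existing StyleTTS2 or Hugging Face names.

```python
import base64
from typing import Any, Callable, Dict


def synthesize(text: str) -> bytes:
    """Hypothetical placeholder: a real handler would run StyleTTS2
    inference here and return the encoded audio (e.g. WAV) as bytes."""
    raise NotImplementedError("plug StyleTTS2 inference in here")


class EndpointHandler:
    def __init__(self, path: str = "",
                 synthesize_fn: Callable[[str], bytes] = synthesize):
        # `path` is the model repository directory; a real handler would
        # load model weights from it once, at startup.
        self.path = path
        # Injectable for illustration and testing (an assumption of this sketch).
        self.synthesize_fn = synthesize_fn

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        # Read the text to speak from the "inputs" field of the payload.
        text = data.get("inputs", "")
        if not isinstance(text, str) or not text.strip():
            raise ValueError('expected a non-empty string under "inputs"')
        audio = self.synthesize_fn(text)
        # Base64-encode the audio bytes so they fit in a JSON response.
        return {"audio_base64": base64.b64encode(audio).decode("ascii")}
```

The synthesis function is injected so the request handling can be exercised without loading a model; the actual model loading and audio encoding would still need to be written.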
-
Hi, congrats on StyleTTS2! It would be great to set up a Gradio demo for it on Hugging Face. You can see the guide to get started here: https://huggingface.co/docs/hub/spaces-sdks-gradio and here is a recent example: https://huggingface.co/spaces/coqui/xtts, @yvrjsharma