
Voice Control ideas (brainstorming) #587

Closed
MithrilMan opened this issue Dec 5, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

MithrilMan commented Dec 5, 2024

Since this is a diffusion model, I'm wondering if it would be possible to implement something like what ControlNet did for Stable Diffusion/Flux.

Basically, is it possible to build models that drive the audio generation by, for example, extracting the emotion (or roughness, etc.) from a voice reference different from the one used to "inpaint"?

I'm not asking for this to be implemented now; I just wanted to understand the possibilities a diffusion model offers for voices.
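For what it's worth, the core ControlNet trick could in principle carry over: a small side network maps an auxiliary embedding (e.g. an emotion embedding extracted from a separate reference clip) into the denoiser's latent space through a zero-initialized output layer, so at the start of training the adapter contributes nothing and the frozen base model's behavior is preserved. A minimal numpy sketch of that idea (all names and dimensions here are hypothetical, not F5-TTS code):

```python
import numpy as np

rng = np.random.default_rng(0)

D_LATENT = 8  # hypothetical diffusion-model hidden size
D_EMO = 4     # hypothetical emotion-embedding size

# Frozen base projection, standing in for a pretrained diffusion block.
W_base = rng.standard_normal((D_LATENT, D_LATENT))

# ControlNet-style adapter: the output layer is ZERO-initialized
# (the "zero conv" analogue), so the control branch starts as a no-op.
W_adapt_in = rng.standard_normal((D_EMO, D_LATENT))
W_adapt_out = np.zeros((D_LATENT, D_LATENT))

def block(h, emo_emb):
    """Base block output plus the adapter's control signal."""
    base = h @ W_base
    control = np.tanh(emo_emb @ W_adapt_in) @ W_adapt_out
    return base + control

h = rng.standard_normal((1, D_LATENT))
emo = rng.standard_normal((1, D_EMO))

# Before any adapter training, conditioning has no effect on the output.
assert np.allclose(block(h, emo), h @ W_base)
```

During training only the adapter weights would be updated, letting the control signal grow from zero without disturbing what the pretrained model already does.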

@MithrilMan MithrilMan added the enhancement New feature or request label Dec 5, 2024
SWivid (Owner) commented Dec 5, 2024

Yes, it's possible. That is exactly why we are open-sourcing: to make it easier for everyone to try out any idea.
