[Feature]: AutoPitch - automatic pitch detection #786
Comments
SVC is no longer used; it is too old. Instead we use RVC, which is much better and more up to date. |
But have you read it? |
Sorry, I don't know much English. |
I think they are asking for the same feature in RVC. I agree with them, it would be very useful! Then, when running inference, you don't need to adjust the semitones; it does so on its own by detecting the pitch of the input audio. Sorry for the bad Spanish in the original message; it was auto-translated, I just wanted to make sure you understand. For those who don't speak Spanish: this feature would be very useful because you wouldn't have to adjust the pitch manually, i.e. -12 to +12 semitones. It would automatically detect the pitch, so if your model is a male voice and your input is a female voice, it would lower the output pitch to compensate and make it sound more natural. I used this feature all the time in SVC and found it very useful! Although it didn't always work that well, it was still useful to have. I'm not sure it would work well with singing audio, but with speaking audio it's really great. |
For such functionality to work, there should be some kind of record of what the model was trained on. |
Having looked into it a bit more, the way it works in SVC is that an f0 predictor is trained alongside the main model, which explains why it wouldn't be possible with RVC. It is a shame, as this would be very useful. |
I also had a really excellent experience with SVC automatic pitch detection. It made the spoken voice really realistic, actually better than RVC or Applio, where a lot of manual editing is often necessary to get a realistic result. |
Me too, it was really useful for speaking audio. Maybe some brave soul can add this to Applio. |
Shouldn't this be pretty simple in general? Calculate a mean f0 from the training data. Then calculate the mean f0 from your clip before inference and shift by the difference. |
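A minimal sketch of the mean-f0 idea from the comment above, not code from SVC, RVC, or Applio. It assumes the f0 contours have already been extracted as per-frame values in Hz with unvoiced frames marked as 0, and that a mean f0 for the training data is available. The names `mean_f0` and `auto_transpose` are hypothetical.

```python
import math


def mean_f0(f0_frames):
    """Mean f0 in Hz over voiced frames (assumption: unvoiced frames are 0)."""
    voiced = [f for f in f0_frames if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in f0 contour")
    return sum(voiced) / len(voiced)


def auto_transpose(input_f0, model_mean_f0):
    """Semitone shift that moves the input's mean f0 onto the model's mean f0.

    An interval in semitones is 12 * log2(f_target / f_source).
    """
    shift = 12.0 * math.log2(model_mean_f0 / mean_f0(input_f0))
    return round(shift)


# Input averages 220 Hz, model was trained around 110 Hz: one octave down.
print(auto_transpose([220.0, 0.0, 220.0], 110.0))  # -12
```

The shift is rounded to whole semitones because the manual transpose setting discussed in this thread is given in semitones; a fractional shift could be returned instead if the inference code accepts one.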
Interesting. |
@Bebra777228 could you share the code used on SVC to do that "AutoPitch" function? |
I can't precisely determine which file this is implemented in, but I assume it might be the models.py file. This file contains the relevant parameter. Overall, it's easiest to search through the code. You might find something useful if you use this search link. |
Yes, it's code that detects the input pitch, compares it to the model pitch, then calculates the best transpose. |
I implemented it for fun, making the algorithm a bit more robust for handling edge cases. I also added RMS extraction at the same time. It's not a game changer, but it's convenient. |
Feel free to open a PR so we can see your code. |
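The implementation mentioned above was not shared in this thread, so the following is only a guess at what "more robust" could look like, not that code. It assumes per-frame f0 values in Hz, uses a median in the log-frequency domain instead of a mean, ignores frames outside a plausible vocal range, falls back to no shift when too few voiced frames exist, and clamps the result. All names and default values (`robust_transpose`, `fmin`, `fmax`, `max_shift`, `min_voiced`, `rms`) are hypothetical.

```python
import math
import statistics


def robust_transpose(input_f0, model_f0, fmin=50.0, fmax=1100.0,
                     max_shift=12, min_voiced=10):
    """Semitone shift from median pitch, with basic edge-case handling."""

    def median_semitones(frames):
        # Keep only frames in a plausible range; this also drops unvoiced zeros.
        voiced = [12.0 * math.log2(f) for f in frames if fmin <= f <= fmax]
        if len(voiced) < min_voiced:
            return None  # too little pitched audio to trust an estimate
        return statistics.median(voiced)

    src = median_semitones(input_f0)
    tgt = median_semitones(model_f0)
    if src is None or tgt is None:
        return 0  # fall back to no transposition
    shift = round(tgt - src)
    # Clamp so a bad estimate cannot produce an extreme shift.
    return max(-max_shift, min(max_shift, shift))


def rms(samples):
    """Root-mean-square level of a sequence of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

The median makes a few octave errors from the pitch tracker matter much less than they would with a mean, which is relevant for the concern raised earlier that auto pitch did not always work well.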
@tomakorea could you share it by any chance? |
Description
Recently, a question regarding this has already been asked that, in my opinion, was not formulated quite accurately.
I would like to know if you plan to implement a function for automatic pitch detection, similar to how it was implemented in SVC.
I am not sure if this feature is still available in SVC, but about a year and a half ago, during inference, you could check the 'autopitch' box. This allowed the model to automatically adjust to the pitch of the voice in the source recording, resulting in a more realistic voice than with manual pitch adjustment. Even though the feature was unreliable, the results were good. At least, that's how it seemed to me.