about training vocoder #15
Comments
Hi, thanks for your issue! However, I think these questions should be addressed to the authors of MelGAN.
This is a good question. Since we use the original MelGAN implementation, I think it should be addressed to the authors of MelGAN. I am not sure why they decided to do it, and it seems you are not the first one to wonder about it: descriptinc/melgan-neurips#36
I am not sure where you need this part of the code because I don't see it anywhere. Again, you need to ask the authors of MelGAN. Sorry for the confusion. I will remove the unnecessary code from this repository.
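Also, check this piece of code if you wonder how to reconstruct predictions of the MelGAN generator: SpecVQGAN/vocoder/scripts/train.py, lines 194 to 202 (https://github.com/v-iashin/SpecVQGAN/blob/389445808a6a8301b888fe55e2a5d27b5593cefd/vocoder/scripts/train.py#L194-L202)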
Thank you very much.
Hi, I have a question about training MelGAN.
I noticed that when you train MelGAN, you normalize the audio data before converting it to a mel spectrogram, e.g. in the file vocoder/mel2wav/dataset.py:
    def load_wav_to_torch(self, full_path):
        data = np.load(full_path)
        data = 0.95 * normalize(data)
I would like to know why you normalize it and then multiply by 0.95. After the normalization, is the extracted mel spectrogram the same as the original spectrogram? In other words, does this operation influence the results when we use the vocoder to convert a predicted spectrogram into a waveform?
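For reference, here is a minimal sketch of what such a scaling does to a spectrum. It assumes that `normalize` is plain peak normalization (dividing by the largest absolute sample), which I have not verified against the repository. The helper `peak_normalize` and the synthetic test signal are hypothetical and only serve as an illustration. Because the Fourier transform is linear, the scaled signal has the same spectrum up to a constant gain, which becomes a constant offset after taking the logarithm.

```python
import numpy as np


def peak_normalize(x, peak=0.95):
    """Scale x so that its largest absolute sample equals `peak`.

    Hypothetical stand-in for `0.95 * normalize(data)`, assuming that
    `normalize` divides by the maximum absolute value.
    """
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()  # silent input: nothing to scale
    return peak * x / max_abs


# Synthetic one-second test signal (illustrative data, not from the repository).
sr = 22050
t = np.arange(sr) / sr
audio = 0.3 * np.sin(2 * np.pi * 440 * t) + 0.1 * np.sin(2 * np.pi * 880 * t)

scaled = peak_normalize(audio)
gain = 0.95 / np.max(np.abs(audio))

# Magnitude spectra before and after scaling.
mag_orig = np.abs(np.fft.rfft(audio))
mag_scaled = np.abs(np.fft.rfft(scaled))

# With one second of audio the bin spacing is 1 Hz, so bin 440 is the 440 Hz tone.
offset = np.log(mag_scaled[440]) - np.log(mag_orig[440])

print("peak after scaling:", np.max(np.abs(scaled)))
print("spectra differ only by the gain:", np.allclose(mag_scaled, gain * mag_orig))
print("log offset equals log(gain):", np.isclose(offset, np.log(gain)))
```

Under this assumption the normalized spectrogram is not numerically identical to the original one, but it carries the same content at a different overall level. A vocoder trained on normalized audio would then be expected to produce waveforms at that normalized level rather than at the original loudness.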
Furthermore, when I use your script vocoder/scripts/generate_from_folder.py to generate samples, it fails (that is, the reconstructed audio is far from the original audio). After I modified it as follows, it works:
def main():
    args = parse_args()
    vocoder = MelVocoder(args.load_path)