Mel or wav? #18

Open
howitry opened this issue Sep 6, 2024 · 1 comment
howitry commented Sep 6, 2024

Great work! I want to ask if you have tried using mel as input? If mel is used as input and the same bitrate is maintained (e.g. frameshift=256, encoder downsampled by 3 times), will the performance of the model change significantly?

@jishengpeng
Owner

> Great work! I want to ask if you have tried using mel as input? If mel is used as input and the same bitrate is maintained (e.g. frameshift=256, encoder downsampled by 3 times), will the performance of the model change significantly?

From my perspective, it seems feasible to input Mel spectrograms and maintain the same or even higher compression ratios without significantly degrading performance. However, I am puzzled as to why most current codecs do not adopt this approach. What is the rationale behind this decision?
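
To make the "same bitrate" condition from the question concrete, here is a rough sketch of the arithmetic. It reads "frameshift=256, encoder downsampled by 3 times" as a mel hop of 256 samples followed by a total encoder downsampling factor of 3, which is one possible interpretation. The 24 kHz sample rate, the single 4096-entry codebook, and the helper names `token_rate` and `bitrate_bps` are illustrative assumptions and do not come from this thread or from the repository.

```python
import math


def token_rate(sample_rate: int, hop_length: int, encoder_downsample: int) -> float:
    """Tokens per second when one token spans hop_length * encoder_downsample samples."""
    return sample_rate / (hop_length * encoder_downsample)


def bitrate_bps(tokens_per_second: float, codebook_size: int, num_codebooks: int = 1) -> float:
    """Bits per second: each token carries log2(codebook_size) bits per codebook."""
    return tokens_per_second * num_codebooks * math.log2(codebook_size)


# Assumed values for illustration only (not stated in the thread).
SAMPLE_RATE = 24_000   # Hz
CODEBOOK_SIZE = 4096   # entries in a single codebook

# Mel front end: hop of 256 samples, then the encoder downsamples by a factor of 3.
mel_rate = token_rate(SAMPLE_RATE, hop_length=256, encoder_downsample=3)

# Waveform front end with the same overall stride (256 * 3 = 768 samples per token).
wav_rate = token_rate(SAMPLE_RATE, hop_length=1, encoder_downsample=768)

print(f"mel input: {mel_rate} tokens/s, {bitrate_bps(mel_rate, CODEBOOK_SIZE)} bps")
print(f"wav input: {wav_rate} tokens/s, {bitrate_bps(wav_rate, CODEBOOK_SIZE)} bps")
```

Under these assumptions both front ends produce the same token rate and bitrate, so any quality difference would come from what the mel transform discards (for example phase), not from the compression ratio itself.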
