Some audio files result in NAN values in aperiodicity #50

Open
Kal213 opened this issue Jun 24, 2020 · 3 comments

Comments

@Kal213

Kal213 commented Jun 24, 2020

When using wav2world (or DIO and D4C), I've noticed that the returned aperiodicity sometimes contains NaN values. This causes problems when I try to synthesize the audio back.

I've been unable to find exactly what causes it, but I've found that taking the absolute value of the audio data before inputting it will "fix" the issue.

I've attached a short python script and an audio file that causes the problem on my build. If you have any ideas what's causing this or how to fix it please let me know!

As a final note, some audio samples with negative values work well with this. It's very odd, and maybe just an issue with WORLD itself.

example.zip

@JeremyCCHsu
Owner

Hi @Kal213, thank you for reporting this issue. I tested it out and found it really strange.

It seems that casting the np.float32 data returned by librosa.load to double precision with numpy's .astype(np.float64) method causes the issue.

For now, I can only suggest that you load the wav file using soundfile.read which returns np.float64 by default.
With that change, world.wav2world seems to work fine with the example you attached and does not result in NaN APs.
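
A minimal sketch of this workaround, assuming the `soundfile` and `pyworld` packages are installed. `load_for_world` and `has_nan` are hypothetical helper names, and the mono down-mix is an assumption, not something tested in this thread:

```python
import numpy as np


def load_for_world(path):
    """Read a wav file as float64 samples, without going through float32."""
    # Imported lazily so that has_nan() below can be used without soundfile.
    import soundfile as sf

    x, fs = sf.read(path)  # soundfile.read returns float64 by default
    if x.ndim > 1:
        # Assumption: average the channels to get a mono signal.
        x = x.mean(axis=1)
    return np.ascontiguousarray(x, dtype=np.float64), fs


def has_nan(*features):
    """Return True if any of the given arrays contains a NaN."""
    return any(np.isnan(np.asarray(f)).any() for f in features)


# Usage (not run here; needs pyworld and the attached example file):
#   import pyworld
#   x, fs = load_for_world("example.wav")
#   f0, sp, ap = pyworld.wav2world(x, fs)
#   print(has_nan(ap))
```

Checking the aperiodicity with `has_nan` before synthesis makes it easier to tell which input files are affected.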

If you have other solutions, or find this workaround not working, please share your findings with us. Thanks.

@jerry-cj-chang

I guess it's because librosa.load reads the file and interpolates the samples to the specified sample rate.
Some of these interpolated points cannot be represented exactly as int16, which I guess is the cause of this NaN issue.
You can fix this either by casting the float32 data to int16 and then to double, or by setting the sampling rate argument (sr) in librosa.load to None so that librosa won't do any resampling.
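
A rough sketch of the first option, using only numpy. `requantize_to_float64` is a hypothetical name, and the scaling by 32767 with clipping is an assumption, since the exact conversion is not specified here:

```python
import numpy as np


def requantize_to_float64(x):
    """Snap float samples in [-1, 1] to the int16 grid, then return float64."""
    x = np.asarray(x)
    # Assumption: scale by 32767 and clip so out-of-range samples saturate.
    pcm = np.clip(np.round(x * 32767.0), -32768, 32767).astype(np.int16)
    # Convert back to double precision in roughly the original [-1, 1] range.
    return pcm.astype(np.float64) / 32767.0


# Usage (not run here; needs librosa and pyworld):
#   x32, fs = librosa.load("example.wav", sr=16000)  # resampled float32
#   x = requantize_to_float64(x32)
#   f0, sp, ap = pyworld.wav2world(x, fs)
#
# The second option avoids resampling altogether:
#   x32, fs = librosa.load("example.wav", sr=None)
```

The output is float64 and every sample is an exact multiple of 1/32767, which is what the int16 round trip is meant to guarantee.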

@tshmak

tshmak commented Jan 3, 2023

I also just discovered this problem, and yes, I found the solution to be the same as jerry-cj-chang's: the float data has to be recast to 16-bit and then to 64-bit before passing it to WORLD. Hope there's a proper fix soon.
