Incorrect segregation of voiced and unvoiced segments #31

Open
tumul-80 opened this issue Jan 18, 2022 · 0 comments

tumul-80 commented Jan 18, 2022

Hello,

I would like to extract the voiced segments from an audio file (.wav format) and plot them against the time series of the original audio. I modified your code a bit and ran it on a simple recording that contains only my voice. However, the code detects the voiced segments incorrectly: most of the actual human speech is classified as "Unvoiced segments".

What should I do?

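import audiosegment
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
import webrtcvad as wb

filename = 'try_voice.wav'

# Plot the original recording (librosa resamples to 22050 Hz by default).
# Note: waveplot was removed in librosa 0.10; newer versions use waveshow.
audio_data, sampling_rate = librosa.load(filename)
plt.figure(figsize=(14, 5))
librosa.display.waveplot(audio_data, sr=sampling_rate)

vad = wb.Vad()  # created here but not used anywhere below
audio = audiosegment.from_file(filename)

# Convert to 32 kHz, 16-bit, mono before running voice detection.
seg = audio.resample(sample_rate_Hz=32000, sample_width=2, channels=1)
results = seg.detect_voice()
voiced = [tup[1] for tup in results if tup[0] == 'v']
unvoiced = [tup[1] for tup in results if tup[0] == 'u']

# Join all voiced chunks into one segment and save it to disk.
voiced_segment = voiced[0].reduce(voiced[1:])
voiced_segment.export("voiced.wav", format="WAV")

# Reload the voiced audio under a new name so the list above is not overwritten.
voiced_data, sampling_rate_v = librosa.load('voiced.wav')

duration = len(voiced_data) / sampling_rate_v
time = np.arange(0, duration, 1 / sampling_rate_v)  # time vector
plt.figure()
librosa.display.waveplot(voiced_data, sr=sampling_rate_v)
plt.show()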
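
One possible way to line the detected segments up with the original recording, rather than plotting the concatenated voiced audio on its own, is to accumulate the chunk durations in order. The sketch below is only an illustration under assumptions: it takes an ordered list of (label, duration in seconds) pairs, which would have to be built from `results` (for example from each chunk's `duration_seconds`, assuming the chunk objects expose it), and the helper name `voiced_intervals` is hypothetical, not part of any library.

```python
from typing import List, Tuple


def voiced_intervals(labeled_durations: List[Tuple[str, float]]) -> List[Tuple[float, float]]:
    """Map ordered (label, duration_seconds) pairs to (start, end) times of voiced spans.

    Hypothetical helper: 'v' marks a voiced chunk, anything else is treated as unvoiced.
    """
    intervals: List[Tuple[float, float]] = []
    t = 0.0  # running position on the original timeline, in seconds
    for label, duration in labeled_durations:
        start, end = t, t + duration
        if label == 'v':
            if intervals and intervals[-1][1] == start:
                # Two voiced chunks in a row: extend the previous span.
                intervals[-1] = (intervals[-1][0], end)
            else:
                intervals.append((start, end))
        t = end
    return intervals


# Made-up labels and durations standing in for real detection output.
example = [('u', 0.5), ('v', 1.0), ('v', 0.5), ('u', 0.25), ('v', 0.75)]
spans = voiced_intervals(example)
print(spans)
```

Each resulting (start, end) pair could then be shaded on top of the original waveform plot, for instance with `plt.axvspan(start, end, alpha=0.3)`, which makes it easier to see which parts of the speech were labelled unvoiced.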