audio file length #33

fabianbosshard · 2024-04-23T18:41:09Z

Hi Yuan

We have another question: what was the length of the audio files you used? The paper says they are 10 seconds long, but a 10-second clip yields a spectrogram of only 998 frames from torchaudio.compliance.kaldi.fbank (with frame_shift set to 10 ms and frame_length set to 25 ms), so the dataloader zero-pads the remaining 26 frames when target_length is set to 1024.

Best Regards,
Fabian
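
The 998-frame figure in the question can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a 16 kHz sample rate (not stated in the thread) and Kaldi-style framing without edge padding, where only windows that fit entirely inside the signal are counted. The helper name `num_fbank_frames` is hypothetical.

```python
def num_fbank_frames(duration_s, sample_rate=16000,
                     frame_length_ms=25.0, frame_shift_ms=10.0):
    """Count frames when only full windows are kept (no edge padding).

    sample_rate=16000 is an assumption; the thread does not state it.
    """
    num_samples = int(round(duration_s * sample_rate))
    window = int(sample_rate * frame_length_ms / 1000)  # samples per frame
    shift = int(sample_rate * frame_shift_ms / 1000)    # samples per hop
    if num_samples < window:
        return 0
    return 1 + (num_samples - window) // shift


frames = num_fbank_frames(10.0)
padding = 1024 - frames  # frames the dataloader would have to zero-pad
print(frames, padding)
```

Under these assumptions a 10-second clip gives 998 frames, leaving 26 frames of padding for a target_length of 1024, which agrees with the numbers reported above.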

YuanGongND · 2024-04-23T19:15:44Z

They are 10 seconds long (AudioSet), so a small amount of padding is expected. 1024 is simply a power of 2, which makes it easier to split into 16*16 patches (assuming no overlap). A small amount of padding won't affect performance.

-Yuan
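
To illustrate the point about patch splitting, here is a rough sketch that looks only at the time axis. It assumes non-overlapping patches that are 16 frames wide, as in the reply; the helper name `split_time_axis` is hypothetical and not taken from the codebase.

```python
def split_time_axis(num_frames, patch_frames=16):
    """Return (full_patches, leftover_frames) for non-overlapping patches."""
    return divmod(num_frames, patch_frames)


for n in (1024, 998):
    full, leftover = split_time_axis(n)
    print(f"{n} frames -> {full} full patches, {leftover} frames left over")
```

With 1024 frames the time axis divides evenly into 64 patches, whereas 998 frames leave 6 frames that do not fill a patch.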

fabianbosshard · 2024-04-23T19:41:23Z

Okay, thanks for your quick reply.

Since we use the frame-based model (we want to fine-tune it for speaker verification), I think we will set target_length to 998. We may also use 390 masked patches instead of 400, to keep the ratio of masked to total number of frames close to the original setup in your paper.

Best Regards,
Fabian
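
One way the figure of 390 could be derived is by scaling the mask count in proportion to the number of patches. The sketch below is an assumption-laden illustration: it supposes that each frame-based patch spans 2 time frames, which the thread does not state, and the helper name `scaled_mask_count` is hypothetical.

```python
def scaled_mask_count(orig_masked, orig_frames, new_frames, frames_per_patch=2):
    """Scale the number of masked patches to keep the masking ratio constant.

    frames_per_patch=2 is an assumption about the frame-based model.
    """
    orig_patches = orig_frames // frames_per_patch
    new_patches = new_frames // frames_per_patch
    ratio = orig_masked / orig_patches
    return round(ratio * new_patches), ratio


masked, ratio = scaled_mask_count(orig_masked=400, orig_frames=1024, new_frames=998)
print(masked, ratio)
```

Under that assumption, 1024 frames give 512 patches and a masking ratio of 400/512, and 998 frames give 499 patches, so the scaled count rounds to 390.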

YuanGongND · 2024-04-23T20:33:35Z

Yes, that sounds reasonable. But again, I would expect this to make only a minor difference.

There might be some hard-coded 1024 values in this codebase that you need to change.

indraneelrp · 2024-07-16T08:40:47Z

What happens when we input a longer audio file (e.g., 1 minute) for inference? It did give an output. Did it analyse the whole clip, or only a 10-second portion of it?

YuanGongND added the "question" label (Further information is requested) · Apr 23, 2024