audio-offset-finder not working but no error messages -- advice? #54

Closed
mjbaldwin opened this issue Jun 13, 2024 · 19 comments

@mjbaldwin

mjbaldwin commented Jun 13, 2024

I'm running audio-offset-finder but no matter what I do I get the result:

Offset: 0.0 (seconds)
Standard score: inf

I have ffmpeg installed (use it all the time), I'm running on macOS Sonoma (up to date), with a python3 installation managed by homebrew, and I installed audio-offset-finder using pip3 install audio-offset-finder --user --break-system-packages. My Python version is 3.12.3.

I'm running the command audio-offset-finder --find-offset-of jingle.wav --within episode.wav

If I use --show-plot it shows a red vertical line at 0 and nothing else.

Both of the .wav files are 44.1 kHz 16-bit mono, and play correctly in any application.

The correct result should be that jingle.wav is a couple of minutes in episode.wav, it's not at the zero point.

I'm baffled because while it's not locating the match, it's not generating any kind of error either, but the result is clearly a kind of error. And it's clearly doing something, because it takes several seconds to run. (jingle.wav is a few seconds long; episode.wav is about an hour long.)

Is there any advice on how I could debug this?

@mjbaldwin
Author

In case it's part of the problem, the reason my install command ends with --user --break-system-packages is described here:

https://stackoverflow.com/questions/75608323/how-do-i-solve-error-externally-managed-environment-every-time-i-use-pip-3

I don't really use Python much and don't necessarily understand if what I've done is the "proper" way to install audio-offset-finder -- but I just want it to be available as a system-wide command-line tool. I don't want to deal with creating virtual environments or anything. And audio-offset-finder isn't available in homebrew itself, the way apparently some other common Python packages are.

@stephenjolly
Contributor

I guess my first suggestion would be to run the unit tests and see if they throw up any issues. You'll want to change into the root directory of the checked out source code and run pytest. You may need to install pytest via pip if you haven't done so already.

Another possibility, which you've probably already considered, but which I'll mention just because it would explain everything that you're seeing, would be that due to some glitch or oversight "jingle.wav" somehow contains exactly the same audio data as "episode.wav", or they start with exactly the same audio (ie the offset is genuinely zero).
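One quick way to rule that possibility out is to compare the decoded sample data of the two files directly. The following is a minimal sketch using only Python's standard library; the helper names `describe_wav` and `starts_with_same_audio` are hypothetical and not part of audio-offset-finder, and it assumes both inputs are uncompressed PCM .wav files.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate, n_frames) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels(),
            wav.getsampwidth(),
            wav.getframerate(),
            wav.getnframes(),
        )


def starts_with_same_audio(needle_path, haystack_path):
    """True if the haystack's PCM data begins with exactly the needle's PCM data."""
    with wave.open(needle_path, "rb") as needle, wave.open(haystack_path, "rb") as haystack:
        # Compare channel count, sample width and frame rate first.
        if needle.getparams()[:3] != haystack.getparams()[:3]:
            return False
        needle_frames = needle.readframes(needle.getnframes())
        # Read only as many frames from the long file as the short one contains.
        return haystack.readframes(needle.getnframes()) == needle_frames
```

If `starts_with_same_audio("jingle.wav", "episode.wav")` returns `False`, a genuine zero offset caused by identical leading audio can be ruled out. A byte-for-byte comparison only catches exact copies, so it would not detect audio that merely sounds the same.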

@stephenjolly stephenjolly self-assigned this Jun 13, 2024
@mjbaldwin
Author

mjbaldwin commented Jun 13, 2024

Thanks, Stephen. I'm going to leave two separate comments here.

The first is just regarding how to properly install audio-offset-finder on macOS. I uninstalled it from the "incorrect" way I'd installed it:

pip3 uninstall audio-offset-finder --break-system-packages

And then figured out how to install audio-offset-finder "correctly" per https://docs.brew.sh/Homebrew-and-Python using pipx rather than pip, which turns out to be homebrew's recommended way of installing command-line Python packages:

brew install pipx
pipx install audio-offset-finder

So this is just a note that, perhaps in the project's README, you might want to note that Mac users who use homebrew and want to use it as a system-wide command-line tool will want to install it using pipx rather than pip.

This didn't fix the problem at all, but I just wanted to eliminate the install method as a potential problem.

@mjbaldwin
Author

mjbaldwin commented Jun 13, 2024

So then I tried running audio-offset-finder's pytest in a virtual environment and it worked perfectly fine. I tried running audio-offset-finder directly from the command line on the included test files (locating timbl_2.mp3 within timbl_1.mp3) and it also worked perfectly fine:

Offset: 12.256 (seconds)
Standard score: 29.416660811386247

But it continues to fail completely on my particular files. And when I display the plot, it seems to only be analyzing a single millisecond? See:

image

So I'm wondering if you could test my files out on your own machine to perhaps figure out what's going on?

The "within" file is an hour-long podcast episode (stereo mp3 format) which you can download at:

https://chtbl.com/track/9E9755/arttrk.com/p/ST44R/pdrl.fm/f3efd0/stitcher.simplecastaudio.com/c945bd13-c7f3-4f22-95b5-4bf98e12b21f/episodes/b9d0a005-a4b4-4918-af4a-56dc8050afe0/audio/128/default.mp3?aid=rss_feed&awCollectionId=c945bd13-c7f3-4f22-95b5-4bf98e12b21f&awEpisodeId=b9d0a005-a4b4-4918-af4a-56dc8050afe0&feed=dHoohVNH

And then the audio I'm searching for within it is attached as "jingle.wav", which you'll have to extract since GitHub doesn't support uploading .wav files directly:
jingle.wav.zip

And so then the command I'm running is:

% audio-offset-finder --find-offset-of jingle.wav --within episode.mp3           
Offset: 0.0 (seconds)
Standard score: inf

The jingle.wav file should be located by audio-offset-finder at approximately the 8:01 (8m1s) mark of the episode, which you can verify with your own ears. And I extracted jingle.wav from a different episode, so it should be virtually identical content except for mp3 compression artifacts.

I'm hoping that you encounter the same bug, and can hopefully figure out what the culprit is? These are just bog-standard podcast audio files on a pretty standard Mac setup, so I have to assume that if I'm having this problem other people might be too.

(And these are stereo files rather than the mono files I originally described, but it doesn't seem like that distinction is relevant since you're using ffmpeg to downsample everything anyways.)

Thanks!

@chrisn
Member

chrisn commented Jun 13, 2024

I can confirm I get the same result here:

Offset: 0.0 (seconds)
Standard score: inf

The audio seems to be at about 7m6s though?

@mjbaldwin
Author

Glad there's confirmation.

The podcast server might be doing dynamic ad insertion at the start of the file, serving different users different ads within (or none at all). So that might be why @chrisn is finding it at a different exact location. Unfortunately GitHub has a 25 MB attachment size limit, and the file is around 70 MB.

But the location of the jingle should be generally in that range -- it divides the intro segment from the main interview segment. I can upload a truncated version if desired though.

@hsauod

hsauod commented Jun 18, 2024

I hit the same issue.
The demo files (timbl_2.mp3 / timbl_1.mp3) worked perfectly fine. However, I got
Offset: 0.0 (seconds)
Standard score: inf
on my own WAVs (48k, 122.25 kbit/s, around 1 min).

@hsauod

hsauod commented Jun 18, 2024

I tried changing the lengths of the two files, and then it worked.
I think I found the trick: the difference between the two lengths should not be too big.

@stephenjolly
Contributor

OK, it looks like there's a longstanding bug in the core logic that can cause the offset finding process to fail when the two input files are very different in length. A proposed fix is being developed in #55.
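To make the failing scenario concrete (a few seconds of audio searched for within a much longer recording), here is a minimal sketch of offset finding on synthetic data. It uses plain time-domain cross-correlation via `numpy.correlate`, which is a swapped-in simplification and not necessarily what audio-offset-finder does internally. The function name `find_offset`, the sample rate, and the signal lengths are all illustrative assumptions.

```python
import numpy as np


def find_offset(needle, haystack, sample_rate):
    """Locate `needle` inside a longer `haystack` (1-D float arrays, same sample rate).

    Returns (offset_seconds, standard_score).
    """
    # "valid" mode only evaluates lags where the needle fits entirely inside
    # the haystack, giving len(haystack) - len(needle) + 1 correlation values.
    corr = np.correlate(haystack, needle, mode="valid")
    peak = int(np.argmax(corr))
    # Standard (z) score of the peak: how many standard deviations it sits
    # above the mean correlation. If `corr` were constant, the standard
    # deviation would be 0 and the score would not be finite.
    score = (corr[peak] - corr.mean()) / corr.std()
    return peak / sample_rate, score


rng = np.random.default_rng(0)
sample_rate = 1000  # hypothetical low rate, chosen only to keep the demo fast
haystack = rng.standard_normal(20 * sample_rate)  # 20 s of noise
start = 7 * sample_rate
needle = haystack[start:start + 2 * sample_rate].copy()  # 2 s clip taken from 7 s in

offset, score = find_offset(needle, haystack, sample_rate)
print(f"Offset: {offset} (seconds)")
print(f"Standard score: {score}")
```

In this sketch the two inputs differ in length by a factor of ten and the clip is still found at its true position, which is the behaviour one would expect from the tool as well.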

@stephenjolly
Contributor

The requested change to the README is addressed in #56.

@stephenjolly
Contributor

stephenjolly commented Jun 18, 2024

These PRs have now been merged, and the tool now appears to work correctly with @mjbaldwin's test data. Please test it and let us know if there are any remaining issues.

@mjbaldwin
Author

Thank you @stephenjolly !

I'd like to test out the changes but I don't know how to execute this as a command-line tool simply from downloading the repository.

I did try manually replacing the files audio_offset_finder.py and cli.py in my pipx installation with the new ones, and I'm getting a result now but it's not quite the right one -- I see you now depend on a newer version of numpy and I hope that's not causing issues.

But when I run my command I get:

% audio-offset-finder --find-offset-of jingle.wav --within episode.mp3 --show-plot      
Offset: -415.456 (seconds)
Standard score: 14.640932117984137

Which is at least a value, but the correct offset should be around 481, not 415 and not a negative value either if I understand the meaning of that correctly.

Interestingly enough, the plot shows the correct peak around 481, and the red line is over to the side where there is no signal:

image

Does this still need some further logic fixes? Or have I not updated the tool correctly?

If you're able to give me instructions or point me to a link for how to install audio-offset-finder from the GitHub .zip file into a virtual environment together with its exact dependencies as a command line tool, then I can do that -- but I really don't quite know how.

@mjbaldwin
Author

Oh wait, now this is quite surprising --

I was curious why the analyzed audio was showing up in the plot only as 900 seconds, instead of the 70+ minutes it should be. I saw that you have a --trim argument that defaults to 900. For my purposes I don't want to trim anything. I tried 0 wondering if that would disable it, but that doesn't work. So then I tried an arbitrarily large number much longer than my files, and now audio-offset-finder locates the correct value:

% audio-offset-finder --find-offset-of jingle.wav --within episode.mp3 --show-plot --trim 44000
Offset: 480.784 (seconds)
Standard score: 15.604841031379348

image

So it seems there is still a bug with some lengths but not others?

(Separately, I might suggest that the tool's default behavior not default to any trimming at all, as that seems like unexpected behavior. But now that I've figured out how to work around that, that's not so important of course.)
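As a quick sanity check on the reported value, the offset in seconds can be converted to minutes and seconds and compared with the roughly 8:01 mark mentioned earlier. This is a small illustrative helper (the name `format_offset` is hypothetical), not something the tool provides:

```python
def format_offset(seconds):
    """Format a non-negative offset in seconds as m:ss.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rest_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rest_ms, 1000)
    return f"{minutes}:{secs:02d}.{ms:03d}"


print(format_offset(480.784))  # 8:00.784
```

So 480.784 seconds is just under the 8:01 mark, consistent with where the jingle can be heard in the episode.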

@mjbaldwin
Author

Oh and in case it helps, I already uploaded the jingle.wav above, but here's the exact episode.mp3 I'm using, I uploaded to my own cloud drive to share (zipped): https://drive.google.com/file/d/1WuGrFoKBOy9UbCOlL4-Cxn9xjA08yW4E/view?usp=sharing

Just so you can try to reproduce my output precisely.

@stephenjolly
Contributor

It looks like the previous round of bug fixes missed some code that was affected by the changes. Another round of bug fixes (and some improved tests that would have caught these issues in the first place, and will hopefully prevent regressions in future releases) is in #58.

@stephenjolly
Contributor

Trimming the audio by default makes sense in some use cases as a performance optimisation, and we think that probably the tool was originally written with one of those use cases in mind (e.g. aligning independent recordings of the same live event). It is obviously causing problems here though, and we agree that it should not be the default. Its removal is addressed in #59.

@stephenjolly
Contributor

stephenjolly commented Jun 19, 2024

Both PRs have been approved and merged into the main branch, so you should be able to update your code again and see if it now works as expected. (FWIW I was able to replicate your results again precisely, and the new changes fix the issues for me.)

Because the project is structured as a python module with an associated command-line tool, there's no way to run it directly from a repository download. (It might be possible to modify it to make that possible, but that looks hard, and doesn't feel like a priority at present.) Updating the files manually as you describe is likely to be effective, but you might feel that it lacks elegance. Alternatively, running pip3 uninstall audio-offset-finder and then pip3 install . from the checked out repository root should work since you used pip3 to install the module in the first place - that will remove the PyPI version and install it from the local copy.

@mjbaldwin
Author

mjbaldwin commented Jun 20, 2024

% pipx uninstall audio-offset-finder
uninstalled audio-offset-finder! ✨ 🌟 ✨
% pipx install .
  installed package audio-offset-finder 0.5.4, installed using Python 3.12.3
  These apps are now globally available
    - audio-offset-finder
done! ✨ 🌟 ✨
% audio-offset-finder --find-offset-of jingle.wav --within episode.mp3 
Offset: 480.784 (seconds)
Standard score: 15.604841031379348
% audio-offset-finder --find-offset-of jingle.wav --within episode.mp3 --trim 900
Offset: 480.784 (seconds)
Standard score: 14.640932117984137

Everything works like a charm! Just tried locating several other jingles in a number of other episodes and the results are all accurate and the --show-plot graphics all display intuitively as well, without extra whitespace to the left.
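For checking many jingle/episode pairs like this, the tool's printed output can be parsed from a script. The sketch below assumes the two-line output format shown in this thread and assumes it is written to standard output; the function names are hypothetical, and only the `--find-offset-of` and `--within` flags used above are relied on.

```python
import math
import re
import subprocess

# Matches the two lines printed by the tool, e.g.
#   Offset: 480.784 (seconds)
#   Standard score: 15.604841031379348
_OUTPUT_RE = re.compile(
    r"Offset:\s*(?P<offset>\S+)\s*\(seconds\)\s*Standard score:\s*(?P<score>\S+)"
)


def parse_result(text):
    """Extract (offset_seconds, standard_score) from the tool's printed output."""
    match = _OUTPUT_RE.search(text)
    if match is None:
        raise ValueError("unrecognised output")
    # float() also accepts "inf", so the failure case parses without error.
    return float(match.group("offset")), float(match.group("score"))


def score_is_plausible(score):
    """False for the 'Standard score: inf' failure mode reported in this issue."""
    return math.isfinite(score)


def run_tool(needle_path, haystack_path):
    """Run the CLI and return its parsed result (assumes it prints to stdout)."""
    completed = subprocess.run(
        ["audio-offset-finder", "--find-offset-of", needle_path, "--within", haystack_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_result(completed.stdout)
```

Treating a non-finite score as a failure would have flagged the original `Offset: 0.0` results automatically instead of silently accepting them.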

Thanks so much! Now the tool is exactly what I need, and hopefully it will help others more too.

I was very curious why the --trim option was enabled by default, but now that I understand you built the tool to align different versions of recordings, that totally makes sense. (I do the same thing in Adobe Premiere using the Synchronize command.) I'm happy to see it no longer enabled by default -- I'm matching start/stop jingles in order to extract the segments of recordings between them.
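The segment-extraction use case described here can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 5-second jingle length, and the stop offset are hypothetical, and the ffmpeg command is built as an argument list but not executed.

```python
def segment_between(start_offset, start_jingle_len, stop_offset):
    """Return (begin, end) in seconds of the audio between two jingles.

    The segment begins where the start jingle ends and stops where the
    stop jingle begins.
    """
    begin = start_offset + start_jingle_len
    if stop_offset <= begin:
        raise ValueError("stop jingle must come after the end of the start jingle")
    return begin, stop_offset


def ffmpeg_cut_command(src, dst, begin, end):
    """Build (but do not run) an ffmpeg command that copies src[begin:end] to dst."""
    return [
        "ffmpeg", "-i", src,
        "-ss", f"{begin:.3f}",  # start of the kept segment, in seconds
        "-to", f"{end:.3f}",    # end of the kept segment, in seconds
        "-c", "copy",           # copy the stream without re-encoding
        dst,
    ]


# 480.784 is the offset reported above; the other two numbers are made up.
begin, end = segment_between(480.784, 5.0, 3000.0)
print(ffmpeg_cut_command("episode.mp3", "interview.mp3", begin, end))
```

With stream copy the cut points land on frame boundaries of the compressed audio, so they may be off by a few tens of milliseconds; re-encoding instead would allow sample-accurate cuts.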

Thanks again @stephenjolly ! Looking forward to the official release of 0.5.4.

@stephenjolly
Contributor

Many thanks in turn for your help in identifying the issues so we could fix them, @mjbaldwin and @hsauod. I'm going to close this issue now, and dig out my notes on publishing a new package to PyPI...
