Collab? #7

Closed
frrad opened this issue Sep 28, 2024 · 4 comments · Fixed by #8

Comments

@frrad
Collaborator

frrad commented Sep 28, 2024

Hi!

I've been doing some work on my fork of your project under the assumption that you weren't around any more. I would like to move to working in your repo if you are going to be around to look at PRs and are interested.

So far I've mainly done some cleanup / prep work to add types and CI:

https://github.com/frrad/podly_pure_podcasts/pull/2/files
https://github.com/frrad/podly_pure_podcasts/pull/3/files
https://github.com/frrad/podly_pure_podcasts/pull/4/files

The exception is https://github.com/frrad/podly_pure_podcasts/pull/3/files, where I remove local whisper support. I'm not tied to that decision and am happy to revisit it if you're interested in my contributions.

My slightly fuzzy future plans are (in no particular order):

  • transcribe audio chunks concurrently to speed up overall transcription time
  • add a minimal web UI
  • move state into sqlite instead of the current pkl arrangement
  • add an episode whitelist (controlled by ^) that determines which episodes are eligible for download, to keep costs down
  • add the ability to manually annotate ads using the UI
  • do some prompt engineering to see to what degree I can improve ad detection for my podcasts / hopefully generally
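
For the first item, a minimal sketch of concurrent chunk transcription might look like the following. It assumes the transcription call is remote and I/O-bound, so a thread pool is enough. `transcribe_concurrently` and `fake_transcribe` are hypothetical names for illustration, not functions from this repo.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def transcribe_concurrently(
    chunks: List[bytes],
    transcribe_chunk: Callable[[bytes], str],
    max_workers: int = 4,
) -> List[str]:
    """Transcribe chunks in parallel and return the text in original chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even when chunks
        # finish out of order, so the transcript stays aligned with the audio.
        return list(pool.map(transcribe_chunk, chunks))


def fake_transcribe(chunk: bytes) -> str:
    # Hypothetical stand-in for a remote whisper request.
    return chunk.decode("utf-8").upper()


transcripts = transcribe_concurrently(
    [b"intro", b"ad read", b"outro"], fake_transcribe
)
```

`max_workers` would probably need tuning against whatever rate limit the remote API enforces.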

This may all progress very slowly and/or stop at some point if I lose interest or get busy with other things but this is what I'm thinking now. Are you interested in some or all of this?

@jdrbc
Owner

jdrbc commented Sep 30, 2024

Those changes look good! I like the roadmap too. Are you finding the detection inaccurate? I recently moved some of the prompt to the system prompt and that improved detection. I was thinking of exploring bag-of-words to improve detection further next.

I am using whisper locally just because remote whisper was the majority of the running cost. With gpt4o I'm finding the cost to be about 10 cents an hour.

@jdrbc
Owner

jdrbc commented Sep 30, 2024

I've added you as a collaborator. I haven't used that GitHub feature much, let me know if you don't have sufficient access!

@frrad
Collaborator Author

frrad commented Oct 1, 2024

Awesome! I will make one more PR / commit to my fork/main branch to restore local-whisper and then open one big PR into this repo from fork/main. It will probably be this weekend, but maybe later this week if I get time.

Detection is pretty good, though not perfect. A couple of random thoughts I had on the modeling front:

  • I think having a manual annotation escape hatch might be enough for me - the errors I've noticed have mostly been of the form "find most of the ad but needs some tweak" and I'm okay spending a few seconds fixing them manually
  • pure speculation but some kind of per-feed prompt seems like it could be fruitful. often a particular podcast has pretty consistent ads from episode to episode and it seems like just describing them in the prompt might help
  • you probably experimented with this but it seems like the detection is probably sensitive to the window size and "stride". if the part of the conversation the model sees begins mid-ad it seems like it would be harder to recognize. It might be interesting to play with this, even potentially sending the same segment more than once as part of different windows
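
To make the window / stride idea concrete, here is a rough sketch of splitting transcript segments into overlapping windows, so that a segment near a window boundary is also seen in the middle of another window. `sliding_windows` and its parameters are hypothetical, not the current implementation in this repo.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(
    segments: Sequence[T], window_size: int, stride: int
) -> List[List[T]]:
    """Split segments into windows of up to window_size, advancing by stride.

    With stride < window_size, consecutive windows overlap, so the same
    segment is sent to the model more than once with different context.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows: List[List[T]] = []
    start = 0
    while start < len(segments):
        windows.append(list(segments[start:start + window_size]))
        if start + window_size >= len(segments):
            break  # this window already reaches the end of the transcript
        start += stride
    return windows


windows = sliding_windows(["s0", "s1", "s2", "s3", "s4"], window_size=3, stride=2)
```

Here `s2` appears in both windows. The per-window predictions for a repeated segment would still need to be combined somehow (for example, flagging it as an ad if any window does), which this sketch leaves open.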

@frrad
Collaborator Author

frrad commented Oct 6, 2024

#8 brings in everything I had on the fork.

I parted out the "roadmap" part of this issue into separate issues #12 #11 #10 #9. I plan to close this issue once #8 is merged.

@jdrbc jdrbc closed this as completed in #8 Oct 9, 2024