Web UI manual ad annotation functionality #12

Open
frrad opened this issue Oct 6, 2024 · 3 comments

@frrad
Collaborator

frrad commented Oct 6, 2024

Add the ability to use a web UI to manually annotate ads.

Probably follows #11

This may end up being the same PR as #9

Basically:

  • move transcripts into the DB, in a table which is effectively Segment
  • give that table a column for 👍 / 👎
  • add some endpoints and frontend for changing the classification of the segment
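
A minimal sketch of the table and the update behind such an endpoint, using Python's built-in sqlite3. The table name, column names, and set_user_label helper are all hypothetical, and it assumes 👍 / 👎 records whether a reviewer considers the segment an ad; the project's real schema may differ.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the project.
SCHEMA = """
CREATE TABLE IF NOT EXISTS segment (
    id INTEGER PRIMARY KEY,
    episode_id TEXT NOT NULL,
    start_s REAL NOT NULL,
    end_s REAL NOT NULL,
    text TEXT NOT NULL,
    -- NULL = not reviewed, 1 = marked as ad, 0 = marked as not an ad
    user_label INTEGER CHECK (user_label IN (0, 1))
);
"""


def set_user_label(conn, segment_id, is_ad):
    """Store a manual classification; return True if a row was updated."""
    value = None if is_ad is None else int(is_ad)
    cur = conn.execute(
        "UPDATE segment SET user_label = ? WHERE id = ?",
        (value, segment_id),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO segment (episode_id, start_s, end_s, text) VALUES (?, ?, ?, ?)",
    ("ep1", 0.0, 30.0, "This episode is sponsored by Shopify."),
)
set_user_label(conn, 1, True)
```

An endpoint handler would call something like set_user_label with the segment id and the reviewer's choice; passing None clears the label again.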
@dfjones89

I wonder if another (potentially simpler) approach is to allow users to provide a list of known advertisers, in the hope of giving the LLM a greater chance of identifying ad segments. I've tweaked my system_prompt.txt to add the below text. It's a bit early to see if ad detection has been improved, but I'll report back here with my findings once I've listened to some newly processed episodes 👍

Known advertisers are listed below, though this list is not exhaustive and you should expect to encounter adverts from other companies.
If a known advertiser is mentioned in a section of transcript, you can be more confident in classifying that section as an advertisement.

Known advertisers:
 - Better Health
 - Shopify
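
For anyone who wants to generate that block rather than edit it by hand, here is a rough sketch that appends it to an existing prompt string. The function name with_known_advertisers is made up for illustration, and how the project actually loads system_prompt.txt is not assumed here.

```python
from typing import List, Optional

# Wording copied from the prompt addition above.
PREAMBLE = (
    "Known advertisers are listed below, though this list is not exhaustive "
    "and you should expect to encounter adverts from other companies.\n"
    "If a known advertiser is mentioned in a section of transcript, you can be "
    "more confident in classifying that section as an advertisement.\n"
    "\n"
    "Known advertisers:\n"
)


def with_known_advertisers(base_prompt: str, advertisers: Optional[List[str]]) -> str:
    """Return base_prompt with the known-advertiser block appended (hypothetical helper)."""
    if not advertisers:
        # No list supplied: leave the prompt untouched.
        return base_prompt
    bullets = "\n".join(f" - {name}" for name in advertisers)
    return f"{base_prompt.rstrip()}\n\n{PREAMBLE}{bullets}"


prompt = with_known_advertisers("BASE PROMPT", ["Better Health", "Shopify"])
```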

@frrad
Collaborator Author

frrad commented Nov 7, 2024

That definitely seems like it would help. If it does, maybe we could consider adding first-class support for a known_advertisers: Optional[List[str]] setting per podcast 🤔
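
One possible shape for that setting, as a sketch only. PodcastConfig, podcast_from_dict, title, and feed_url are hypothetical names; only known_advertisers and its Optional[List[str]] type come from this thread.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class PodcastConfig:
    # `title` and `feed_url` are placeholder fields for illustration.
    title: str
    feed_url: str
    # None means no advertiser hints are configured for this podcast.
    known_advertisers: Optional[List[str]] = None


def podcast_from_dict(raw: Dict[str, Any]) -> PodcastConfig:
    """Build a config from a parsed settings mapping, dropping blank entries."""
    advertisers = raw.get("known_advertisers")
    if advertisers is not None:
        advertisers = [str(a).strip() for a in advertisers if str(a).strip()]
    return PodcastConfig(
        title=raw["title"],
        feed_url=raw["feed_url"],
        known_advertisers=advertisers,
    )


example = podcast_from_dict(
    {
        "title": "Example Show",
        "feed_url": "https://example.com/feed.xml",
        "known_advertisers": ["Better Health", " Shopify "],
    }
)
```

Keeping the field optional means existing configs without the key keep working unchanged.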

I still think the web UI transcript thing is going to be helpful though. Even if the ad list is perfectly effective, some flow where you

  1. read transcript
  2. add to advertiser list
  3. re-run detection

is going to be much nicer in some form of UI
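
The three steps could look roughly like this in code. detect_ads here is a stand-in that only does substring matching so the example runs on its own; the real detection goes through the LLM, and "Acme Mattresses" is an invented advertiser.

```python
from typing import List


def detect_ads(segments: List[str], advertisers: List[str]) -> List[bool]:
    """Stand-in for the LLM call: flag segments that name a known advertiser."""
    lowered = [a.lower() for a in advertisers]
    return [any(a in seg.lower() for a in lowered) for seg in segments]


segments = [
    "Welcome back to the show.",
    "Start your free trial at Shopify today.",
    "This episode is sponsored by Acme Mattresses.",
]
advertisers = ["Shopify"]

# 1. read transcript: the first pass misses the Acme Mattresses segment
first_pass = detect_ads(segments, advertisers)

# 2. add to advertiser list
advertisers.append("Acme Mattresses")

# 3. re-run detection with the updated list
second_pass = detect_ads(segments, advertisers)
```

In a UI, step 2 would be a form or button next to the transcript, and step 3 a re-process action for the episode.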

@dfjones89

As promised, just a quick follow-up: Updating my prompt to include a list of known advertisers has improved the detection of adverts that were previously being overlooked 🙌 Being able to update this list through a web interface would be a lovely feature. Thanks again for all your work! 👏
