
Delete processed events #94

Open · wants to merge 12 commits into main
Conversation

dcadenas (Contributor)

This is for https://github.com/verse-pbc/issues/issues/142

This PR adds event cleanup to prevent indefinite storage in the event service. Events are now deleted after successful processing, and a rotating Bloom filter is used to detect duplicates. Unused event retrieval endpoints and handlers have been removed, and metrics for Bloom filter usage and saturation have been added.
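
As a rough illustration of the rotating Bloom filter approach described above, here is a minimal Rust sketch. The two-filter rotation scheme, the `RotatingBloom` and `check_and_insert` names, and all sizes are assumptions made for illustration, not the code actually shipped in this PR.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

/// Fixed-size Bloom filter using Kirsch-Mitzenmacher double hashing
/// on top of std's DefaultHasher.
struct BloomFilter {
    bits: Vec<u64>,
    num_bits: usize,
    num_hashes: u32,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u32) -> Self {
        Self { bits: vec![0; (num_bits + 63) / 64], num_bits, num_hashes }
    }

    fn bit_indexes(&self, item: &impl Hash) -> Vec<usize> {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let h1 = hasher.finish();
        let mut hasher = DefaultHasher::new();
        h1.hash(&mut hasher);
        let h2 = hasher.finish();
        // k bit positions derived as h1 + i * h2 (mod m)
        (0..self.num_hashes as u64)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits as u64) as usize)
            .collect()
    }

    fn insert(&mut self, item: &impl Hash) {
        for idx in self.bit_indexes(item) {
            self.bits[idx / 64] |= 1u64 << (idx % 64);
        }
    }

    fn contains(&self, item: &impl Hash) -> bool {
        self.bit_indexes(item)
            .into_iter()
            .all(|idx| (self.bits[idx / 64] & (1u64 << (idx % 64))) != 0)
    }
}

/// Two filters rotated on a timer: inserts go to the current filter,
/// lookups consult both, and the oldest filter is dropped on rotation
/// so memory stays bounded even though processed events are deleted.
struct RotatingBloom {
    current: BloomFilter,
    previous: BloomFilter,
    rotated_at: Instant,
    rotation_period: Duration,
    num_bits: usize,
    num_hashes: u32,
}

impl RotatingBloom {
    fn new(num_bits: usize, num_hashes: u32, rotation_period: Duration) -> Self {
        Self {
            current: BloomFilter::new(num_bits, num_hashes),
            previous: BloomFilter::new(num_bits, num_hashes),
            rotated_at: Instant::now(),
            rotation_period,
            num_bits,
            num_hashes,
        }
    }

    /// Returns true if the event id was (probably) seen before;
    /// otherwise records it and returns false.
    fn check_and_insert(&mut self, event_id: &str) -> bool {
        if self.rotated_at.elapsed() >= self.rotation_period {
            self.previous = std::mem::replace(
                &mut self.current,
                BloomFilter::new(self.num_bits, self.num_hashes),
            );
            self.rotated_at = Instant::now();
        }
        let seen = self.current.contains(&event_id) || self.previous.contains(&event_id);
        if !seen {
            self.current.insert(&event_id);
        }
        seen
    }
}
```
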

dcadenas requested a review from mplorentz · January 31, 2025, 18:05
mplorentz (Member) left a comment

Bloom filters, cool! But after doing some research I'm confused about their suitability here. Bloom filters are susceptible to false positives, and since we are using them to skip processing of events, any false positive will result in an event that should have been processed not being processed. So with an error rate of 0.1%, we will skip one in every thousand events we should have processed? What am I missing here?

dcadenas (Contributor, Author) commented Feb 5, 2025

The error rate is actually lower; it's currently set to 0.01%, and we could tweak it further if needed (e.g., down to 0.001%). The idea is that even if we miss a few events due to false positives, we'll catch up on the next update for those events. Since we're dealing with replaceable events, missing one update isn't the end of the world; we'll process it the next time it comes through. The chance of missing the same event twice is extremely low (0.01% × 0.01% = 0.000001%, about one in 100 million, and even lower if we adjust the rate).
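
For context on how a 0.01% target translates into memory and hash count, the standard Bloom filter sizing formulas give roughly the following. The numbers below are purely illustrative and are not taken from this PR's actual configuration.

```rust
/// Standard Bloom filter sizing: for n expected items and target
/// false-positive rate p, use m = -n * ln(p) / (ln 2)^2 bits and
/// k = (m / n) * ln 2 hash functions.
fn bloom_params(n: f64, p: f64) -> (u64, u32) {
    let ln2 = std::f64::consts::LN_2;
    let m = (-n * p.ln() / (ln2 * ln2)).ceil();
    let k = ((m / n) * ln2).round().max(1.0);
    (m as u64, k as u32)
}

fn main() {
    // Illustrative: 1 million events at a 0.01% (1e-4) target rate
    // needs roughly 19.2 million bits (~2.3 MiB) and 13 hash functions.
    let (bits, hashes) = bloom_params(1_000_000.0, 1e-4);
    println!("{bits} bits, {hashes} hash functions");

    // Missing the same replaceable event on two consecutive updates
    // requires two independent false positives: 1e-4 * 1e-4 = 1e-8.
    println!("p(two misses) = {:e}", 1e-4_f64 * 1e-4);
}
```
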

We’ve already been okay with some level of event loss (like with events that are too large), so this feels like a natural extension of that trade-off. The occasional miss from the Bloom filter is a small price to pay for the benefits: it simplifies the system, removes the need for a dedicated table for duplicate checks, and keeps the server healthier. In this case, I think prioritizing reliability and scalability over 100% precision makes sense.

That said, if these occasional misses feel like a no-go, we can always switch back to using a table for duplicate checks. Let me know what you think!
