Summaries are getting slower and using more and more memory in long term #88
Comments
How was this assessed? Do you have a reproducer? Is this an issue with how aioprometheus uses the quantile library, or is there a bug in the quantile library itself? Why is the use of one invariant vs. more than one invariant relevant?
I've been experimenting with this. I think there's either a bug or a deliberate difference in the quantile library compared to the quantile libraries used by Prometheus client libraries in other languages. To test this theory, I calculated the [...]. The Go implementation took ~1.5 seconds to run and maintained ~1250 samples. Using the same class of input, the Python implementation maintained 104659 samples and took ~192 seconds. Both libraries claim to use the same algorithm from the paper "Effective Computation of Biased Quantiles over Data Streams". I don't believe this is an issue specific to aioprometheus. However, one thing to note is that other Prometheus client libraries (including Java and Go) implement sliding windows for Summaries. If I understand correctly, having sliding windows in aioprometheus's Summary implementation would put an upper limit on how many samples are retained (the maximum number of observations logged within the window). Perhaps supporting sliding windows should be considered. It looks like someone has tried: https://github.com/RefaceAI/aioprometheus-summary/blob/main/aioprometheus_summary/__init__.py
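To make the sliding-window point concrete, here is a minimal sketch of how a time window bounds the number of retained samples. It is not the approach used by the Go or Java clients, and it is not aioprometheus code: the class name `SlidingWindowSummary`, the `max_age_seconds` parameter, and the injectable `clock` are all hypothetical, and quantiles are computed exactly by sorting rather than with a streaming quantile estimator.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class SlidingWindowSummary:
    """Hypothetical helper: keeps only observations newer than max_age_seconds."""

    def __init__(
        self,
        max_age_seconds: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_age = max_age_seconds
        self._clock = clock
        # (timestamp, value) pairs, oldest first.
        self._samples: Deque[Tuple[float, float]] = deque()

    def _evict(self) -> None:
        # Drop every sample that has fallen out of the window.
        cutoff = self._clock() - self._max_age
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def observe(self, value: float) -> None:
        self._evict()
        self._samples.append((self._clock(), value))

    def quantile(self, q: float) -> float:
        # Exact quantile over the window (sorting, not a streaming estimate).
        self._evict()
        if not self._samples:
            raise ValueError("no observations in the current window")
        ordered = sorted(value for _, value in self._samples)
        index = min(int(q * len(ordered)), len(ordered) - 1)
        return ordered[index]

    def __len__(self) -> int:
        self._evict()
        return len(self._samples)
```

With this shape, memory is bounded by the number of observations made within `max_age_seconds`, regardless of how long the process has been running.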
I've written my own implementation, inspired by the Go implementation. Its performance and memory utilization are much better, and it passes the Go implementation's tests. I hope to release it publicly sometime soon.
Hey Jacob. Did you manage to release the faster implementation? Having a look at the source, it seems that it's still using
Not yet, but I did get approval to do so - I'll try to share as soon as I have a chance.
No problem, thanks for the update.
The underlying quantile package seems to get slower and use more memory when configured with more than one invariant, which is the default in both aioprometheus and quantile. This becomes an issue for long-running services that gather millions of measurements for one summary metric. The response time for Prometheus can increase to over one second.
A current workaround is to use exactly one invariant/quantile (if it's feasible for your use cases) so that this issue is not triggered within the quantile package.
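For readers unfamiliar with the term, the sketch below shows what an "invariant" means in the targeted-quantiles formulation of the paper "Effective Computation of Biased Quantiles over Data Streams": each target is a (quantile, error) pair, and the permitted rank error at a given rank is the minimum over all targets. The function name `allowed_error` and the example targets are illustrative assumptions, not code from the quantile package, and the sketch does not by itself explain the slowdown reported here.

```python
from typing import Sequence, Tuple


def allowed_error(
    rank: float, n: int, targets: Sequence[Tuple[float, float]]
) -> float:
    """Permitted rank error at `rank` after `n` observations.

    `targets` holds (phi, eps) pairs: the target quantile and its error bound.
    """
    bounds = []
    for phi, eps in targets:
        if rank >= phi * n:
            # At or above the target quantile's rank.
            bounds.append(2.0 * eps * rank / phi)
        else:
            # Below the target quantile's rank.
            bounds.append(2.0 * eps * (n - rank) / (1.0 - phi))
    # With several targets, the tightest bound wins.
    return min(bounds)


# Illustrative targets only (assumed, not read from any library's defaults).
example_targets = [(0.5, 0.05), (0.9, 0.01), (0.99, 0.001)]
single_target = [(0.5, 0.05)]
```

Adding targets can only tighten the bound at any rank, so with a single target a summary is free to merge samples at least as aggressively as with several.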