Retry / rate-limit queries that failed due to S3 throttling #320

Open
skuzzle opened this issue Apr 8, 2024 · 1 comment

Comments


skuzzle commented Apr 8, 2024

Is your feature request related to a problem? Please describe.
We sometimes see S3 throttling errors in the UI on some of our dashboards. This happens even for queries that are already cached by Athena's query reuse feature. I understand these errors might be caused by sub-optimal partitioning or data layout in our Athena setup (which we are unable to change at the moment). However, I think throttling can happen naturally whenever a query has to crawl through a lot of data.
As I understand it, S3 throttles requests while it scales up to handle the number of concurrent requests, which signals the client to slow down its request rate. The Grafana Athena datasource currently does not handle this situation gracefully.

Describe the solution you'd like
If my understanding of S3 throttling is correct, there should be a client-side retry-with-backoff mechanism for queries that fail because of S3 throttling. I understand that introducing a rate limit might not be straightforward, as it likely requires tracking some global state on the Grafana server.
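To make the request concrete, here is a minimal sketch of retry with exponential backoff and jitter. It is an illustration only, not the datasource's actual code: `QueryFailed`, `is_s3_throttling` and `run_with_backoff` are hypothetical names, and matching on S3's `SlowDown` error code in the error text is an assumption about how the throttling error surfaces.

```python
import random
import time


class QueryFailed(Exception):
    """Hypothetical error type carrying the failure message of a query."""


def is_s3_throttling(error: Exception) -> bool:
    # Assumption: a throttled query surfaces S3's "SlowDown" error code or its
    # "reduce your request rate" message somewhere in the error text.
    text = str(error)
    return "SlowDown" in text or "reduce your request rate" in text


def run_with_backoff(run_query, max_attempts=5, base_delay=1.0, max_delay=30.0,
                     sleep=time.sleep):
    """Call run_query(), retrying throttled failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except QueryFailed as error:
            last_attempt = attempt == max_attempts - 1
            # Errors other than throttling, or a throttling error on the
            # final attempt, are passed on to the caller unchanged.
            if last_attempt or not is_s3_throttling(error):
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

The jitter keeps several panels that were throttled at the same moment from retrying in lockstep. The `sleep` parameter exists only so the waiting can be replaced in tests.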

Describe alternatives you've considered
Sadly, I've found no alternatives yet. In a perfect world, Athena itself would handle this situation more gracefully, but we have found no configuration options for that.

Additional context
We have some automation in place that tests all of our dashboards' Athena queries against Grafana's /api/ds/query endpoint. In these tests we faced the same throttling issues and were able to overcome them by adding a retry mechanism and lowering the rate limit step by step.
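A rough sketch of that idea, not our actual automation: `run_all`, `send` and `is_throttled` are hypothetical names, `send` stands in for whatever posts one query to /api/ds/query, and halving the rate on each throttled response is just one possible way to step it down.

```python
import time
from collections import deque


def run_all(queries, send, is_throttled, rate_per_sec=4.0, min_rate=0.25,
            max_retries=3, sleep=time.sleep):
    """Send queries one by one, halving the request rate on throttling."""
    # Each entry is (query, number of retries so far); queries must be hashable.
    pending = deque((query, 0) for query in queries)
    results = {}
    while pending:
        query, retries = pending.popleft()
        response = send(query)
        if is_throttled(response):
            if retries >= max_retries:
                results[query] = response  # give up and report the failure
            else:
                # Step the rate down and put the query back at the end.
                rate_per_sec = max(min_rate, rate_per_sec / 2)
                pending.append((query, retries + 1))
        else:
            results[query] = response
        # Space requests so that at most rate_per_sec are sent per second.
        sleep(1.0 / rate_per_sec)
    return results, rate_per_sec
```

The function returns the final rate as well, so a later run can start from a rate that is already known to work.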

Contributor

iwysiu commented Apr 9, 2024

Hi @skuzzle, thanks for the feature request! I looked into it and I can understand it being an issue, though most of the advice I see about it involves changing the Athena configuration rather than how queries are issued. I'll move it into the backlog for us to consider.

@iwysiu iwysiu moved this from Incoming to Backlog in AWS Datasources Apr 9, 2024