Add better support for throttling exceptions #11
Conversation
In cases where requests fail for non-technical reasons, we often do not want exponential backoff. Notably, when requests are throttled it is more appropriate to wait a specified amount of time before retrying; often we learn this amount from a `Retry-After` header. This change introduces `BackoffAndRetryException`, which can be used to sleep a fixed amount of time before retrying.

Fixes #10
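For illustration, a rough sketch of how the new exception might be raised from a client that honors `Retry-After`. The import path of the exception, its constructor argument, and the `fetch_feed` function are assumptions made for this example; the `@retry` parameters are those of Opnieuw's existing decorator.

```python
import requests

from opnieuw import retry
from opnieuw import BackoffAndRetryException  # import path is an assumption


@retry(
    retry_on_exceptions=(ConnectionError,),
    max_calls_total=4,
    retry_window_after_first_call_in_seconds=60,
)
def fetch_feed(url: str) -> bytes:
    response = requests.get(url)
    if response.status_code == 429:
        # The server tells us how long to back off; sleep exactly that long
        # instead of following the exponential backoff schedule.
        wait_seconds = int(response.headers.get("Retry-After", "1"))
        raise BackoffAndRetryException(wait_seconds)
    response.raise_for_status()
    return response.content
```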
I started reviewing this, and resetting the retry window and count looked like a neat solution initially, but then I realized it breaks the naming of our arguments, which is exactly the kind of footgun we try to avoid in Opnieuw: `max_calls_total` is no longer a maximum if the count can reset.
There is also one potentially dangerous outcome: if you happen to hit an edge case in an external service so that it always fails, and there is also an aggressive rate limit on the endpoint, then we can end up in an infinite loop: before we hit the maximum number of failures we get rate limited, and after the throttle wait we run into the same errors again.
I don’t think this is a hypothetical problem. We integrate with one API in particular that has rate limits of a few calls per hour, and I’ve seen us consistently getting internal server errors for one particular request.
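To make that failure mode concrete, here is a small self-contained simulation of the concern. The simulated service and the reset-on-throttle logic are stand-ins for the proposed behavior, not actual Opnieuw code.

```python
import itertools

MAX_CALLS_TOTAL = 4

# Hypothetical external service: every third call is rate limited (429),
# the remaining calls are persistent server errors (500).
responses = itertools.cycle([500, 500, 429])

attempts = 0
total_calls = 0
while total_calls < 100:  # safety bound so this demo terminates
    total_calls += 1
    attempts += 1
    status = next(responses)
    if status == 429:
        attempts = 0  # the proposed reset: throttling clears the count
        continue      # (the fixed Retry-After sleep is omitted here)
    if attempts >= MAX_CALLS_TOTAL:
        break         # with the reset, this is never reached

print(total_calls)  # prints 100: the safety bound, not MAX_CALLS_TOTAL
```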
Making retries on throttle a separate decorator would fix both problems (we can put a maximum on the number of attempts there as well, and the meaning of the inner retry remains unchanged), but it introduces some new challenges:
- You can forget to add it.
- You can put the throttle-retry and error-retry in the wrong order.
I think the latter one could be prevented with some properties and an assertion. The former one is not so bad, because rate limiting is really a separate thing from transient errors, and we already need to detect the case specially anyway.
What do you think?
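As a sketch of the separate-decorator approach with an ordering guard: every name here is hypothetical, and the chosen stacking order (throttle-retry outside, error-retry inside) is an assumption for illustration.

```python
import functools
import time


class ThrottledException(Exception):
    """Hypothetical exception carrying the server-requested wait."""

    def __init__(self, retry_after_seconds: float):
        super().__init__(retry_after_seconds)
        self.retry_after_seconds = retry_after_seconds


def retry_on_throttle(max_throttle_retries: int = 3):
    """Hypothetical outer decorator: retries only on throttling, sleeping the
    requested time, with its own explicit maximum number of retries."""

    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            for _ in range(max_throttle_retries):
                try:
                    return f(*args, **kwargs)
                except ThrottledException as e:
                    time.sleep(e.retry_after_seconds)
            return f(*args, **kwargs)  # final attempt, throttle not caught

        wrapper._handles_throttling = True  # marker used by the order check
        return wrapper

    return decorator


def retry_on_errors(f):
    """Hypothetical stand-in for the existing error-retry decorator."""
    # Ordering guard: if `f` already handles throttling, the decorators are
    # stacked the wrong way around (error-retry must be the inner one).
    assert not getattr(f, "_handles_throttling", False), (
        "apply retry_on_throttle outside retry_on_errors"
    )

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)  # real backoff-and-retry logic omitted

    return wrapper


@retry_on_throttle(max_throttle_retries=3)  # outer: throttling
@retry_on_errors                            # inner: transient errors
def call_api() -> None:
    ...
```

With a guard like this, the wrong stacking order fails loudly at decoration time rather than silently changing the retry semantics.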
self.counter = 0
start = time.monotonic()

with retry_immediately():
Hmm, so this does mean that backoff throttle waits are not included in `retry_immediately`. I would expect them to be skipped as well. This test would still be valuable because it tests that we can make twice the number of calls, due to the counter reset.
Yeah, that is a little unexpected 👍
Thanks for the review. Those are excellent points.
I agree this is inconvenient. Perhaps it would be clearer if we renamed the exception to
The value here is that the decorators will explicitly state how many retries can be expected. I like that. We can also achieve that by adding a
That would help, but it can still be surprising for the reader when the place where that exception is raised is not close to the decorator that says “max”. I am thinking from the perspective of somebody who is new to a codebase and is debugging why we make too many calls to some external API.
That might work. Our current
Following the discussion, I think I agree with @ruuda that we should keep the two behaviors (retrying and throttling) clearly separated. I kind of liked the very first idea of adding a separate
This does leave the caveat that @ruuda pointed out:
But I think it is acceptable. We could make it easier for developers by clearly stating in the documentation and/or the docstrings of the decorators in which order they should go.
I'm closing this PR, as it's hard to get the semantics of the different parameters right with this approach.
I was looking for documentation on whether Channable internally supports making use of the retry header. @ruuda, does Channable respect the retry header when loading XML feeds?
You can contact [email protected] for this.