
Add better support for throttling exceptions #11

Closed
wants to merge 1 commit

Conversation

@jochemb (Contributor) commented Oct 2, 2020

In cases where requests fail for non-technical reasons, we often do not want exponential backoff. Notably, when requests are throttled it is more appropriate to wait a specified amount of time before retrying. Often we will learn this from a `Retry-After` header.

This change introduces `BackoffAndRetryException`, which can be used to sleep a fixed amount of time before retrying.

fixes: #10

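For context, the behavior this change is after can be sketched without opnieuw as a plain retry loop: on a throttled response, wait exactly as long as the server asks instead of backing off exponentially. Everything below (the function name, the defaults, and the assumption that `Retry-After` carries a number of seconds) is illustrative only and not part of this PR.

```python
import time

import requests


def fetch_with_retry_after(url: str, max_attempts: int = 4) -> bytes:
    """Fetch `url`, sleeping exactly as long as a 429 response asks before retrying."""
    for attempt in range(1, max_attempts + 1):
        response = requests.get(url)
        if response.status_code == 429 and attempt < max_attempts:
            # Fixed, server-specified wait instead of exponential backoff.
            # Assumes Retry-After holds a number of seconds, not an HTTP date.
            time.sleep(float(response.headers.get("Retry-After", "1")))
            continue
        response.raise_for_status()
        return response.content
    raise AssertionError("unreachable: the final attempt returns or raises")
```

The PR wires the same idea into opnieuw's retry decorator by letting the decorated function raise `BackoffAndRetryException` with the desired wait.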
@jochemb requested review from ruuda and a team October 2, 2020 08:17
@ruuda (Contributor) left a comment

I started reviewing this, and resetting the retry window and count looked like a neat solution initially, but then I realized it breaks the naming of our arguments, which is exactly the kind of footgun we try to avoid in Opnieuw: `max_calls_total` is no longer a maximum if the count can reset.

There is also one potentially dangerous outcome: if you happen to hit an edge case in an external service so that it always fails, and there is also an aggressive rate limit on the endpoint, then we can end up in an infinite loop: before we hit the maximum number of failures we get rate limited, and after the wait we run into the same errors again.

I don’t think this is a hypothetical problem. We integrate with one API in particular that has rate limits of a few calls per hour, and I’ve seen us consistently getting internal server errors for one particular request.

Making retries on throttle a separate decorator would fix both problems (we can put a maximum on the number of attempts there as well, and the meaning of the inner retry remains unchanged), but it introduces some new challenges:

  • You can forget to add it.
  • You can put the throttle-retry and error-retry in the wrong order.

I think the latter one could be prevented with some properties and an assertion. The former one is not so bad, because rate limiting is really a separate thing from transient errors, and we already need to detect the case specially anyway.

What do you think?
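A rough sketch of the separate-decorator idea described in the review above. `retry_on_throttle` and `ThrottledException` are invented names for illustration and are not part of opnieuw, and the stacking order shown (throttle handling outermost) is only one plausible choice, not a decision from this thread.

```python
import functools
import time

from opnieuw import retry


class ThrottledException(Exception):
    """Raised by the decorated function when the server asks us to back off."""

    def __init__(self, retry_after_seconds: float) -> None:
        super().__init__(retry_after_seconds)
        self.retry_after_seconds = retry_after_seconds


def retry_on_throttle(max_throttled_attempts: int):
    """Sleep the server-specified time and call again, with its own hard cap."""

    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            for _ in range(max_throttled_attempts - 1):
                try:
                    return f(*args, **kwargs)
                except ThrottledException as e:
                    time.sleep(e.retry_after_seconds)
            # Final attempt: let any exception propagate to the caller.
            return f(*args, **kwargs)

        return wrapper

    return decorator


@retry_on_throttle(max_throttled_attempts=3)
@retry(
    retry_on_exceptions=(ConnectionError,),
    max_calls_total=4,
    retry_window_after_first_call_in_seconds=60,
)
def call_external_api() -> None:
    ...  # would raise ThrottledException(retry_after) on an HTTP 429
```

With this split the inner `max_calls_total` keeps its meaning, and the total number of calls stays bounded by `max_throttled_attempts * max_calls_total`.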

```python
self.counter = 0
start = time.monotonic()

with retry_immediately():
```
Contributor

Hmm, so this does mean that backoff throttle waits are not covered by `retry_immediately`; I would expect them to be skipped as well. This test would still be valuable because it verifies that we can make twice the number of calls, due to the counter reset.

Contributor Author

Yeah, that is a little unexpected 👍

@jochemb (Contributor, Author) commented Oct 2, 2020

Thanks for the review. Those are excellent points.

> I realized it breaks the naming of our arguments, which is exactly the kind of footgun we try to avoid in Opnieuw: `max_calls_total` is no longer a maximum if the count can reset.

I agree this is inconvenient. Perhaps it would be clearer to rename the exception to `BackoffAndResetRetryException`.

> Making retries on throttle a separate decorator would fix both problems (we can put a maximum on the number of attempts there as well, and the meaning of the inner retry remains unchanged)

The value here is that the decorators will explicitly state how many retries can be expected. I like that. We can also achieve that by adding a `maximum_number_of_resets` parameter to the decorator. We can default it to 0, in which case `BackoffAndResetRetryException` would behave just like `RetryException`. This way we get explicitness and a way to prevent infinite loops, with no confusion about decorator ordering. Someone might still forget to set the parameter to a non-zero value, but I think that is acceptable.
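Under one plausible reading of this proposal (not something decided in this thread), each reset opens another full window of `max_calls_total` calls, which makes the worst-case call count easy to bound. The numbers below are purely illustrative.

```python
# Hypothetical values for the existing parameter and the proposed one.
max_calls_total = 4
maximum_number_of_resets = 2

# Each reset allows another window of up to max_calls_total calls,
# so the endpoint sees at most (resets + 1) * max_calls_total calls.
worst_case_calls = (maximum_number_of_resets + 1) * max_calls_total
assert worst_case_calls == 12
```

This also shows the tension discussed next: with resets in play, `max_calls_total` no longer describes the total number of calls by itself.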

@ruuda (Contributor) commented Oct 5, 2020

> Perhaps it would be clearer to rename the exception to `BackoffAndResetRetryException`.

That would help, but it can still be surprising for a reader when the place where that exception is raised is not close to the decorator that says “max”. I am thinking from the perspective of somebody who is new to a codebase and is debugging why we make too many calls to some external API.

> We can also achieve that by adding a `maximum_number_of_resets` parameter to the decorator.

That might work. Our current `max_calls_total` argument is pretty strongly a maximum on the total number of calls though, even if there is something called “reset” beside it. 🙈 How about we keep `max_calls_total` as the total upper bound, and add a new optional one for the maximum number of calls between resets?

@p-nilsson

Following the discussion, I think I agree with @ruuda that we should keep the two behaviors (retrying and throttling) clearly separated. I kind of liked the very first idea of adding a separate `@throttle` decorator to opnieuw, for two reasons:

  1. Nothing changes for the people who are currently used to using `@retry` and `@retry_async`.
  2. We clearly separate throttling from retrying, and it is up to the developer to add the decorators they want to use.

This does leave the caveat that @ruuda pointed out:

> You can put the throttle-retry and error-retry in the wrong order.

But I think it is acceptable. We could make it easier for developers by clearly stating, in the documentation and/or the decorators' docstrings, in which order they should be applied.
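One way the ordering caveat could be guarded against, in the spirit of the "properties and an assertion" idea mentioned in the review above: the inner retry wrapper exposes a marker attribute that the outer throttle decorator asserts on. All names here are invented for illustration (opnieuw exposes no such attributes), and the order being enforced, throttle handling outermost, is only one possible convention.

```python
import functools


def error_retry(f):
    """Stand-in for the error-retry decorator (retry logic elided)."""

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)

    wrapper.__is_error_retry__ = True  # marker the outer decorator can check
    return wrapper


def throttle_retry(f):
    """Stand-in for the throttle decorator (wait logic elided)."""
    # Fail fast at decoration time if the two are stacked the wrong way around.
    assert getattr(f, "__is_error_retry__", False), (
        "@throttle_retry must be applied outside @error_retry"
    )

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)

    return wrapper


@throttle_retry
@error_retry
def call_api() -> None:
    ...
```

If the two decorators are stacked the other way around, the assertion fires at decoration time instead of letting the surprising retry behavior surface in production.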

@svisser removed the request for review from a team August 30, 2021 19:08
@jochemb (Contributor, Author) commented Sep 6, 2021

I'm closing this PR, as it's hard to get the semantics of the different parameters right with this approach.

@jochemb closed this Sep 6, 2021
@p-nilsson deleted the `backoff-exception` branch September 6, 2021 13:20
@rvanlaak

While looking for documentation on whether Channable internally supports the `Retry-After` response header, I came across this PR.

@ruuda does Channable respect the `Retry-After` header when loading XML feeds?

@ruuda (Contributor) commented Apr 21, 2022

> Does Channable respect the `Retry-After` header when loading XML feeds?

You can contact [email protected] for this.

Development

Successfully merging this pull request may close these issues:

  • Add special support for rate limiting