Add special support for rate limiting #10
I like it! As you mentioned, we run into this often.
This is a great idea! There is another case where we need to do retries, but where Opnieuw is not well suited: we occasionally need to wait for a file to be available at a destination URL. This could take as much as 20 minutes and results in …

In terms of implementation I think the differences between these two are pretty subtle:

```python
@throttle
@retry(retry_on_exceptions=CONNECTION_ERRORS_AND_INTERNAL_SERVER_ERRORS)
def call_external_service() -> Result:
    result = do_external_thing()
    if result.is_rate_limit():
        raise RateLimitError(sleep_seconds=result.suggested_sleep_time_seconds)
```

```python
@retry(retry_on_exceptions=CONNECTION_ERRORS_AND_INTERNAL_SERVER_ERRORS)
@throttle
def call_external_service() -> Result:
    result = do_external_thing()
    if result.is_rate_limit():
        raise RateLimitError(sleep_seconds=result.suggested_sleep_time_seconds)
```

Perhaps it makes sense to expose the behavior for both retries and sleeps in a single decorator as well. That way end users don't have to think about decorator ordering.
That’s a good point! You definitely want the throttle to go on the outside, otherwise the retry window for connection errors expires quickly. We could have a combined decorator that internally uses …
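To make the ordering concrete, here is a minimal sketch of what such a combined decorator could look like. It assumes the hypothetical `throttle` and `retry` decorators from the snippets above (this is not Opnieuw's current API) and simply fixes the composition order so the throttle sits on the outside:

```python
from functools import wraps
from typing import Callable, TypeVar

F = TypeVar("F", bound=Callable)

def retry_and_throttle(retry_on_exceptions) -> Callable[[F], F]:
    """Compose the hypothetical @throttle and @retry decorators so that
    throttle wraps retry: time spent sleeping on rate limits then does not
    consume the retry window used for connection errors."""
    def decorator(f: F) -> F:
        wrapped = throttle(retry(retry_on_exceptions=retry_on_exceptions)(f))
        return wraps(f)(wrapped)
    return decorator
```

Usage would then be a single `@retry_and_throttle(...)` line instead of two stacked decorators, so users cannot get the ordering wrong.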
To me this issue has a pretty clear solution direction, except for:
@ruuda What would you want it to be able to measure? Also, how should it report these measurements? Do we want it to simply log messages about these statistics or do we define some protocol so we can add callbacks that write things to the database?
This is very roughly what I had in mind:

```python
import time
from typing import Callable, List, NamedTuple

F = Callable[[int], int]

class Call(NamedTuple):
    begin_second: float
    success: bool

def estimate_delay_seconds(calls: List[Call]) -> float:
    """You can go very deep statistically on this, but this will do for the proof of concept."""
    delay_seconds_good: List[float] = []
    delay_seconds_bad: List[float] = [0.0]
    last_success_second = calls[0].begin_second
    for call in calls[1:]:
        if call.success:
            delay_seconds_good.append(call.begin_second - last_success_second)
            last_success_second = call.begin_second
        else:
            delay_seconds_bad.append(call.begin_second - last_success_second)
    if len(delay_seconds_good) > 1:
        return 0.5 * max(delay_seconds_bad) + 0.5 * min(delay_seconds_good)
    else:
        return 0.1 + max(delay_seconds_bad) * 2.0

def throttle(f: F) -> F:
    calls: List[Call] = []
    def apply(x: int) -> int:
        if len(calls) >= 2:
            successes = [call for call in calls if call.success]
            last_success_second = successes[-1].begin_second if len(successes) > 0 else calls[0].begin_second
            delay_seconds = estimate_delay_seconds(calls)
            next_call_second = last_success_second + delay_seconds
            sleep_seconds = max(0.0, next_call_second - time.monotonic())
            if sleep_seconds > 0.0:
                print(f'Observed success rate: {1.0 / delay_seconds:.1f} Hz.')
                print(f'Sleeping {sleep_seconds:.1f} s to avoid rate limit.')
                time.sleep(sleep_seconds)
        begin_second = time.monotonic()
        try:
            result = f(x)
            calls.append(Call(begin_second, success=True))
            return result
        except:
            calls.append(Call(begin_second, success=False))
            raise
    return apply

last_call_second: float = time.monotonic()

@throttle
def succ_limited(x: int) -> int:
    global last_call_second
    now_second = time.monotonic()
    if now_second - last_call_second > 1.0:
        last_call_second = now_second
        return x + 1
    else:
        raise Exception('Too soon, only one call per second allowed.')

for i in range(20):
    try:
        succ_limited(i)
        print(f'Call {i} succeeded')
    except Exception as exc:
        print(f'Call {i} failed: {exc}')
```

Output
So it “learns” to throttle itself to avoid rate limits, even when the exact way limiting works is a black box. I think it could be useful, but this might be too much magic; maybe it is better to have some kind of limiter class that can be used in conjunction with a retry decorator to estimate a delay, rather than building it into the decorator itself.
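As a rough sketch of that alternative (the names here are made up for illustration, and it reuses `Call` and `estimate_delay_seconds` from the snippet above), the limiter could be a small class that a retry decorator, or the calling code itself, consults before each call:

```python
import time
from typing import List

class RateLimitEstimator:
    """Hypothetical limiter that records call outcomes and suggests how long
    to wait before the next call, without wrapping the function itself."""

    def __init__(self) -> None:
        self.calls: List[Call] = []

    def record(self, begin_second: float, success: bool) -> None:
        self.calls.append(Call(begin_second, success))

    def suggested_sleep_seconds(self) -> float:
        # With fewer than two observations there is nothing to learn from yet.
        if len(self.calls) < 2:
            return 0.0
        successes = [c for c in self.calls if c.success]
        last_success_second = (
            successes[-1].begin_second if successes else self.calls[0].begin_second
        )
        next_call_second = last_success_second + estimate_delay_seconds(self.calls)
        return max(0.0, next_call_second - time.monotonic())
```

The caller would then sleep for `estimator.suggested_sleep_seconds()` before the request and call `record(...)` afterwards, which keeps the learning logic separate from the retry decorator.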
Oh, that is a very cool approach to learning undocumented rate limits! We could definitely use something like this as a general case for endpoints where we run into occasional rate limit errors. It is curious that your example fails here:
I think two cases where we see a lot of problems now are not covered very well with this approach:
I think it’s because I print the values rounded to one decimal, but the actual time it slept is slightly below 1s.
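A tiny illustration of that rounding effect (the numbers are made up): a sleep of 0.96 s prints as 1.0 s with one-decimal formatting, so the next call can still arrive just inside the one-call-per-second window and fail.

```python
sleep_seconds = 0.96  # hypothetical value slightly below the real 1 s limit
print(f'Sleeping {sleep_seconds:.1f} s to avoid rate limit.')  # prints "Sleeping 1.0 s ..."
# The message suggests a full second, but the call still comes 0.04 s too early.
```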
Yeah, after thinking about it some more, this definitely should not go in the decorator. At best it can be a module to help determine how long to wait, but in almost all of our use cases we need to have special handling for the particular way of rate limiting anyway.
I was writing an example for our new Delta API, and 1/3 of the code is related to slowing down on rate limiting, which clutters the example a lot. I will extract that part and open a pull request here.
In places where we use `@retry`, we often also use it to retry on rate limit errors, usually by detecting the rate limiting and raising a `RetryException`. This mostly works, but it is suboptimal for a few reasons:

I think Opnieuw would be a good place to add such functionality. Then we could decorate our calls like:

So a new decorator (I called it `@throttle` here, better names are welcome) would take care of rate limiting, and it would trigger on `RateLimitError`. `RateLimitError` could communicate any information about when the limit will reset back to the decorator. I think it would even be possible to give the decorator some local state, so it can automatically measure the success rate across multiple calls to `call_external_service`.

@jochemb @wesleybowman what do you think?
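To make the proposed interaction a bit more concrete, here is a minimal sketch of how `RateLimitError` and `@throttle` could fit together. The names mirror the ones used in this thread, but the signature and behavior shown here are assumptions for illustration, not an existing Opnieuw API:

```python
import time
from functools import wraps
from typing import Callable, Optional, TypeVar

F = TypeVar("F", bound=Callable)

class RateLimitError(Exception):
    """Raised by the decorated function when the external service reports a
    rate limit; it can carry the back-off suggested by the service."""

    def __init__(self, sleep_seconds: Optional[float] = None) -> None:
        super().__init__(f"Rate limited, suggested sleep: {sleep_seconds} s")
        self.sleep_seconds = sleep_seconds

def throttle(f: F) -> F:
    """Re-run the call when it raises RateLimitError, sleeping for the duration
    the error suggests (or a fixed fallback when it does not suggest one)."""

    @wraps(f)
    def wrapper(*args, **kwargs):
        max_attempts = 5  # arbitrary cap for the sketch; a real API would make this configurable
        for attempt in range(max_attempts):
            try:
                return f(*args, **kwargs)
            except RateLimitError as exc:
                if attempt == max_attempts - 1:
                    raise  # give up and let the caller see the rate limit error
                time.sleep(exc.sleep_seconds if exc.sleep_seconds is not None else 1.0)

    return wrapper
```

Whether the decorator should also learn a delay across calls, as discussed above, or only honor the sleep the error reports, is exactly the open design question.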