[draft] Dynamically adjust macaroon limiter based on errors from storage nodes #532

halkyon · 2024-12-16T06:33:58Z

Storage nodes can limit the number of in-flight requests they handle at once using the STORJ_STORAGE2_MAX_CONCURRENT_REQUESTS config (storage2.max-concurrent-requests).

gateway-mt has a macaroon limiter that limits concurrent requests. This limit is statically set, but we've seen a need to adjust it dynamically as storage network conditions change. One idea is to adjust it based on errors coming back from storage nodes.

We could take inspiration from a TCP congestion control algorithm like AIMD: apply a multiplicative decrease (down to a defined minimum) to macaroon limits when more nodes start returning a limit error, and increase them additively (up to a defined maximum) once nodes stop returning the error. Perhaps a timer could watch the number of limit errors in a given window to decide whether limits should be adjusted.
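To make the idea concrete, here is a minimal sketch of an AIMD-adjusted limit in Go. Everything in it is an assumption for illustration: `AIMDLimit`, `NewAIMDLimit`, `Update`, and the parameters (step, backoff factor, error-ratio threshold) are hypothetical names, not existing gateway-mt or uplink APIs, and the caller is assumed to count limit errors per observation window itself.

```go
package main

import (
	"math"
	"sync"
)

// AIMDLimit is a hypothetical adaptive concurrency limit.
// It is an illustration only, not an existing gateway-mt type.
type AIMDLimit struct {
	mu        sync.Mutex
	limit     float64 // current limit, kept as a float so halving is exact
	floor     float64 // defined minimum
	ceil      float64 // defined maximum
	step      float64 // additive increase per healthy window
	backoff   float64 // multiplicative decrease factor, in (0, 1)
	threshold float64 // ratio of limit errors that triggers a decrease
}

// NewAIMDLimit builds a limit starting at initial, bounded by
// [minLimit, maxLimit].
func NewAIMDLimit(initial, minLimit, maxLimit, step int, backoff, threshold float64) *AIMDLimit {
	return &AIMDLimit{
		limit:     float64(initial),
		floor:     float64(minLimit),
		ceil:      float64(maxLimit),
		step:      float64(step),
		backoff:   backoff,
		threshold: threshold,
	}
}

// Update is meant to be called once per observation window with the
// number of limit errors and the total number of storage node
// responses seen in that window. It returns the new limit.
func (l *AIMDLimit) Update(limitErrors, total int) int {
	l.mu.Lock()
	defer l.mu.Unlock()

	if total == 0 {
		// No signal in this window: leave the limit unchanged.
		return int(l.limit)
	}
	if float64(limitErrors)/float64(total) > l.threshold {
		// Multiplicative decrease, clamped to the minimum.
		l.limit = math.Max(l.floor, l.limit*l.backoff)
	} else {
		// Additive increase, clamped to the maximum.
		l.limit = math.Min(l.ceil, l.limit+l.step)
	}
	return int(l.limit)
}
```

A timer firing once per window could call `Update` with that window's counters and feed the result into the macaroon limiter. How the limiter would accept a changed limit at runtime is left open here.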

To consider:

Should we handle stall detection logic in the gateway? https://review.dev.storj.io/c/storj/uplink/+/15489 introduces a stall manager to uplink in response to other cases where storage nodes stall on uploads, but it may make sense to move this logic closer to congestion control itself, for example by decreasing limits when we start seeing stalls.
Should we keep track of round-trip times (at the individual storage node piece operation level?) as another signal for congestion?
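As a rough illustration of the round-trip time question, the sketch below keeps an exponentially weighted moving average of observed RTTs alongside the lowest RTT seen, and reports congestion when the average rises well above that baseline. `RTTTracker`, `Observe`, `Congested`, and the `alpha` and `factor` parameters are hypothetical names chosen for this example. Where the samples would come from (for instance per piece operation) is an open question from the list above, not something this code answers.

```go
package main

import "time"

// RTTTracker is a hypothetical helper for illustration only.
// It is not safe for concurrent use; a caller would add locking.
type RTTTracker struct {
	alpha    float64       // weight of the newest sample, in (0, 1]
	smoothed time.Duration // exponentially weighted moving average
	min      time.Duration // lowest RTT observed so far (baseline)
	seen     bool          // whether any sample has been observed
}

// NewRTTTracker returns a tracker that weights each new sample by alpha.
func NewRTTTracker(alpha float64) *RTTTracker {
	return &RTTTracker{alpha: alpha}
}

// Observe records one round-trip time sample.
func (t *RTTTracker) Observe(rtt time.Duration) {
	if !t.seen {
		// The first sample initializes both the average and the baseline.
		t.smoothed, t.min, t.seen = rtt, rtt, true
		return
	}
	if rtt < t.min {
		t.min = rtt
	}
	t.smoothed = time.Duration(t.alpha*float64(rtt) + (1-t.alpha)*float64(t.smoothed))
}

// Congested reports whether the smoothed RTT exceeds the baseline
// by more than the given factor.
func (t *RTTTracker) Congested(factor float64) bool {
	return t.seen && float64(t.smoothed) > factor*float64(t.min)
}
```

A signal like `Congested` could be combined with the limit-error count when deciding whether to decrease limits, though the right factor and smoothing weight would need measurement.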

References:

https://ee.lbl.gov/papers/congavoid.pdf
https://en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease
https://www.geeksforgeeks.org/aimd-algorithm/
https://github.com/platinummonkey/go-concurrency-limits (a Go implementation of Netflix/concurrency-limits) has some interesting measurement examples

amwolff · 2024-12-16T09:28:07Z

Related issues:

halkyon changed the title ~~Dymamically adjust macaroon limiter based on errors from storage nodes~~ [draft] Dymamically adjust macaroon limiter based on errors from storage nodes Dec 16, 2024

halkyon changed the title ~~[draft] Dymamically adjust macaroon limiter based on errors from storage nodes~~ [draft] Dynamically adjust macaroon limiter based on errors from storage nodes Dec 16, 2024
