Throttling #4842
Replies: 4 comments
-
From a query perspective, it is hard to calculate the complexity, as some fields could be backed by a field resolver or DataLoader that needs to do an additional database lookup. The GitHub approach is good, but it requires us to add pagination everywhere. Also, from their calculation, it seems like they're not using DataLoaders. I'm looking for some kind of throttling like this but not sure how to proceed.
-
@michaelstaib can we close this?
-
Is it still in Feature Request?
-
I believe this can be done with the existing operation complexity calculations in combination with Redis as mentioned here: https://chillicream.com/docs/hotchocolate/v13/security/operation-complexity
-
This feature shall provide more security options to the execution engine.
The solutions we’ve seen so far are great at stopping abusive queries from taking your servers down. The problem with using them alone is that they will stop large queries, but won’t stop clients that make a lot of medium-sized queries!
In most APIs, a simple throttle is used to stop clients from requesting resources too often. GraphQL is a bit special because throttling on the number of requests does not really help us. Even a few queries might be too much if they are very large.
In fact, we have no idea what amount of requests is acceptable since they are defined by the clients. So what can we use to throttle clients?
Throttling Based on Server Time
The server time a query needs to complete is a good estimate of how expensive that query is. We can use this heuristic to throttle queries. With good knowledge of your system, you can come up with a maximum amount of server time a client may use over a certain time frame.
We also decide on how much server time is added to a client over time. This is a classic leaky bucket algorithm. Note that there are other throttling algorithms out there, but they are out of scope for this chapter. We will use a leaky bucket throttle in the next examples.
Let’s imagine our maximum server time (Bucket Size) is set to `1000ms`, that clients gain `100ms` of server time per second (Leak Rate), and that a `createPost` mutation takes on average `200ms` to complete. In reality, the time may vary, but we’ll assume it always takes `200ms` for the sake of this example.
It means that a client calling this operation more than 5 times within 1 second would be blocked until more server time is added to their bucket. After two seconds (`100ms` is added per second), the client could call `createPost` a single time again.
As you can see, throttling based on time is a great way to throttle GraphQL queries: complex queries end up consuming more time, meaning they can be called less often, while smaller queries may be called more often since they are very fast to compute.
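The leaky bucket throttle described here can be sketched in Python. This is a minimal illustration, not any particular server's implementation: the numbers (`1000ms` bucket, `100ms`-per-second leak, `200ms` per call) are the ones from the example, and the in-memory dict stands in for whatever shared store (e.g. Redis) a real deployment would use.

```python
import time


class LeakyBucketThrottle:
    """Leaky bucket throttle tracking server time spent per client.

    bucket_size_ms: max server time (ms) a client may spend before being blocked.
    leak_rate_ms_per_s: server time (ms) restored to the client per second.
    """

    def __init__(self, bucket_size_ms=1000.0, leak_rate_ms_per_s=100.0):
        self.bucket_size = bucket_size_ms
        self.leak_rate = leak_rate_ms_per_s
        self.used = {}       # client_id -> ms of server time currently in the bucket
        self.last_seen = {}  # client_id -> timestamp of the last check

    def _leak(self, client_id, now):
        # Drain the bucket by leak_rate for every second elapsed since last seen.
        elapsed = now - self.last_seen.get(client_id, now)
        used = self.used.get(client_id, 0.0)
        self.used[client_id] = max(0.0, used - elapsed * self.leak_rate)
        self.last_seen[client_id] = now

    def try_consume(self, client_id, cost_ms, now=None):
        """Return True if the client may run a query costing cost_ms of server time."""
        now = time.monotonic() if now is None else now
        self._leak(client_id, now)
        if self.used[client_id] + cost_ms > self.bucket_size:
            return False  # bucket full: block until enough time leaks back
        self.used[client_id] += cost_ms
        return True
```

With these defaults, a client can make five `200ms` calls back to back, the sixth is blocked, and after two seconds enough time (`200ms`) has leaked back for exactly one more call — matching the numbers in the example.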
If your GraphQL API is public, it can be good to express these throttling constraints to clients. However, server time is not the easiest thing to communicate, and clients cannot really estimate how much time their queries will take without trying them first.
Remember the Max Complexity we talked about earlier? What if we throttled based on that instead?
Throttling Based on Query Complexity
Throttling based on Query Complexity is a great way to work with clients and help them respect the limits of your schema.
Let’s reuse the example from the Query Complexity section: a query with a complexity cost of 3. Just like with a time throttle, we can come up with a maximum cost (Bucket Size) a client may spend per time window.
With a maximum cost of 9, our clients could run this query only three times before the leak rate prevents them from querying more.
The principles are the same as our time throttle, but now communicating these limits to clients is much nicer. Clients can even calculate the costs of their queries themselves without needing to estimate server time!
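The cost-based variant can be sketched the same way. This is a hedged illustration: the field costs and the `query_cost` helper are made up for the example; a real server (e.g. HotChocolate's operation complexity feature) computes the cost from the parsed query document.

```python
# Hypothetical per-field costs; a real server derives these from the parsed
# query (e.g. 1 point per field, multiplied by pagination arguments).
FIELD_COSTS = {"posts": 1, "title": 1, "author": 1}


def query_cost(fields):
    """Complexity of a query = sum of the costs of the fields it selects."""
    return sum(FIELD_COSTS[f] for f in fields)


class ComplexityThrottle:
    """Leaky bucket denominated in complexity points rather than server time."""

    def __init__(self, max_cost=9, leak_per_second=1):
        self.max_cost = max_cost
        self.leak = leak_per_second
        self.spent = 0
        self.last = 0.0

    def try_consume(self, cost, now):
        # Restore leaked points since the last request, then try to spend.
        self.spent = max(0, self.spent - int((now - self.last) * self.leak))
        self.last = now
        if self.spent + cost > self.max_cost:
            return False
        self.spent += cost
        return True
```

With a bucket of 9 points and a query costing 3, the client can run the query three times in quick succession; the fourth attempt is blocked until points leak back. The advantage over a time bucket is that the client can compute `query_cost` for themselves before sending anything.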
The GitHub public API actually uses this approach to throttle their clients. Take a look at how they express these limits to users: https://developer.github.com/v4/guides/resource-limitations/.
Summary
GraphQL is great for clients because it gives them so much more power. But that power also gives them the ability to abuse your GraphQL server with very expensive queries.
There are many approaches to securing your GraphQL server against these queries, but none of them is bulletproof. It’s important to know which options are available and what their limits are so we can make the best decisions!
https://www.howtographql.com/advanced/4-security/