Skip to content

Latest commit

 

History

History
454 lines (368 loc) · 27.2 KB

rate-limiting.md

File metadata and controls

454 lines (368 loc) · 27.2 KB

Kuadrant Rate Limiting

A Kuadrant RateLimitPolicy custom resource, often abbreviated "RateLimitPolicy":

  1. Targets Gateway API networking resources such as HTTPRoutes and Gateways, using these resources to obtain additional context, i.e., which traffic workload (HTTP attributes, hostnames, user attributes, etc) to rate limit.
  2. Supports targeting subsets (sections) of a network resource to apply the limits to.
  3. Abstracts the details of the underlying Rate Limit protocol and configuration resources, that have a much broader remit and surface area.
  4. Enables cluster operators to set defaults that govern behavior at the lower levels of the network, until a more specific policy is applied.

How it works

Envoy's Rate Limit Service Protocol

Kuadrant's Rate Limit implementation relies on the Envoy's Rate Limit Service (RLS) protocol. The workflow per request goes:

  1. On incoming request, the gateway checks the matching rules for enforcing rate limits, as stated in the RateLimitPolicy custom resources and targeted Gateway API networking objects
  2. If the request matches, the gateway sends one RateLimitRequest to the external rate limiting service ("Limitador").
  3. The external rate limiting service responds with a RateLimitResponse back to the gateway with either an OK or OVER_LIMIT response code.

A RateLimitPolicy and its targeted Gateway API networking resource contain all the statements to configure both the ingress gateway and the external rate limiting service.

The RateLimitPolicy custom resource

Overview

The RateLimitPolicy spec includes, basically, two parts:

  • A reference to an existing Gateway API resource (spec.targetRef)
  • Limit definitions (spec.limits)

Each limit definition includes:

  • A set of rate limits (spec.limits.<limit-name>.rates[])
  • (Optional) A set of dynamic counter qualifiers (spec.limits.<limit-name>.counters[])
  • (Optional) A set of additional dynamic conditions to activate the limit (spec.limits.<limit-name>.when[])

The limit definitions (limits) can be declared at the top-level level of the spec (with the semantics of defaults) or alternatively within explicit defaults or overrides blocks.

Check out Kuadrant RFC 0002 to learn more about the Well-known Attributes that can be used to define counter qualifiers (counters) and conditions (when).

High-level example and field definition

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: my-rate-limit-policy
spec:
  # Reference to an existing networking resource to attach the policy to. REQUIRED.
  # It can be a Gateway API HTTPRoute or Gateway resource.
  # It can only refer to objects in the same namespace as the RateLimitPolicy.
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute / Gateway
    name: myroute / mygateway

  # The limits definitions to apply to the network traffic routed through the targeted resource.
  # Equivalent to if otherwise declared within `defaults`.
  limits:
    "my_limit":
      # The rate limits associated with this limit definition. REQUIRED.
      # E.g., to specify a 50rps rate limit, add `{ limit: 50, duration: 1, unit: secod }`
      rates: […]

      # Counter qualifiers.
      # Each dynamic value in the data plane starts a separate counter, combined with each rate limit.
      # E.g., to define a separate rate limit for each user name detected by the auth layer, add `metadata.filter_metadata.envoy\.filters\.http\.ext_authz.username`.
      # Check out Kuadrant RFC 0002 (https://github.com/Kuadrant/architecture/blob/main/rfcs/0002-well-known-attributes.md) to learn more about the Well-known Attributes that can be used in this field.
      counters: […]

      # Additional dynamic conditions to trigger the limit.
      # Use it for filtering attributes not supported by HTTPRouteRule or with RateLimitPolicies that target a Gateway.
      # Check out Kuadrant RFC 0002 (https://github.com/Kuadrant/architecture/blob/main/rfcs/0002-well-known-attributes.md) to learn more about the Well-known Attributes that can be used in this field.
      when: […]

    # Explicit defaults. Used in policies that target a Gateway object to express default rules to be enforced on
    # routes that lack a more specific policy attached to.
    # Mutually exclusive with `overrides` and with declaring `limits` at the top-level of the spec.
    defaults:
      limits: {…}

    # Overrides. Used in policies that target a Gateway object to be enforced on all routes linked to the gateway,
    # thus also overriding any more specific policy occasionally attached to any of those routes.
    # Mutually exclusive with `defaults` and with declaring `limits` at the top-level of the spec.
    overrides:
      limits: {…}

Using the RateLimitPolicy

Targeting a HTTPRoute networking resource

When a RateLimitPolicy targets a HTTPRoute, the policy is enforced to all traffic routed according to the rules and hostnames specified in the HTTPRoute, across all Gateways referenced in the spec.parentRefs field of the HTTPRoute.

Target a HTTPRoute by setting the spec.targetRef field of the RateLimitPolicy as follows:

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: <RateLimitPolicy name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: <HTTPRoute Name>
  limits: {…}

Rate limit policy targeting a HTTPRoute resource

Hostnames and wildcards

If a RateLimitPolicy targets a route defined for *.com and another RateLimitPolicy targets another route for api.com, the Kuadrant control plane will not merge these two RateLimitPolicies. Unless one of the policies declare an overrides set of limites, the control plane will configure to mimic the behavior of gateway implementation by which the "most specific hostname wins", thus enforcing only the corresponding applicable policies and limit definitions.

E.g., by default, a request coming for api.com will be rate limited according to the rules from the RateLimitPolicy that targets the route for api.com; while a request for other.com will be rate limited with the rules from the RateLimitPolicy targeting the route for *.com.

See more examples in Overlapping Gateway and HTTPRoute RateLimitPolicies.

Targeting a Gateway networking resource

A RateLimitPolicy that targets a Gateway can declare a block of defaults (spec.defaults) or a block of overrides (spec.overrides). As a standard, gateway policies that do not specify neither defaults nor overrides, act as defaults.

When declaring defaults, a RateLimitPolicy which targets a Gateway will be enforced to all HTTP traffic hitting the gateway, unless a more specific RateLimitPolicy targeting a matching HTTPRoute exists. Any new HTTPRoute referrencing the gateway as parent will be automatically covered by the default RateLimitPolicy, as well as changes in the existing HTTPRoutes.

Defaults provide cluster operators with the ability to protect the infrastructure against unplanned and malicious network traffic attempt, such as by setting safe default limits on hostnames and hostname wildcards.

Inversely, a gateway policy that specify overrides declares a set of rules to be enforced on all routes attached to the gateway, thus atomically replacing any more specific policy occasionally attached to any of those routes.

Target a Gateway HTTPRoute by setting the spec.targetRef field of the RateLimitPolicy as follows:

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: <RateLimitPolicy name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: <Gateway Name>
  defaults: # alternatively: `overrides`
    limits: {…}

rate limit policy targeting a Gateway resource

Overlapping Gateway and HTTPRoute RateLimitPolicies

Two possible semantics are to be considered here – gateway policy defaults vs gateway policy overrides.

Gateway RateLimitPolicies that declare defaults (or alternatively neither defaults nor overrides) protect all traffic routed through the gateway except where a more specific HTTPRoute RateLimitPolicy exists, in which case the HTTPRoute RateLimitPolicy prevails.

Example with 4 RateLimitPolicies, 3 HTTPRoutes and 1 Gateway default (plus 2 HTTPRoute and 2 Gateways without RateLimitPolicies attached):

  • RateLimitPolicy A → HTTPRoute A (a.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy B → HTTPRoute B (b.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy W → HTTPRoute W (*.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy G (defaults) → Gateway G (*.com)

Expected behavior:

  • Request to a.toystore.com → RateLimitPolicy A will be enforced
  • Request to b.toystore.com → RateLimitPolicy B will be enforced
  • Request to other.toystore.com → RateLimitPolicy W will be enforced
  • Request to other.com (suppose a route exists) → RateLimitPolicy G will be enforced
  • Request to yet-another.net (suppose a route and gateway exist) → No RateLimitPolicy will be enforced

Gateway RateLimitPolicies that declare overrides protect all traffic routed through the gateway, regardless of existence of any more specific HTTPRoute RateLimitPolicy.

Example with 4 RateLimitPolicies, 3 HTTPRoutes and 1 Gateway override (plus 2 HTTPRoute and 2 Gateways without RateLimitPolicies attached):

  • RateLimitPolicy A → HTTPRoute A (a.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy B → HTTPRoute B (b.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy W → HTTPRoute W (*.toystore.com) → Gateway G (*.com)
  • RateLimitPolicy G (overrides) → Gateway G (*.com)

Expected behavior:

  • Request to a.toystore.com → RateLimitPolicy G will be enforced
  • Request to b.toystore.com → RateLimitPolicy G will be enforced
  • Request to other.toystore.com → RateLimitPolicy G will be enforced
  • Request to other.com (suppose a route exists) → RateLimitPolicy G will be enforced
  • Request to yet-another.net (suppose a route and gateway exist) → No RateLimitPolicy will be enforced

Limit definition

A limit will be activated whenever a request comes in and the request matches:

  • all of the when conditions specified in the limit.

A limit can define:

  • counters that are qualified based on dynamic values fetched from the request, or
  • global counters (implicitly, when no qualified counter is specified)

A limit is composed of one or more rate limits.

E.g.

spec:
  limits:
    "toystore-all":
      rates:
      - limit: 5000
        duration: 1
        unit: second

    "toystore-api-per-username":
      rates:
      - limit: 100
        duration: 1
        unit: second
      - limit: 1000
        duration: 1
        unit: minute
      counters:
      - auth.identity.username
      when:
      - selector: request.host
        operator: eq
        value: "api.toystore.com"

    "toystore-admin-unverified-users":
      rates:
      - limit: 250
        duration: 1
        unit: second
      when:
      - selector: request.host
        operator: eq
        value: "admin.toystore.com"
      - selector: auth.identity.email_verified
        operator: eq
        value: "false"
Request to Rate limits enforced
api.toystore.com 100rps/username or 1000rpm/username (whatever happens first)
admin.toystore.com 250rps
other.toystore.com 5000rps

when conditions

when conditions can be used to scope a limit (i.e. to filter the traffic to which a limit definition applies) without any coupling to the underlying network topology, i.e. without making direct references to HTTPRouteRules.

Use when conditions to conditionally activate limits based on attributes that cannot be expressed in the HTTPRoutes' spec.hostnames and spec.rules.matches fields, or in general in RateLimitPolicies that target a Gateway.

The selectors within the when conditions of a RateLimitPolicy are a subset of Kuadrant's Well-known Attributes (RFC 0002). Check out the reference for the full list of supported selectors.

Examples

Check out the following user guides for examples of rate limiting services with Kuadrant:

Known limitations

  • One HTTPRoute can only be targeted by one RateLimitPolicy.
  • One Gateway can only be targeted by one RateLimitPolicy.
  • RateLimitPolicies can only target HTTPRoutes/Gateways defined within the same namespace of the RateLimitPolicy.
  • 2+ RateLimitPolicies cannot target network resources that define/inherit the same exact hostname.

Limitation: Multiple network resources with identical hostnames

Kuadrant currently does not support multiple RateLimitPolicies simultaneously targeting network resources that declare identical hostnames. This includes multiple HTTPRoutes that specify the same hostnames in the spec.hostnames field, as well as HTTPRoutes that specify a hostname that is identical to a hostname specified in a listener of one of the route's parent gateways or HTTPRoutes that don't specify any hostname at all thus inheriting the hostnames from the parent gateways. In any of these cases, a maximum of one RateLimitPolicy targeting any of those resources that specify identical hostnames is allowed.

Moreover, having multiple resources that declare identical hostnames may lead to unexpected behavior and therefore should be avoided.

This limitation is rooted at the underlying components configured by Kuadrant for the implementation of its policies and the lack of information in the data plane regarding the exact route that honored by the API gateway in cases of conflicting hostnames.

To exemplify one way this limitation can impact deployments, consider the following topology:

                 ┌──────────────┐
                 │   Gateway    │
                 ├──────────────┤
          ┌─────►│ listeners:   │◄──────┐
          │      │ - host: *.io │       │
          │      └──────────────┘       │
          │                             │
          │                             │
┌─────────┴─────────┐        ┌──────────┴────────┐
│     HTTPRoute     │        │     HTTPRoute     │
│     (route-a)     │        │     (route-b)     │
├───────────────────┤        ├───────────────────┤
│ hostnames:        │        │ hostnames:        │
│ - app.io          │        │ - app.io          │
│ rules:            │        │ rules:            │
│ - matches:        │        │ - matches:        │
│   - path:         │        │   - path:         │
│       value: /foo │        │       value: /bar │
└───────────────────┘        └───────────────────┘
          ▲                            ▲
          │                            │
 ┌────────┴────────┐           ┌───────┴─────────┐
 │ RateLimitPolicy │           │ RateLimitPolicy │
 │   (policy-1)    │           │   (policy-2)    │
 └─────────────────┘           └─────────────────┘

In the example above, with the policy-1 resource created before policy-2, policy-2 will be enforced on all requests to app.io/bar while policy-1 will not be enforced at all. I.e. app.io/foo will not be rate-limited. Nevertheless, both policies will report status condition as Enforced.

Notice the enforcement of policy-2 and no enforcement of policy-1 is the opposite behavior as the analogous problem with the Kuadrant AuthPolicy.

A different way the limitation applies is when two or more routes of a gateway declare the exact same hostname and a gateway policy is defined with expectation to set default rules for the cases not covered by more specific policies. E.g.:

                                    ┌─────────────────┐
                         ┌──────────┤ RateLimitPolicy │
                         │          │    (policy-2)   │
                         ▼          └─────────────────┘
                 ┌──────────────┐
                 │   Gateway    │
                 ├──────────────┤
          ┌─────►│ listeners:   │◄──────┐
          │      │ - host: *.io │       │
          │      └──────────────┘       │
          │                             │
          │                             │
┌─────────┴─────────┐        ┌──────────┴────────┐
│     HTTPRoute     │        │     HTTPRoute     │
│     (route-a)     │        │     (route-b)     │
├───────────────────┤        ├───────────────────┤
│ hostnames:        │        │ hostnames:        │
│ - app.io          │        │ - app.io          │
│ rules:            │        │ rules:            │
│ - matches:        │        │ - matches:        │
│   - path:         │        │   - path:         │
│       value: /foo │        │       value: /bar │
└───────────────────┘        └───────────────────┘
          ▲
          │
 ┌────────┴────────┐
 │ RateLimitPolicy │
 │   (policy-1)    │
 └─────────────────┘

Once again, both policies will report status condition as Enforced. However, in this case, only policy-1 will be enforced on requests to app.io/foo, while policy-2 will not be enforced at all. I.e. app.io/bar will not be not rate-limited. This is same behavior as the analogous problem with the Kuadrant AuthPolicy.

To avoid these problems, use different hostnames in each route.

Implementation details

Driven by limitations related to how Istio injects configuration in the filter chains of the ingress gateways, Kuadrant relies on Envoy's Wasm Network filter in the data plane, to manage the integration with rate limiting service ("Limitador"), instead of the Rate Limit filter.

Motivation: Multiple rate limit domains
The first limitation comes from having only one filter chain per listener. This often leads to one single global rate limiting filter configuration per gateway, and therefore to a shared rate limit domain across applications and policies. Even though, in a rate limit filter, the triggering of rate limit calls, via actions to build so-called "descriptors", can be defined at the level of the virtual host and/or specific route rule, the overall rate limit configuration is only one, i.e., always the same rate limit domain for all calls to Limitador.

On the other hand, the possibility to configure and invoke the rate limit service for multiple domains depending on the context allows to isolate groups of policy rules, as well as to optimize performance in the rate limit service, which can rely on the domain for indexation.

Motivation: Fine-grained matching rules
A second limitation of configuring the rate limit filter via Istio, particularly from Gateway API resources, is that rate limit descriptors at the level of a specific HTTP route rule require "named routes" – defined only in an Istio VirtualService resource and referred in an EnvoyFilter one. Because Gateway API HTTPRoute rules lack a "name" property1, as well as the Istio VirtualService resources are only ephemeral data structures handled by Istio in-memory in its implementation of gateway configuration for Gateway API, where the names of individual route rules are auto-generated and not referable by users in a policy23, rate limiting by attributes of the HTTP request (e.g., path, method, headers, etc) would be very limited while depending only on Envoy's Rate Limit filter.

Motivated by the desire to support multiple rate limit domains per ingress gateway, as well as fine-grained HTTP route matching rules for rate limiting, Kuadrant implements a wasm-shim that handles the rules to invoke the rate limiting service, complying with Envoy's Rate Limit Service (RLS) protocol.

The wasm module integrates with the gateway in the data plane via Wasm Network filter, and parses a configuration composed out of user-defined RateLimitPolicy resources by the Kuadrant control plane. Whereas the rate limiting service ("Limitador") remains an implementation of Envoy's RLS protocol, capable of being integrated directly via Rate Limit extension or by Kuadrant, via wasm module for the Istio Gateway API implementation.

As a consequence of this design:

  • Users can define fine-grained rate limit rules that match their Gateway and HTTPRoute definitions including for subsections of these.
  • Rate limit definitions are insulated, not leaking across unrelated policies or applications.
  • Conditions to activate limits are evaluated in the context of the gateway process, reducing the gRPC calls to the external rate limiting service only to the cases where rate limit counters are known in advance to have to be checked/incremented.
  • The rate limiting service can rely on the indexation to look up for groups of limit definitions and counters.
  • Components remain compliant with industry protocols and flexible for different integration options.

A Kuadrant wasm-shim configuration for one RateLimitPolicy custom resources targeting a HTTPRoute looks like the following and it is generated automatically by the Kuadrant control plane:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  creationTimestamp: "2024-10-01T16:59:40Z"
  generation: 1
  name: kuadrant-kuadrant-ingressgateway
  namespace: gateway-system
  ownerReferences:
  - apiVersion: gateway.networking.k8s.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Gateway
    name: kuadrant-ingressgateway
    uid: 0298355b-fb30-4442-af2b-88d0c05bd2bd
  resourceVersion: "11253"
  uid: 36ef1fb7-9eca-46c7-af63-fe783f40148c
spec:
  phase: STATS
  pluginConfig:
    extensions:
      limitador:
        endpoint: kuadrant-rate-limiting-service
        failureMode: allow
        type: ratelimit
    policies:
    - hostnames:
      - '*.toystore.website'
      - '*.toystore.io'
      name: gateway-system/app-rlp
      rules:
      - actions:
        - data:
          - static:
              key: limit.toystore_assets_all_domains__b61ee8e6
              value: "1"
          extension: limitador
          scope: gateway-system/app-rlp
        conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /assets
          - operator: endswith
            selector: request.host
            value: .toystore.website
      - actions:
        - data:
          - static:
              key: limit.toystore_v1_website_unauthenticated__377837ee
              value: "1"
          extension: limitador
          scope: gateway-system/app-rlp
        conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /v1
          - operator: endswith
            selector: request.host
            value: .toystore.website
          - operator: eq
            selector: auth.identity.username
            value: ""
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /assets
          - operator: endswith
            selector: request.host
            value: .toystore.website
          - operator: eq
            selector: auth.identity.username
            value: ""
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: kuadrant-ingressgateway
  url: oci://quay.io/kuadrant/wasm-shim:latest

Footnotes

  1. https://github.com/kubernetes-sigs/gateway-api/pull/996

  2. https://github.com/istio/istio/issues/36790

  3. https://github.com/istio/istio/issues/37346