
[Security Solution] Pagination is broken in the alerts table #201913

Open
MadameSheema opened this issue Nov 27, 2024 · 7 comments
Labels
bug Fixes for quality problems that affect the customer experience Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) triage_needed

Comments

@MadameSheema
Member

Describe the bug:

  • Pagination is broken in the alerts table

Kibana/Elasticsearch Stack version:
8.17.0 - BC1

Initial setup:

  • Have a large number of alerts generated. In my case, 12,187 alerts.

Steps to reproduce:

  1. Navigate to the alerts page
  2. Click the last pagination number, in my case, 244.

Current behavior:

An error is displayed:

Result window is too large, from + size must be less than or equal to: [10000] but was [12200]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.

  • The alerts are not displayed

Expected behavior:

  • No error should be displayed
  • Alerts should be displayed
@MadameSheema MadameSheema added bug Fixes for quality problems that affect the customer experience Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Threat Hunting Security Solution Threat Hunting Team Team:Threat Hunting:Investigations Security Solution Investigations Team triage_needed labels Nov 27, 2024
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Contributor

Pinging @elastic/security-threat-hunting (Team:Threat Hunting)

@elasticmachine
Contributor

Pinging @elastic/security-threat-hunting-investigations (Team:Threat Hunting:Investigations)

@PhilippeOberti
Contributor

PhilippeOberti commented Nov 27, 2024

This is related to the 10k limit that ES has. It happens not only on the last page, but on the first page past 10k elements in the table. For example, in the video below I had 100 elements per page, and as soon as I reached page 101 the error appeared.

[video attachment: Screen.Recording.2024-11-27.at.4.18.24.PM.mov]
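
For illustration, here's a minimal sketch of the arithmetic behind that failure, assuming the table translates pageIndex/pageSize into an ES from/size request in the usual way (the variable names here are hypothetical):

// Why page 101 at 100 rows per page trips the 10k window.
// pageIndex is zero-based, so the UI's "page 101" is pageIndex 100.
const pageIndex = 100;
const pageSize = 100;
const from = pageIndex * pageSize; // 10000
const size = pageSize; // 100
// ES rejects the request once from + size exceeds index.max_result_window (10000 by default):
console.log(from + size); // 10100 -> "Result window is too large ... but was [10100]"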

@PhilippeOberti
Contributor

Here's the payload of the call being made:

{
  "featureIds": [
    "siem"
  ],
  "fields": [
    {
      "field": "@timestamp",
      "include_unmapped": true
    },
    {
      "field": "kibana.alert.rule.name",
      "include_unmapped": true
    },
    {
      "field": "kibana.alert.workflow_assignee_ids",
      "include_unmapped": true
    },
    {
      "field": "kibana.alert.severity",
      "include_unmapped": true
    },
    {
      "field": "kibana.alert.risk_score",
      "include_unmapped": true
    },
    {
      "field": "kibana.alert.reason",
      "include_unmapped": true
    },
    {
      "field": "host.name",
      "include_unmapped": true
    },
    {
      "field": "user.name",
      "include_unmapped": true
    },
    {
      "field": "host.risk.calculated_level",
      "include_unmapped": true
    },
    {
      "field": "user.risk.calculated_level",
      "include_unmapped": true
    },
    {
      "field": "host.asset.criticality",
      "include_unmapped": true
    },
    {
      "field": "user.asset.criticality",
      "include_unmapped": true
    },
    {
      "field": "process.name",
      "include_unmapped": true
    },
    {
      "field": "file.name",
      "include_unmapped": true
    },
    {
      "field": "source.ip",
      "include_unmapped": true
    },
    {
      "field": "destination.ip",
      "include_unmapped": true
    }
  ],
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [],
          "filter": [
            {
              "match_phrase": {
                "kibana.alert.workflow_status": "open"
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": "2024-11-27T06:00:00.000Z",
                  "lte": "2024-11-28T05:59:59.999Z",
                  "format": "strict_date_optional_time"
                }
              }
            }
          ],
          "should": [],
          "must_not": [
            {
              "exists": {
                "field": "kibana.alert.building_block_type"
              }
            }
          ]
        }
      }
    }
  },
  "pagination": {
    "pageIndex": 100,
    "pageSize": 100
  },
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ],
  "runtimeMappings": {},
  "isSearchStored": false,
  "stream": false
}

And here's the error coming back from the backend:

{
    "statusCode": 400,
    "error": "Bad Request",
    "message": "status_exception\n\tCaused by:\n\t\tsearch_phase_execution_exception: all shards failed",
    "attributes": {
        "error": {
            "type": "status_exception",
            "reason": "error while executing search",
            "caused_by": {
                "type": "search_phase_execution_exception",
                "reason": "all shards failed",
                "phase": "query",
                "grouped": true,
                "failed_shards": [
                    {
                        "shard": 0,
                        "index": ".internal.alerts-security.alerts-default-000001",
                        "node": "nDFzgFmYRvmk4IQhJ4zftw",
                        "reason": {
                            "type": "illegal_argument_exception",
                            "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
                        }
                    }
                ],
                "caused_by": {
                    "type": "illegal_argument_exception",
                    "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.",
                    "caused_by": {
                        "type": "illegal_argument_exception",
                        "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
                    }
                }
            }
        },
        "rawResponse": {
            "took": 15,
            "timed_out": false,
            "terminated_early": false,
            "num_reduce_phases": 0,
            "_shards": {
                "total": 1,
                "successful": 0,
                "skipped": 0,
                "failed": 1,
                "failures": [
                    {
                        "shard": 0,
                        "index": ".internal.alerts-security.alerts-default-000001",
                        "node": "nDFzgFmYRvmk4IQhJ4zftw",
                        "reason": {
                            "type": "illegal_argument_exception",
                            "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [10100]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
                        }
                    }
                ]
            },
            "hits": {
                "total": {
                    "value": 0,
                    "relation": "gte"
                },
                "max_score": null,
                "hits": []
            }
        },
        "requestParams": {
            "method": "POST",
            "path": "/.alerts-security.alerts-default/_async_search",
            "querystring": "batched_reduce_size=64&ccs_minimize_roundtrips=true&wait_for_completion_timeout=200ms&keep_on_completion=false&keep_alive=60000ms&ignore_unavailable=true&allow_no_indices=true"
        }
    }
}
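
As the error message itself points out, the window is controlled by the index.max_result_window index setting. Here is a minimal sketch of raising it with the Elasticsearch JS client, purely for illustration (the index name is copied from the failure above; raising the window increases heap pressure on deep pages, so this is a workaround rather than a fix):

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function raiseResultWindow() {
  // Allow from + size up to 20000 on the alerts index.
  await client.indices.putSettings({
    index: '.internal.alerts-security.alerts-default-000001',
    settings: { 'index.max_result_window': 20000 },
  });
}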

@logeekal
Contributor

logeekal commented Nov 29, 2024

> This is related to the 10k limit that ES has. It happens not only on the last page, but on the first page past 10k elements in the table. For example, in the video below I had 100 elements per page, and as soon as I reached page 101 the error appeared.

@PhilippeOberti, I agree that it is an ES limit, but ES also provides alternatives to avoid this problem.

I think an alternative could be to cap the results at 10,000 instead of showing an error.

@elastic/response-ops team, I think this will affect all consumers of the alerts table because of how privateRuleRegistryAlertsSearchStrategy works. Should we plan to support the scroll API in privateRuleRegistryAlertsSearchStrategy?
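
For reference, the alternative ES recommends for deep pagination is search_after, which pages with a sort cursor instead of an ever-growing from. A rough sketch (not the actual search strategy code; the index name and tiebreaker field are assumptions):

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Walk the alerts page by page with search_after: each request passes the
// sort values of the previous page's last hit instead of a growing `from`.
async function walkAlerts(pageSize = 100) {
  let cursor: any[] | undefined;
  while (true) {
    const res = await client.search({
      index: '.alerts-security.alerts-default',
      size: pageSize,
      // A unique tiebreaker field keeps the cursor deterministic.
      sort: [{ '@timestamp': 'desc' }, { 'kibana.alert.uuid': 'asc' }],
      ...(cursor ? { search_after: cursor } : {}),
    });
    const hits = res.hits.hits;
    if (hits.length === 0) break;
    // ...hand the page to the table...
    cursor = hits[hits.length - 1].sort;
  }
}

Note that this only supports stepping forward through results, which is why it does not map cleanly onto the table's numbered pagination (as the next comment points out).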

@PhilippeOberti PhilippeOberti added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) and removed Team:Threat Hunting Security Solution Threat Hunting Team Team:Threat Hunting:Investigations Security Solution Investigations Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. labels Dec 3, 2024
@cnasikas
Copy link
Member

cnasikas commented Dec 18, 2024

Hey all. Sorry for the late reply. I agree that we should cap the results at 10K instead of showing an error. I would suggest not using the Scroll API or the search_after API, as ES does not recommend the first and the second does not work with numbered pagination (you cannot fetch results by page/perPage). We can show users a warning banner that only 10K alerts are being shown, and that if they want to view more they should narrow their search criteria. We follow this pattern in cases and in the rule execution log.
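
A minimal sketch of that capping approach, assuming the search strategy clamps the requested page before building the ES request and reports the truncation to the UI (the names here are hypothetical):

const MAX_RESULT_WINDOW = 10000;

interface Pagination {
  pageIndex: number;
  pageSize: number;
}

// Clamp pageIndex so from + size never exceeds the 10k window, and report
// whether we truncated so the UI can show the "only 10K alerts shown" banner.
function clampPagination({ pageIndex, pageSize }: Pagination) {
  const lastAllowedPageIndex = Math.max(0, Math.floor(MAX_RESULT_WINDOW / pageSize) - 1);
  const clampedPageIndex = Math.min(pageIndex, lastAllowedPageIndex);
  return {
    from: clampedPageIndex * pageSize,
    size: pageSize,
    truncated: clampedPageIndex !== pageIndex,
  };
}

// e.g. clampPagination({ pageIndex: 100, pageSize: 100 })
//   -> { from: 9900, size: 100, truncated: true }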
