[8.x] [Response Ops][Task Manager] Propagate `msearch` error status code so backpressure mechanism responds correctly (#197501) #198035

Merged 1 commit on Oct 28, 2024


kibanamachine
Contributor

Backport

This will backport the following commits from main to 8.x:

Questions?

Please refer to the Backport tool documentation

… backpressure mechanism responds correctly (elastic#197501)

Resolves elastic/response-ops-team#240

## Summary

Adds an `MsearchError` class that preserves the status code from any
msearch error. These errors are already piped to the managed
configuration observable, which watches for and responds to ES errors
from the update by query claim strategy, so its filter now also matches
msearch 429 and 503 errors.
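
For readers who want the shape of the change without opening the diff, here is a minimal sketch of an error class that carries a status code, plus a predicate that a filter could use. It is not the code from this PR: the constructor signature, the `statusCode` field name, and the `isBackpressureStatus` helper are assumptions made for illustration. Only the class name and the message text come from this description and the logs below.

```typescript
// Hypothetical sketch, not the implementation merged in this PR.
class MsearchError extends Error {
  // Assumed field name; the PR only states that the status code is preserved.
  public readonly statusCode?: number;

  constructor(statusCode?: number) {
    super(`Unexpected status code from taskStore::msearch: ${statusCode ?? 'unknown'}`);
    this.name = 'MsearchError';
    this.statusCode = statusCode;
    // Keep `instanceof` working when compiled to older JS targets.
    Object.setPrototypeOf(this, MsearchError.prototype);
  }
}

// Hypothetical helper: true for msearch errors that should trigger backpressure.
function isBackpressureStatus(err: unknown): boolean {
  return err instanceof MsearchError && (err.statusCode === 429 || err.statusCode === 503);
}
```

Carrying the code on the error object lets a downstream filter match on a number rather than parse the message string.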

## To Verify

1. Make sure you're using the mget claim strategy
(`xpack.task_manager.claim_strategy: 'mget'`) and start ES and Kibana.
2. Inject a 429 error into an msearch response.

```
--- a/x-pack/plugins/task_manager/server/task_store.ts
+++ b/x-pack/plugins/task_manager/server/task_store.ts
@@ -571,6 +571,8 @@ export class TaskStore {
     });
     const { responses } = result;

+    responses[0].status = 429;
+
     const versionMap = this.createVersionMap([]);
```

3. Confirm that task manager logs the msearch errors and eventually
reduces polling capacity.

```
[2024-10-23T15:35:59.255-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:35:59.756-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:36:00.257-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
[2024-10-23T15:36:00.757-04:00][ERROR][plugins.taskManager] Failed to poll for work: Unexpected status code from taskStore::msearch: 429
...

[2024-10-23T15:36:06.267-04:00][WARN ][plugins.taskManager] Poll interval configuration is temporarily increased after Elasticsearch returned 19 "too many request" and/or "execute [inline] script" error(s).
[2024-10-23T15:36:06.268-04:00][WARN ][plugins.taskManager] Capacity configuration is temporarily reduced after Elasticsearch returned 19 "too many request" and/or "execute [inline] script" error(s).
```
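
The two warnings above describe the backpressure response: once matching errors have been counted, the poll interval goes up and the capacity goes down. A rough sketch of that idea follows. The `PollingConfig` shape, the function name, and both scaling factors are assumptions chosen for illustration and should not be read as task manager's actual values or code.

```typescript
// Hypothetical sketch of an error-driven backpressure adjustment.
interface PollingConfig {
  capacity: number; // how many tasks may be claimed per poll
  pollIntervalMs: number; // delay between polls
}

// Illustrative factors only; assumed, not taken from this PR.
const CAPACITY_DECREASE_FACTOR = 0.8;
const POLL_INTERVAL_INCREASE_FACTOR = 1.2;

function applyBackpressure(config: PollingConfig, errorCount: number): PollingConfig {
  // No matching errors in the window: leave the configuration alone.
  if (errorCount <= 0) {
    return config;
  }
  return {
    // Never drop below one task per poll.
    capacity: Math.max(1, Math.floor(config.capacity * CAPACITY_DECREASE_FACTOR)),
    pollIntervalMs: Math.ceil(config.pollIntervalMs * POLL_INTERVAL_INCREASE_FACTOR),
  };
}
```

With this PR, msearch 429 and 503 errors count toward that adjustment in the same way as the update by query errors did before.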

---------

Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit 043e18b)
@kibanamachine kibanamachine merged commit f9c8127 into elastic:8.x Oct 28, 2024
34 checks passed
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

cc @ymao1
