[Fleet] cancel tasks when 3rd retry failed (elastic#147190)
## Summary
Related to elastic#144161
Found that when a bulk update tags task failed, the task didn't stop
after 3 retries (which should take less than a minute); the retries kept
happening for 2 hours.
This change removes the retry task if 3 retries are reached.
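
A minimal sketch of the intended behavior, not the actual Fleet code: the store class, function name, and outcome labels below are hypothetical, and the real tasks run asynchronously through Kibana Task Manager. It only illustrates that the retry task and its check task are removed once the 3rd retry fails or the action succeeds.

```typescript
// Hypothetical illustration; names do not come from the Fleet plugin.
const MAX_RETRY_COUNT = 3; // retry limit stated in this PR

type RetryOutcome = 'succeeded' | 'rescheduled' | 'cancelled';

// Stand-in for the task store; it only tracks which task ids still exist.
class InMemoryTaskStore {
  private readonly ids = new Set<string>();
  add(id: string): void {
    this.ids.add(id);
  }
  remove(id: string): void {
    this.ids.delete(id);
  }
  has(id: string): boolean {
    return this.ids.has(id);
  }
}

function runRetry(
  store: InMemoryTaskStore,
  retryTaskId: string,
  checkTaskId: string,
  retryCount: number,
  action: () => void
): RetryOutcome {
  try {
    action();
  } catch {
    if (retryCount >= MAX_RETRY_COUNT) {
      // Stop after the 3rd failed retry: remove the check task, then the retry task.
      store.remove(checkTaskId);
      store.remove(retryTaskId);
      return 'cancelled';
    }
    // Below the limit: leave both tasks in place so another retry can run.
    return 'rescheduled';
  }
  // Success: nothing left to retry, so clean up both tasks.
  store.remove(checkTaskId);
  store.remove(retryTaskId);
  return 'succeeded';
}
```

In this sketch a failing action with `retryCount` of 2 leaves both tasks in the store, while the same failure with `retryCount` of 3 removes them.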
Also testing in a cloud deployment to see if the tags error can be
reproduced with this fix.
I could reproduce the reported error locally, and it goes away
with this fix.
To verify:
- Add at least 50k agents with the `create_agents` script in the Kibana repo
- Open Kibana, select the 50k agents, and open Actions / Add tags
- Within a few seconds, add 2 new tags and remove one of them
- Wait about 30s; the agents should reflect the changes
- Check the logs to see that the tasks are removed after the 3rd retry is
reached or the task succeeds.
- Check that there are no more running tasks. Any running task can be
found in Kibana Console by running this query: `GET
.kibana_task_manager/_search?q=task.taskType:"fleet:update_agent_tags:retry"`
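
If the same check needs to be scripted outside Kibana Console, the request path could be built as in the sketch below. The helper names are hypothetical; the index name and task type come from the query above, and the response shape assumed is the standard Elasticsearch search response with `hits.total.value`.

```typescript
// Hypothetical helpers for scripting the leftover-task check.
const TASK_INDEX = '.kibana_task_manager';

// Builds the URL-encoded search path for a given task type.
function buildTaskSearchPath(taskType: string): string {
  const query = `task.taskType:"${taskType}"`;
  return `${TASK_INDEX}/_search?q=${encodeURIComponent(query)}`;
}

// Minimal slice of a search response; only the total hit count is read.
interface SearchResponse {
  hits: { total: { value: number } };
}

// Returns true when no task of the queried type is left.
function noTasksRemaining(response: SearchResponse): boolean {
  return response.hits.total.value === 0;
}
```

Sending a `GET` request to the resulting path against the Elasticsearch endpoint and passing the parsed JSON to `noTasksRemaining` would confirm that no retry tasks are left.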
Locally simulated an error to test that the retry (and check) task is
removed:
```
[2022-12-07T15:52:16.415+01:00][ERROR][plugins.fleet] Retry #3 of task fleet:update_agent_tags:retry:848984ab-c11d-4ebe-8d1f-606143dd656b failed: failing task
[2022-12-07T15:52:16.416+01:00][WARN ][plugins.fleet] Stopping after 3rd retry. Error: failing task
[2022-12-07T15:52:16.416+01:00][INFO ][plugins.fleet] Removing task fleet:update_agent_tags:retry:check:848984ab-c11d-4ebe-8d1f-606143dd656b
[2022-12-07T15:52:16.416+01:00][INFO ][plugins.fleet] Removing task fleet:update_agent_tags:retry:848984ab-c11d-4ebe-8d1f-606143dd656b
```