Increase the Windows OnFailureDelayDuration delay to 15s #3657
This matches the value that endpoint uses and helps mitigate bugs where agent unexpectedly restarts during a system shutdown.
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
buildkite test it
LGTM
…rvice-restarts-on-failure-to-15s-on-Windows.yaml Co-authored-by: Paolo Chilà <[email protected]>
Force merging, test failure is unrelated. #3657
* Update Windows OnFailureDelayDuration to 15s. This matches the value that endpoint uses and helps mitigate bugs where agent unexpectedly restarts during a system shutdown.
* Add changelog.
* Update changelog/fragments/1698259940-Increase-wait-period-between-service-restarts-on-failure-to-15s-on-Windows.yaml
Co-authored-by: Paolo Chilà <[email protected]>
---------
Co-authored-by: Paolo Chilà <[email protected]>
(cherry picked from commit 910d17b)

* Update Windows OnFailureDelayDuration to 15s. This matches the value that endpoint uses and helps mitigate bugs where agent unexpectedly restarts during a system shutdown.
* Add changelog.
* Update changelog/fragments/1698259940-Increase-wait-period-between-service-restarts-on-failure-to-15s-on-Windows.yaml
Co-authored-by: Paolo Chilà <[email protected]>
---------
Co-authored-by: Paolo Chilà <[email protected]>
(cherry picked from commit 910d17b)
Co-authored-by: Craig MacKenzie <[email protected]>
Increase the Windows OnFailureDelayDuration delay to 15s. This is the delay before the service is restarted when it exits unexpectedly. This is the same value used by endpoint-security by default.

Note that this change only applies to new agent installations. We would need to add code to migrate existing agent installations to the new value.
It was originally suggested in #3307 (comment) that we increase this to 30s+, but I am confident the root cause of that problem was addressed by elastic/elastic-agent-libs#155.
Regardless, I don't think our current default for this value was chosen for any particular reason, and we can at least have agent behave consistently with endpoint. According to @bjmcnic, a 15s delay would also mitigate the original problem in all but the most extreme cases, even if it were to go unfixed.