Greenboot does not fail when first error is found #143
Comments
I think the microshift scripts are part of microshift. The bug would need to be reported there, as those are the scripts that are taking the time. |
I opened a ticket with MicroShift and discussed this with Gregory Giguashvili previously. He felt this was a Greenboot feature enhancement: "Looking at https://github.com/fedora-iot/greenboot/blob/main/usr/libexec/greenboot/greenboot#L69, the greenboot check continues the loop until all scripts are run and only checks for the error condition at https://github.com/fedora-iot/greenboot/blob/main/usr/libexec/greenboot/greenboot#L78." |
Can you provide a link to the microshift ticket? The details provided to date don't give that context. Please also link to where the mentioned scripts can be found. |
Oh yes. I was not sure if you could view the ticket. Apologies. |
Seems like it should be an RFE to have an option to either 1) run all tests or 2) fail at the first failure. To quote the full comment: "Looking at https://github.com/fedora-iot/greenboot/blob/main/usr/libexec/greenboot/greenboot#L69, the greenboot check continues the loop until all scripts are run and only checks for the error condition at https://github.com/fedora-iot/greenboot/blob/main/usr/libexec/greenboot/greenboot#L78. May I suggest closing this JIRA issue and opening an upstream feature request at https://github.com/fedora-iot/greenboot/issues?" |
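To make the two options in this RFE concrete, here is a minimal sketch of the difference between "run all tests" and "fail at the first failure". It is not greenboot's actual code (greenboot is a shell script), and the `run_checks` function and its `fail_fast` parameter are hypothetical names used only for illustration.

```python
import subprocess
from pathlib import Path


def run_checks(script_dir, fail_fast=False):
    """Run every *.sh file in script_dir in name order.

    Returns the names of the scripts that exited non-zero.
    `fail_fast` is a hypothetical option, not an existing greenboot setting.
    """
    failed = []
    for script in sorted(Path(script_dir).glob("*.sh")):
        result = subprocess.run(["sh", str(script)])
        if result.returncode != 0:
            failed.append(script.name)
            if fail_fast:
                # Option 2: stop at the first failure.
                break
    # Option 1 (the behaviour described in the quoted comment): the loop
    # finishes first, and the failure is only acted on afterwards.
    return failed
```

With `fail_fast=False` every script still runs after a failure, so each one can consume its full timeout before the caller sees the result.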
It also looks like microshift is randomly copying/forking things, it would be useful if those could be actually sent back upstream rather than forking as it also makes it hard to know if the bug is upstream or in your fork :) |
@nullr0ute Do you need me to do anything to start an RFE ? |
@nullr0ute, let me comment on the issues you're raising.
The comment you're referring to in the JIRA ticket was our best guess on the current greenboot implementation. From the experience we have, I cannot think of a use-case when I would want to continue running greenboot scripts following a failure. It just delays the inevitable.
Note that the commit you're referring to is not a copy / fork of any greenboot upstream functionality. It's a bug fix in MicroShift internal scripts. These scripts are implementing MicroShift-specific health-check functionality and they cannot be used upstream in the generic greenboot code. |
This issue is observed in the MicroShift project. When multiple start scripts are on the system and an error is encountered in the first script, Greenboot does not fail immediately. With default settings, Greenboot runs all of the scripts before rebooting (in 4.16, there are three: 40_microshift_running_check.sh, 41_microshift_running_check_multus.sh, 50_microshift_running_check_olm.sh). This might be the intended design, but it delays the rollback to a known good state.
On the first boot, each script can take up to 5 minutes to check. That is ~15 minutes. On the second boot, this increases to 10 minutes per script. That is ~30 minutes. On the third boot, each script takes up to 15 minutes. That is ~45 minutes.
Can this be optimized in some way?
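As a rough worked example of the numbers above, the following sketch compares the worst-case wait per boot when all three scripts run against a hypothetical fail-fast mode. It assumes the scripts run one after another and that the first script is the one that fails. The per-script timeouts come from this report; the `fail_fast` flag is illustrative only.

```python
# Per-script timeout in minutes for each boot attempt, as reported above.
TIMEOUT_MIN = {1: 5, 2: 10, 3: 15}
SCRIPTS = 3  # number of MicroShift health-check scripts in 4.16


def worst_case_minutes(boot, fail_fast=False):
    """Worst-case minutes before greenboot can act on a failure.

    Assumes sequential execution and a failure in the first script.
    `fail_fast` is a hypothetical mode, not an existing greenboot option.
    """
    per_script = TIMEOUT_MIN[boot]
    return per_script if fail_fast else per_script * SCRIPTS


for boot in sorted(TIMEOUT_MIN):
    print(boot, worst_case_minutes(boot), worst_case_minutes(boot, fail_fast=True))
```

Under these assumptions, stopping at the first failure would cut the worst-case wait on each boot to a third of the current value.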