Greenboot should notify users of abrupt power failure #103
Comments
How would you suggest it detects this?
I would parse through journald and check for shutdown.target, reboot.target, or a related target to validate whether the system was shut down gracefully.
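A minimal sketch of that journal check, assuming the previous boot's journal has been exported as JSON lines (for example with `journalctl -b -1 -o json`) and that systemd's own messages about a unit carry a `UNIT` field. The function name and the set of targets are illustrative assumptions, not part of greenboot.

```python
import json

# Targets whose presence in the previous boot's journal suggests an orderly
# shutdown. This set is an assumption made for illustration.
GRACEFUL_TARGETS = {"shutdown.target", "reboot.target", "poweroff.target", "halt.target"}


def previous_boot_was_graceful(journal_json_lines):
    """Return True if any journal entry refers to a shutdown-related target."""
    for line in journal_json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparsable lines, e.g. a record cut off mid-write
        if not isinstance(entry, dict):
            continue
        unit = entry.get("UNIT")
        if isinstance(unit, str) and unit in GRACEFUL_TARGETS:
            return True
    return False
```

This only helps when the journal from the previous boot is still available, so it would need a persistent journal.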
We've had issues with trying to use the journal in our "control loops"; a very pertinent one related to this is that rhel8's See e.g. openshift/os#1271 Today rpm-ostree actually also does exactly this to detect if ostree-finalize-staged failed and there's a whole That said, recently I did ostreedev/ostree#2589, which is a bit related here. Arguably we could indeed extend things with a similar model where we persist "attempt to reboot with pending changes to apply" in a persistent non-journal place.
We also need to support systems that don't use a persistent journal. So in general, if it's critical, then it can't be in the journal and needs to be external to it. You're arguing for something informative, which could be in the journal, but it still gets tricky for the above reasons.
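One way to keep such state outside the journal is a "dirty flag" marker file: create it early in boot, remove it during an orderly shutdown, and treat a marker that is still present at the next boot as evidence of an abrupt stop. The sketch below is only an illustration of that technique; the function names and the marker location are hypothetical and are not taken from greenboot, ostree, or rpm-ostree.

```python
import os
from pathlib import Path


def _fsync_dir(directory: Path) -> None:
    """Flush directory metadata so marker creation or removal survives power loss."""
    fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def mark_boot_started(marker: Path) -> bool:
    """Call early in boot. Returns True if the previous boot left the marker behind."""
    unclean = marker.exists()
    marker.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(marker, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, b"boot in progress\n")
        os.fsync(fd)
    finally:
        os.close(fd)
    _fsync_dir(marker.parent)
    return unclean


def mark_clean_shutdown(marker: Path) -> None:
    """Call from something that runs during an orderly shutdown."""
    try:
        marker.unlink()
    except FileNotFoundError:
        pass
    if marker.parent.is_dir():
        _fsync_dir(marker.parent)
```

The marker path (hypothetical, for example somewhere under `/var/lib`) has to live on persistent storage, and the same pattern could record "reboot attempted with a staged deployment" instead of a plain boot flag.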
Here's a strawman proposal: what if we just merged the greenboot code as is into github.com/coreos/rpm-ostree? We'd make it a new subpackage; the RPM-level transition could either be that we start generating subpackages literally named the same things (possible AFAIK) or we make a new
For
I know we did something for rhel9/8 to enable this: osbuild/osbuild-composer#3118. I guess we can do that for Fedora too.
I want to understand how integrating greenboot into rpm-ostree will solve the above problem.
@say-paul I think Colin is referring to operating systems without a journal altogether, not to enabling persistence there.
Having worked on MCO and the journald thing, I agree it's not ideal and we can't use it; we'd definitely need something more robust. @cgwalters not sure, maybe I've missed it: after merging it into rpm-ostree, would the plan be to better integrate it with rpm-ostree?
My thoughts on this primarily to start are at the very practical level:
Beyond the "infrastructure" level, I really want to integrate greenboot state into For greenboot, I think the basic integration here would be showing when the current boot was the target of an automated rollback - and surfacing that in a consistent way via the same
That would indeed be ideal, and I think we had this discussion elsewhere too. Maybe in the future we could integrate other greenboot functionality into rpm-ostree too (mainly boot stuff, if I remember correctly). Infrastructure-wise, yeah, our integration tests aren't wired here; osbuild-composer "drives" them, and our QE team does too, which is not ideal. We indeed do not watch rpm-ostree closely. I'm not against this at all; I think it would be beneficial to keep advancing greenboot. There are still things to get right (for example, somebody changes something in /etc that breaks a greenboot check and no reboot happens, then a working upgrade comes but the greenboot check fails because of an issue unrelated to the upgrade). Let's see what others think too.
We started the integration conversation in the context of ostree - ostreedev/ostree#2725
Ultimately, I think having this in ostree does make the most sense, but at a practical level today the code is invoking rpm-ostree, and I was thinking of this as the "no code changes" move. We can still lower it into ostree later.
Greenboot should notify the user (via MOTD) on the next boot that an abrupt boot cycle was detected, for example after a power loss on the device or the force-kill of a VM.
Services that depend on shutdown targets may not have run correctly, which may cause issues in the next boot.
Example:
rpm-ostree update
The staged update is lost if there is a sudden power failure.
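A rough sketch of the requested notification, assuming some earlier check has already decided that the previous shutdown was abrupt. The function name, the file name `abrupt-shutdown`, and the message wording are hypothetical; the target directory is passed in by the caller (a directory such as `/run/motd.d`, which pam_motd reads, is one assumed choice).

```python
from pathlib import Path


def write_power_failure_notice(motd_dir: Path, staged_update_lost: bool) -> Path:
    """Write a login-time warning about an abrupt shutdown and return its path."""
    lines = [
        "Warning: the previous boot did not shut down cleanly "
        "(possible power loss or forced stop)."
    ]
    if staged_update_lost:
        # Covers the example in this issue: a staged rpm-ostree update that
        # never got finalized because shutdown did not run.
        lines.append("A staged update may have been lost; consider re-running the update.")
    motd_dir.mkdir(parents=True, exist_ok=True)
    notice = motd_dir / "abrupt-shutdown"  # hypothetical file name
    notice.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return notice
```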