-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that efivarfs is mounted in the container #302
Conversation
Hi @bcrochet. Thanks for your PR. I'm waiting for a containers member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
lib/src/install.rs
Outdated
tracing::debug!("Setting up sys mounts"); | ||
// First of all, does efivars even exist in the host? If not, we are | ||
// not dealing with an EFI system | ||
let _ = rustix::fs::lstat("/proc/1/root/sys/firmware/efi/efivars")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause us to hard fail though if it doesn't exist, which is not what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Forgot to test that case on a BIOS system.
lib/src/install.rs
Outdated
let _ = rustix::fs::lstat("/proc/1/root/sys/firmware/efi/efivars")?; | ||
|
||
// Now, let's find out if it's populated | ||
let od = rustix::fs::open( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far typically here we use std::fs
unless there's a reason to use rustix
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
lib/src/install.rs
Outdated
// Now, let's find out if it's populated | ||
let od = rustix::fs::open( | ||
"/proc/1/root/sys/firmware/efi/efivars", | ||
rustix::fs::OFlags::DIRECTORY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, this is missing OFlags::CLOEXEC
which is generally a good idea. While this codebase isn't security sensitive in general, see e.g. https://github.com/opencontainers/runc/releases/tag/v1.1.12 which was a bad CVE in runc
from leaking fds and handling them incorrectly.
lib/src/install.rs
Outdated
let dir_populated = rustix::fs::Dir::read_from(od)?.next().is_some(); | ||
if dir_populated { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combining the above I think something just like:
let efivars = "/proc/1/root/sys/firmware/efi/efivars";
if !Path::new(efivars).try_exists()? {
return Ok(());
}
if std::fs::read_dir(efivars)?.next().is_none() {
return Ok(());
}
...rest of code
or so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is https://docs.rs/cap-std-ext/latest/cap_std_ext/dirext/trait.CapStdExtDirExt.html#tymethod.open_dir_optional which we use in a few places in this project, and is intended for this use case.
lib/src/install.rs
Outdated
// This looks to be a workaround for a problem elsewhere. Not sure if | ||
// it's in podman, but just attempting to mount this is enough to trigger | ||
// it being populated even when there is an existing mount. | ||
// Given that, we will just try to mount and ignore the result. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand the rationale behind ignoring the error. Is it erroring out now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will double-check tomorrow. But what I think I was seeing today was that when bootc install to-filesystem --replace=alongside /target
was run... On the first invocation /sys/firmware/efi/efivars was unpopulated, but {/proc/1/root,/target}/sys/firmware/efi/efivars are populated, and the mount command fails, saying "efivarfs already mounted at /target/sys/firmware/efi/efivars".
But once that was run, and failed out, "magically" /sys/firmware/efi/efivars was now populated, and the bootc run would complete without error.
I'm totally game with working on this further to make it more resilient. I also think this is some kind of bug elsewhere. I'm just not sure where to point the finger.
So, my rationale here was that it didn't matter if the mount command itself succeeded, but just that it was run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I did verify that adding -v /sys:/sys:rslave
to our podman run
invocation gets us the efivars mount (as expected) alongside others that we also currently mount manually like selinuxfs
.
and the mount command fails, saying "efivarfs already mounted at /target/sys/firmware/efi/efivars".
That's weird indeed. In a quick test I can't reproduce that with a simple
$ podman run --rm -ti --privileged --pid=host quay.io/fedora/fedora:39
# mount -t efivarfs efivars /sys/firmware/efi/efivars
Probably worth trying to dig into what's different about what we're doing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proper direction may be then to just do a doc change for this, while we investigate the origin of the error.
Is it perhaps ARM specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think the other key is that -v /:/target
is used. As efivarfs will be mounted in /target.
And of course I'm now unable to reproduce this scenario. 🙄
I'll update this patch with all of the suggestions and stop ignoring this error, and we can go from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think the other key is that -v /:/target is used. As efivarfs will be mounted in /target.
Ahh yes, tricky. In the case that we have the full target rootfs, we could in our container simply do e.g. mount --rbind /target/sys /sys
or so.
Actually for this "alongside" case it would definitely help there if we still didn't need to pass in the -v /:/target
even and just got the root filesystem by jumping out of the mount namespace.
/ok-to-test |
7e12123
to
cf1ff27
Compare
let inspect = crate::mount::inspect_filesystem(path); | ||
if inspect.is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK as is, but is_ok()
is a pattern to try to avoid as a general rule because it "swallows" all forms of errors (if e.g. the findmnt
binary failed to be executed, not just if the target isn't a mountpoint).
Unfortunately detecting if something is a mountpoint is surprisingly tricky (we use a helper ostree_ext::mountutil::is_mountpoint
in other places here). It looks like we could extend our findmnt
wrapper to support passing -T/--target
, which would always succeed (in general), and then we can check if the filesystem type is efivarfs
.
lib/src/install.rs
Outdated
} | ||
|
||
// This means the host has this mounted, so we should mount it too | ||
tracing::debug!("mounting efivarfs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor nit, I doubt we need this debug invocation because Task::new_and_run
will already show something like this with the tracing::debug!("exec: {cmd:?}");
#[context("Ensuring sys mounts")] | ||
pub(crate) fn setup_sys_mounts() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense as a name, though let's add some commentary here, something like:
/// By default, podman/docker etc. when passed `--privileged` mount `/sys` as read-only, but non-recursively. We selectively grab sub-filesystems that we need.
I think your naming is right though because it'd make sense to also handle selinuxfs
exactly this same way. (Don't need to do it in this PR though)
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK as is, we can do other fixes as followups. Does this solve the problem in your testing?
So far, yes. I'll do the cleanups suggested. |
Especially on ARM, which utilizes UEFI for booting in most cases, it is important that the /sys/firmware/efi/efivars be mounted and populated, otherwise bootc will fail to complete a to-filesystem installation. This patch attempts a mount as long as the hosts efivars directory has an entry, signifying the system is at least capable of UEFI. Note that it is sufficient to just attempt to mount efivars. If it's already mounted elsewhere, it triggers the mount to be made at the /sys location. Fixes containers#291 Signed-off-by: Brad P. Crochet <[email protected]>
As a follow-on to containers#302, we want to also mount the selinuxfs special filesystem if the host also has that filesystem mounted. Related containers#303 Signed-off-by: Brad P. Crochet <[email protected]>
As a follow-on to containers#302, we want to also mount the selinuxfs special filesystem if the host also has that filesystem mounted. Related containers#303 Signed-off-by: Brad P. Crochet <[email protected]>
As a follow-on to containers#302, we want to also mount the selinuxfs special filesystem if the host also has that filesystem mounted. Related containers#303 Signed-off-by: Brad P. Crochet <[email protected]>
Especially on ARM, which utilizes UEFI for booting in most cases, it is important that the /sys/firmware/efi/efivars be mounted and populated, otherwise bootc will fail to complete a to-filesystem installation.
This patch attempts a mount as long as the hosts efivars directory has an entry, signifying the system is at least capable of UEFI.
Note that it is sufficient to just attempt to mount efivars. If it's already mounted elsewhere, it triggers the mount to be made at the /sys location.
Fixes #291