-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Linux 6.15 compat #17229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Linux 6.15 compat #17229
Conversation
The intent is that the filesystem may have a reference to an "old" version of the new directory, eg if it was keeping it alive because a remote NFS client still had it open. We don't need anything like that, so this really just changes things so we return error codes encoded in pointers. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
Renamed in 6.2, and the compat wrapper removed in 6.15. No signature or functional change apart from that, so a very minimal update for us. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
Linux 6.15 removes has_capability(), as nothing uses it internally. Unfortunately, there are no readily available alternatives available to us, forcing use of a different approach. Instead, we borrow the credential from the target process, swap it into the current process, then do a standard capability check for the current process. We revert the swap, release the cred, and return the chec result. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]>
return (0); | ||
const struct cred *c = get_task_cred(proc); | ||
const struct cred *old = override_creds(c); | ||
bool hascap = capable(CAP_SYS_ADMIN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since has_capability()
was just:
bool has_capability(struct task_struct *t, int cap)
{
return has_ns_capability(t, &init_user_ns, cap);
}
EXPORT_SYMBOL(has_capability);
before, can't we just use has_ns_capability()
in the same way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init_user_ns
is GPL'd. Could we use cr->user_ns
instead?
int
secpolicy_zfs_proc(const cred_t *cr, proc_t *proc)
{
if (!has_ns_capability(proc, cr->user_ns, CAP_SYS_ADMIN))
return (EACCES);
return (0);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_ns_capability
is not exported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(incidentally, if you did ever need init_user_ns
, I'm pretty sure you can always get it through init_task.init_cred.user_ns
).
Died overnight during the ZTS run with a hard panic:
Last log says it was in cleanup for |
Second run ran to completion, all successful. I'll do a few, see if I can run into it again. |
Motivation and Context
6.15-rc1 released, so off we go again.
Description
First two changes are benign: minor signature change for
iops->mkdir
, and a rename fordel_timer_sync
. Nothing to see here.The last one is more complicated, and I would like a subject-matter expert or two to have a look, because I'm not that and this is educated guesswork.
For background, since I had to go figure it out myself:
secpolicy_zfs_proc()
is used to ensure that a given process (or credential from it) is allowed to perform some operation. It's used by various async admin tasks (eg taking a snapshot), where the calling process (egzfs snapshot
in userspace) is not the one that actually performs the work (some kernel thread), and of particular interest, I think, is the use in channel programs, where we need to consider the permissions of thezfs program
process as different Lua functions are called.On Linux, most of the capability-checking functions check permissions for the current task, not some arbitrary task. We have used
has_capability()
, which can take a credential from a target process; exactly what we want. However, this function has been removed in 6.15, and there's no easy equivalent we can just drop in:has_capabaility_noaudit()
would drop in, but as it says, doesn't post an audit event, which isn't something I just want to get rid of unless we have no choicehas_ns_capability()
,security_capable()
and other low-level functions aren't exported to modulesfile_ns_capable()
is exported, but it's a quite awful to cook astruct file
with only astruct cred
insideSo, I've taken a slightly different approach. There's a set of APIs for swapping a
struct cred
into the current task, and reverting it. I'm now using this to get the cred from the target task, swap it in, do the check, swap it all back out and return. This appears to at least not break anything in the test suite, but I don't know to what extent it's actually exercised, so I'd like someone who knows this subsystem or has the ability to test properly to do so and report back.I don't love this, in part because of the warning in the kernel headers for
get_task_cred()
that "the caller must make sure task doesn't go away". However, it seems that if this is a problem, then it's already a problem, in that the stashed task we pass tosecpolicy_zfs_proc()
could have disappeared since the original call from userspace. I'm assuming that's not already happening for now.Incidentally, if this does pan out, it might be the way to change the API to only need the
cred_t
, like on FreeBSD, by callingprepare_kernel_cred()
wherever we would callCRED()
now. This creates astruct cred
explicitly designed to allow a kernel thread to do something on behalf of a userspace program. It does change thing a little, as we need a release function (the returned cred is heap-allocated), but it might allow us to not have to pass the task around with the cred. I'm not going to do this (for now) as I really don't know enough about it all, but there's a small project if someone wants it.Tiny spanner:
get_task_cred()
exists but was not exported in upstream 4.19 (though I'd be surprised if it were not backported to an EL8 kernel). Regardless, it is trivial to implement a wrapper, or to just fall back tohas_capability()
there. The reason I haven't kepthas_capability()
for <6.15 is because I would rather not have two entirely different methods on "active" kernel releases unless there's a real need. Too many places for bugs and quirks to hide otherwise.How Has This Been Tested?
Compiled checked on all LTS kernel series >= 4.19, and all 6.x kernel series.
Sanity run (create pool, fio, export, import, scrub) run on >= 5.10, completed successfully. (4.19 skipped for now, as above, and something weird booting 5.5 on my test VMs, unrelated)
ZTS now running on 6.15-rc1. I'll report back tomorrow!
Types of changes
Checklist:
Signed-off-by
.