Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroup,native: ensure we start the cleaner before creating cgroup #93

Merged
merged 1 commit into from
Feb 6, 2024

Conversation

Snaipe
Copy link
Member

@Snaipe Snaipe commented Feb 5, 2024

Some callers of bst would send it a SIGKILL as soon as the underlying operation would get canceled. The problem is that it was possible to race the cgroup initialization code of the native driver, such that the SIGKILL would be received after mkdirat of the cgroup directory, but before the cgroup cleaner has any chance at starting.

This commit fixes this issue by reordering the operations. The cgroup cleaner is now started before mkdirat, and waits on a blocked pipe read until the parent process (which is the outer helper) dies. This ensures that the cleaner is started first and foremost, and that it waits until the cgroup has been initialized by the helper.

Some callers of bst would send it a SIGKILL as soon as the underlying
operation would get canceled. The problem is that it was possible to
race the cgroup initialization code of the native driver, such that the
SIGKILL would be received after mkdirat of the cgroup directory, but
before the cgroup cleaner has any chance at starting.

This commit fixes this issue by reordering the operations. The cgroup
cleaner is now started before mkdirat, and waits on a blocked pipe read
until the parent process (which is the outer helper) dies. This ensures
that the cleaner is started first and foremost, and that it waits until
the cgroup has been initialized by the helper.
switch (read(lock, &ok, sizeof (ok))) {
case -1:
warn("run_cgroup_child: read on lock");
goto lastDitchEffort;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really a preference for goto lastDistchEffort over return fallback_delete_cgroup ? I suppose the former keeps the scope quite a bit tighter since lastDitschEffort is only accessable from inside the function, but it feels anachronistic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it usually ends up being cleaner because the alternative is exfiltrating a bunch of local variables back to the parent, though in this case for one of the two goto labels, the caller has the parentfd.

It's probably one of the two legitimate uses of goto in C code. Thankfully, we have defer in Go 😛

@Snaipe Snaipe merged commit 3847b40 into aristanetworks:main Feb 6, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants