Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

survive bereaving #141

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

survive bereaving #141

wants to merge 1 commit into from

Conversation

fkluknav
Copy link

Hi,

sometimes it is useful to keep init alive when its direct child dies. For example when running a double-forking daemon or SysV init scripts.

This pull request adds a new option '--survive-bereaving' or '-b' that keeps dumb-init running. It checks /proc/ after SIGCHLD and quits only when there are no processes left.

Documentation is not updated, I do not feel confident enough in English.

Any comments are welcome.

Have a nice day,

Fero

@chriskuehl
Copy link
Contributor

Thanks for sending in this PR! This request has come up a couple of times, notably in #80, so I think there are definitely legitimate use-cases for it.

Sort of continuing my comment in that thread, it'd be cool if we could find a way to implement this that works even when dumb-init isn't PID 1, since it makes testing a lot easier (and also we use dumb-init in some non-PID-1 places). In that case, we'd need a way to list all child processes of dumb-init (rather than all processes on the system), and I don't know of a portable way to do that.

On Linux, we could read /proc/<pid>/tasks/<pid>/children, but to my knowledge this interface isn't available outside of Linux and the only other way is to iterate all processes on the system and read their parent PID (https://lwn.net/Articles/475688/ -- this also seems to be what pstree does).

I guess maybe we could also support this feature on Linux only? Do you know if there's a straightforward way to modify this change so that it only considers child PIDs when counting?

@fkluknav
Copy link
Author

I am not completely sure what is the desired behavior as non-pid1. Wait for all children, grandchildren... to exit and then exit dumb-init?
My use-case is the opposite, I want init to keep running as long as there is any process in the namespace, including processes that are not descendants of init but entered the namespace from elsewhere. To achieve that, init must wake at regular intervals (I do not know of a better way) - init will not receive sigchld for those processes.

What kind of portability are you trying to achieve? POSIX? In that case, the answer is 'ps' utility.

@Mathiasdm
Copy link

Can I help push this pull request forward somehow? I'd love to use this functionality (on Linux).

@chriskuehl chriskuehl self-requested a review September 22, 2017 18:02
@chriskuehl
Copy link
Contributor

@Mathiasdm thanks for the bump! I think this is a solid PR, I've got a couple minor comments I'll leave but overall I think we can merge this pretty soon.

Copy link
Contributor

@chriskuehl chriskuehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for sending this PR! We'll also want to update the documentation and add some tests -- I can help with those things too. Let me know your thoughts here, I'm also happy to help make some of these changes :)

" -c, --single-child Run in single-child mode.\n"
" In this mode, signals are only proxied to the\n"
" direct child and not any of its descendants.\n"
" -b, --survive-bereaving Do not quit when the direct child dies.\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be helpful to add another line to clarify further when it actually does quit. Maybe a line like "In this mode, dumb-init will quit when there are no other processes running on the host. This mode requires dumb-init to be running as PID 1."?

@@ -199,6 +257,9 @@ char **parse_command(int argc, char *argv[]) {
case 'r':
parse_rewrite_signum(optarg);
break;
case 'b':
survive_bereaving = 1;
break;
default:
exit(1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since survive_bereaving only works properly when dumb-init is running as PID 1, I think it'd be a good idea to add a check like:

if (survive_bereaving && getpid() != 1) {
    PRINTERR("--survive-bereaving is only valid when dumb-init is running as PID 1.\n");
    exit(1);
}

I think that would reduce confusion for anybody who tries to use the feature without being PID1.

@@ -68,6 +71,43 @@ void forward_signal(int signum) {
}

/*
* Read /proc and see if there are processes except init(PIDs)
*/
signed int process_count() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this isn't exactly returning the total process count, maybe it'd be better to rename this function to is_only_process_running?

signed int pc = process_count();
DEBUG("Process count: %d\n", pc);
if (pc <= 1) {
DEBUG("No process left, exitting.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo exitting -> exiting

{
while ((ep = readdir (dp)) != NULL) {
nonnumber = 0;
for (int i = 0; ep->d_name[i] != 0; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it'd be slightly more clear to write ep->d_name[i] != '\0'


dp = opendir ("/proc");
if (dp != NULL)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: formatting...

@kpengboy
Copy link
Contributor

Does PR_SET_CHILD_SUBREAPER work for the non PID-1 case?

Alternatively, if you're trying to monitor all processes in the namespace, you'll have to monitor /proc at regular intervals (not just when SIGCHLD is received) or write some hack with inotify...

@nvx
Copy link

nvx commented Jun 1, 2020

I just ran into an issue that this would have solved nicely. Is there anything needed before this can be merged or?

@odoucet
Copy link

odoucet commented Mar 26, 2021

current MR failed to compile with :

dumb-init.c: In function 'handle_signal':
dumb-init.c:170:21: error: 'exit_status' may be used uninitialized in this function [-Werror=maybe-uninitialized]
                 exit(exit_status);
                     ^
cc1: all warnings being treated as errors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants