Care about children of supervised processes #33

Open
wants to merge 6 commits into
base: master


@tecki tecki commented May 8, 2017

Daemons that fork their own children have always been a big problem for supervision: when the daemon dies, its orphaned children are out of the supervisor's sight. This was especially troublesome with classical Unix daemons that fork immediately. That is what fghack is for, but as the name suggests it is only a hack, and it does not address daemons that quite legitimately spawn children.

The root of the problem used to lie in the operating system itself: Unix systems offered no way to keep orphans within reach of their ancestor, and apparently POSIX is silent on the matter.

Recently, both Linux and FreeBSD (as well as DragonFly) have introduced a solution to this problem: subprocess reapers. When a process dies, its children are adopted by the nearest ancestor that has registered as a subprocess reaper, rather than by init. supervise is the ideal candidate to be such a subprocess reaper.
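To make the mechanism concrete, here is a minimal sketch of how a process can register as a subprocess reaper and then collect an orphan. It is not the code from this pull request; `become_subreaper` and `reap_orphan_demo` are hypothetical names chosen for illustration. The sketch assumes `prctl(PR_SET_CHILD_SUBREAPER)` on Linux and `procctl(PROC_REAP_ACQUIRE)` on FreeBSD and DragonFly, and reports failure everywhere else.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__linux__)
#include <sys/prctl.h>
#elif defined(__FreeBSD__) || defined(__DragonFly__)
#include <sys/procctl.h>
#endif

/* Hypothetical helper: register the calling process as a subprocess reaper.
 * Returns 0 on success, -1 on failure or on platforms without support. */
int become_subreaper(void)
{
#if defined(__linux__) && defined(PR_SET_CHILD_SUBREAPER)
    return prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0);
#elif (defined(__FreeBSD__) || defined(__DragonFly__)) && defined(PROC_REAP_ACQUIRE)
    return procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL);
#else
    errno = ENOSYS;
    return -1;
#endif
}

/* Hypothetical demo: fork a short-lived "daemon" that leaves one orphan
 * behind. Returns the orphan's exit status as seen by the caller, or -1
 * if the orphan could not be reaped (for example, no reaper support). */
int reap_orphan_demo(void)
{
    pid_t daemon_pid = fork();
    if (daemon_pid < 0)
        return -1;
    if (daemon_pid == 0) {
        pid_t orphan = fork();
        if (orphan == 0) {
            sleep(1);           /* outlive the daemon */
            _exit(7);
        }
        _exit(orphan < 0 ? 1 : 0);  /* the daemon dies right away */
    }

    int status;
    if (waitpid(daemon_pid, &status, 0) < 0)
        return -1;

    /* The daemon is gone. If we are a reaper, its orphan is now our child. */
    pid_t pid;
    while ((pid = waitpid(-1, &status, 0)) < 0 && errno == EINTR)
        ;
    if (pid < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Without the call to `become_subreaper()`, the second `waitpid` fails with `ECHILD`, because the orphan is reparented to init (or to some other reaper further up) instead of to the caller.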

This pull request adds a new flag file, orphanage, to service directories. When it is present and the supervised daemon dies, supervise enters an orphanage state: it waits until all remaining children have died (I am not aware of a simple way to kill all orphans). Only once that is done is the service restarted, if so requested.

This works well in combination with supervise creating a new process group: a signal sent to that group reaches all children and can terminate them.
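As an illustration of the process group approach, the sketch below starts a service in its own group and later signals the whole group with `kill(-pgid, sig)`. The names `spawn_in_new_group`, `demo_service` and `group_kill_demo` are hypothetical, and the sketch makes no claim about how supervise actually sets up its process group. Note that a child that moves itself to a different group or session escapes this signal.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `service` in a child that leads its own
 * process group. Returns the group id (equal to the child's pid) or -1. */
pid_t spawn_in_new_group(void (*service)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);          /* become leader of a new group */
        service();
        _exit(0);
    }
    setpgid(pid, pid);          /* same call from the parent closes the race */
    return pid;
}

/* Stand-in for a daemon that spawns a helper; both just wait for a signal. */
static void demo_service(void)
{
    fork();                     /* the helper inherits the process group */
    for (;;)
        pause();
}

/* Hypothetical demo: terminate the whole group with one signal. Returns
 * the signal that killed the direct child, or -1 on error. */
int group_kill_demo(void)
{
    pid_t pgid = spawn_in_new_group(demo_service);
    if (pgid < 0)
        return -1;
    if (kill(-pgid, SIGTERM) < 0)   /* negative pid addresses the group */
        return -1;

    int status;
    if (waitpid(pgid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Combined with a subprocess reaper, the helper killed here would also be handed to the supervisor for reaping rather than to init.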

I consider the new subprocess reaper functionality in Linux and FreeBSD a huge step forward in the field of process supervision, and I think it should be added to daemontools-encore even if it cannot be supported on all platforms. On unsupported platforms everything stays as it is, since the orphanage option is purely an add-on.

tecki added 6 commits May 8, 2017 10:19
when the service directory contains a file "orphanage", we adopt all
children of the supervised process when it dies, and wait until they
all have died as well.
every hour the status was written again, resetting the down timer.