fork() in fdserver is problematic #563

JannePeltonen · 2018-04-16T14:31:23Z

platform/linux-generic/odp_fdserver.c forks a child process to manage shared memory. This is done to make shared memory work even if ODP threads are actually OS processes. The fork happens inside odp_init_global() and is not followed by exec.

Doing fork in a multithreaded program is somewhat problematic (see fork(2) and http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them). In particular, the child process should call only async-signal-safe functions until exec. The fdserver implementation is not adhering to this rule and can (at least theoretically) e.g. get stuck waiting indefinitely a mutex that was held by another thread at fork time and will never be freed in the child.

One possible fix would be to add such a restriction to ODP API that the calling process must be single threaded when it calls odp_init_global(). Another would be to have fdserver exec the actual server after the fork.

Bill-Fischofer-Linaro · 2018-04-16T14:49:11Z

Thanks for capturing this issue. It sounds like exec would be the preferred (safest) way to go, but would that require that fdserver be built as a separate module? That would clearly have implications for ODP builds/distribution that would need to be worked out.

JannePeltonen · 2018-04-16T15:20:02Z

Yes, the exec approach would seem to require a separate fdserver executable somewhere (since ODP is a library, it cannot e.g. execve("/proc/self/exe", ...)).

Yet another possibility might be to not fork but spawn a thread when ODP threads are OS threads and not OS processes (but how would ODP know?), although in that case I suppose such a separate thread would not even be needed.

Bill-Fischofer-Linaro · 2018-05-11T14:24:35Z

@JannePeltonen Is this still an issue now that fdserver no longer interferes with signal handling by the application?

JannePeltonen · 2018-05-14T09:33:47Z

Yes, it is still a (potential) issue. It is not related to the signal handling issue but about how at fork() all the other threads magically disappear (from child process point-of-view), regardless of their state.

For example, if another thread (a worker thread or some application thread) is executing a critical section and holds a lock inside some library (like C-library), the lock will never be freed. If the child process attempts to use the library and acquire the lock, it will essentially deadlock.

I have not observed this problem in practice, but just by looking at the code.

joseppc · 2018-05-14T12:08:37Z

I took the fdserver code and made it an executable so it could be execve() in odp. It is in its own repo but it could be plugged back into ODP without much effort...

Bill-Fischofer-Linaro · 2018-07-04T21:14:35Z

@JannePeltonen Should this issue be closed for now or is it worthwhile to keep it open?

JannePeltonen · 2018-07-06T13:12:11Z

@Bill-Fischofer-Linaro I leave it to others whether this issue should be kept open or not.

The potential problem still exists, since AFAIK nothing has been done about it in ODP code base. One idea we discussed in one arch meeting was to "fix" the issue for now by documenting that odp_init_global() should be called early when the application is still single-threaded.

Refactor existing FD server by moving the server functionality to a separate process image and launch it with `exec()` after the current forking step has been done. This way typical issues related to forking a multithreaded program are avoided. Fixes OpenDataPlane#563. Signed-off-by: Tuomas Taipale <[email protected]>

joseppc mentioned this issue Jun 21, 2018

[PATCH v1] [RFC] fdserver executable #628

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fork() in fdserver is problematic #563

fork() in fdserver is problematic #563

JannePeltonen commented Apr 16, 2018

Bill-Fischofer-Linaro commented Apr 16, 2018

JannePeltonen commented Apr 16, 2018

Bill-Fischofer-Linaro commented May 11, 2018

JannePeltonen commented May 14, 2018

joseppc commented May 14, 2018

Bill-Fischofer-Linaro commented Jul 4, 2018

JannePeltonen commented Jul 6, 2018

fork() in fdserver is problematic #563

fork() in fdserver is problematic #563

Comments

JannePeltonen commented Apr 16, 2018

Bill-Fischofer-Linaro commented Apr 16, 2018

JannePeltonen commented Apr 16, 2018

Bill-Fischofer-Linaro commented May 11, 2018

JannePeltonen commented May 14, 2018

joseppc commented May 14, 2018

Bill-Fischofer-Linaro commented Jul 4, 2018

JannePeltonen commented Jul 6, 2018