-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graceful interrupts #119
Graceful interrupts #119
Conversation
The "spawn" method is already the default on Windows and will be the default on MacOS starting with Python 3.8. Switching to this method required a few tweaks since SafeQueue objects can't be pickled. There is still a lingering problem: since signal handlers are not inherited by the child, there is a gap between when the process starts and when the signal handler is installed where a SIGINT can cause an exception instead of a graceful stop.
There is a gap between when a job starts and when it installs the signal handler for SIGINT. It is actually quite common for Cozy to stop a job during the gap; the it happens while running the unit tests, for instance. The gap is an unfortunate necessity; Python does not offer a way to atomically start a new process with a given signal handler pre-installed without using the now-discouraged "fork" multiprocessing method. The gap is closed awkwardly: when a job is requested to stop, it only sends SIGINT if it has evidence that the child has already set itself up for receiving that signal. In the interim period it will only set the stop flag. The only problem with this solution is that any Z3 calls that happen before setting up the signal handler will not be interrupted---but fortunately, today, there aren't any.
This has a few subtle advantages: - The print can be captured by the test harness, so it doesn't muddy the test output. - The print is serialized with other main-thread prints, so there can't be any weird print-interleavings.
The solver will interrupt itself when the stop callback returns true, as the synthesis core procedures do today. The locations where the callback is checked were determined empirically; we could always add more checks later.
GitHub isn't showing the Travis build status for some reason, but something in this change isn't working there. I'm going to set up a Linux environment to see if I can reproduce the failure. |
I'm able to reproduce the failure, but it happens on master as well as this branch. There's a segfault happening somewhere;
My current guess is that the newest version of one of Cozy's dependencies is somehow broken. Hopefully adjusting |
The segfaults in Travis are unrelated to this change. I have opened #120 to track progress fixing the segfaults, and I'm putting this on hold until that's done. |
It turns out there were segfaults happening on both |
Previously the cost model would not always respond to stop requests promptly.
Z3 reports "unknown" instead of some more helpful result when it is interrupted. With this change, Cozy checks if it was interrupted during the Z3 call BEFORE it checks Z3's response. Cozy can then throw a more appropriate StopException if the unknown result was due to an interruption.
Thanks to @izgzhen for the suggestion.
(@izgzhen I will merge this tomorrow unless you have additional feedback.) |
It looks great to me! |
Cross-process unique names have been broken since #119.
Fixes #110.
This change is a bit more invasive than I was hoping for, but it definitely improves responsiveness. As an added benefit, Cozy performs a graceful stop on Ctrl+C instead of discarding its progress.