You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Bug description
Whew, that's a mouthful! In this one very specific set of circumstances—but they can easily arise by accident—we end up with a Debugger open even though we asked to just run the test (with errors leading to a failure).
To Reproduce
Steps to reproduce the behavior:
Save the following method (it doesn't actually matter what test class it's on):
TestCaseTest>>testBugExceptionInBackgroundEnsureHangs
| backgroundProc |
backgroundProc := [ [ ] ensure: [ self error: 'foo' ] ]
forkAt: Processor userBackgroundPriority.
backgroundProc name: 'Background process'.
"Wait for the error to actually occur in the background process, at which point the test framework will suspend it"
(Delay forMilliseconds: 100) wait
Run the test—not debug, so either run all tests for the class, or evaluate (TestCaseTest selector: #testBugExceptionInBackgroundEnsureHangs) run in a workspace.
Observe that a debugger appears on "Error: foo". If this is part of running a large batch of tests, the run stops, waiting for the debugger to close—so in CI this would hang the build.
It seems that when this debugger is closed, the background process does not always actually terminate, and also references to lots of processes remain around when they really shouldn't? This bears further investigation and is not the primary problem.
Expected behavior
No debugger, the test reports failure, and the background process is fully terminated.
Additional info
The sequence of events, as far as I can tell, goes something like:
The background process raises the error. This goes initially unhandled, then the UnhandledException is caught by the handler installed in ProcessMonitorTestService>>handleNewProcess: using Process>>on:do:. This goes through ProcessMonitorTestService>>handleException: and ultimately double-dispatches its way to handleUnhandledException:, which suspends the background process and adds it to a list of failed processes.
The test completes normally, including tearDown, and reaches ProcessMonitorTestService>>ensureNoBackgroundFailures (by way of TestExecutionEnvironment>>handleCompletedTest in #runTestCaseUnderWatchdog:).
This signals a TestFailedByForkedProcess, since there is a background failure. This is first caught by the handler in runTestCaseUnderWatchdog:, which just makes note of it and passes, so that's not interesting.
Then it is caught by the handler in TestResult>>runCase:, which makes note of the test having failed and (implicitly) ex returns, curtailing the whole test run.
Unwind hits the ensure: in TestExecutionEnvironment>>runTestCase:, which ultimately sends ProcessMonitorTestService>>cleanUpAfterTest -> terminateRunningProcesses, thus sends terminate to the background process that is suspended on the error catch.
But since the error was raised in an ensure: block, terminate tries to finish executing that ensure: block...and that ends up resuming in handleBackgroundException: and doing an ex pass, leading to a debugger, and blocking the main test process waiting for the termination to actually finish.
You can see the whole thing reproduced in a single workspace as follows—it is only the ensure: in the background process itself that matters, not the ones in the main process:
It seems like the pass in handleBackgroundException: needs to be conditional, so the exception gets returned instead when terminating the process. And we have something that I think makes sense to use for this—shouldPassBackgroundFailures. Conditionalizing the pass does make the problem go away, so I'll toss out a PR for that.
Version information: Presumably any OS, any Pharo (tested in 11/Win and 13/macOS).
The text was updated successfully, but these errors were encountered:
I should add, the example is of course highly contrived for the sake of simplicity. The real scenario I encountered was with application code that uses a background process whose body looks like, roughly, [[["we have items to process right now"] whileTrue: ["process one item, with error handling"]] ensure: ["we are done with a batch of items"]] repeat. When the code was originally written, the "done with a batch of items" code was very simple and couldn't throw an error, but it has gotten more complex and recently did throw an error, leading to the situation above. This pointed out that the error handling needed to be expanded to cover that, at which point the background process no longer throws an unhandled error and the problem goes away, but it should have gracefully failed the test rather than hanging a CI job.
Bug description
Whew, that's a mouthful! In this one very specific set of circumstances—but they can easily arise by accident—we end up with a Debugger open even though we asked to just run the test (with errors leading to a failure).
To Reproduce
Steps to reproduce the behavior:
(TestCaseTest selector: #testBugExceptionInBackgroundEnsureHangs) run
in a workspace.Expected behavior
No debugger, the test reports failure, and the background process is fully terminated.
Additional info
The sequence of events, as far as I can tell, goes something like:
UnhandledException
is caught by the handler installed inProcessMonitorTestService>>handleNewProcess:
usingProcess>>on:do:
. This goes throughProcessMonitorTestService>>handleException:
and ultimately double-dispatches its way tohandleUnhandledException:
, whichsuspend
s the background process and adds it to a list of failed processes.tearDown
, and reachesProcessMonitorTestService>>ensureNoBackgroundFailures
(by way ofTestExecutionEnvironment>>handleCompletedTest
in#runTestCaseUnderWatchdog:
).TestFailedByForkedProcess
, since there is a background failure. This is first caught by the handler inrunTestCaseUnderWatchdog:
, which just makes note of it and passes, so that's not interesting.TestResult>>runCase:
, which makes note of the test having failed and (implicitly)ex return
s, curtailing the whole test run.ensure:
inTestExecutionEnvironment>>runTestCase:
, which ultimately sendsProcessMonitorTestService>>cleanUpAfterTest
->terminateRunningProcesses
, thus sendsterminate
to the background process that is suspended on the error catch.ensure:
block,terminate
tries to finish executing thatensure:
block...and that ends up resuming inhandleBackgroundException:
and doing anex pass
, leading to a debugger, and blocking the main test process waiting for the termination to actually finish.You can see the whole thing reproduced in a single workspace as follows—it is only the
ensure:
in the background process itself that matters, not the ones in the main process:Expected development cost
It seems like the
pass
inhandleBackgroundException:
needs to be conditional, so the exception getsreturn
ed instead when terminating the process. And we have something that I think makes sense to use for this—shouldPassBackgroundFailures
. Conditionalizing thepass
does make the problem go away, so I'll toss out a PR for that.Version information: Presumably any OS, any Pharo (tested in 11/Win and 13/macOS).
The text was updated successfully, but these errors were encountered: