BinderTracingTest.ResolutionFlow times out #104670
Tagging subscribers to this area: @dotnet/area-infrastructure-libraries
cc @davmason
This seems to be a continuation of #97735 and #94390. Given that the BinderTracingTests are already skipped for JitStress and GCStress, there is probably another culprit for these timeouts. It looks flaky given the low hit count, and the previous counterpart #97735 at one point had 0 hits in 30 days. I haven't been able to repro the timeout locally, and given how flaky it is on CI, I'm not sure I'd be able to repro it reliably there either. From this build instance it seems like this is causing the hang
based off of
From looking at what BinderTracingTests does, I'm not quite sure what is causing the separate subprocess to hang. Given that this test still hangs even without GCStress/JitStress, I would have expected the test to hit the BinderEventListener's 30s timeout at runtime/src/tests/Loader/binding/tracing/BinderEventListener.cs Lines 177 to 178 in ad25cd0
@elinor-fung / @davmason any other ideas on what might be causing the hang?
762118 was from a PR that caused a deadlock, so it doesn't have the same cause as the first, singular hit in 735589.
Isn't that a jitstress pipeline? Why did the test get run there if it was marked JitOptimizationSensitive=true in #102842? In any case, the latest failure's console log shows a hang, presumably when waiting for this test
(FindInLoadContext_DefaultALC_IncompatibleVersion finished, and the stack trace has system.diagnostics.process.dll!System.Diagnostics.Process.WaitForExitCore).
I guess the Console.WriteLine calls from a subprocess don't actually get written immediately, given that we don't see either of these: runtime/src/tests/Loader/binding/tracing/BinderTracingTest.cs Lines 199 to 202 in ab03e0f
Also, the subprocess dump doesn't get generated even though we expect to create a dump for all processes related to corerun. @hoyosjs any ideas on what to tweak to be able to capture the dump for the hanging subprocess in this case?
The jitstress pipelines run tests under many different configurations. This particular configuration does not set any of the "jitstress" environment variables; it only sets the following:
From the latest failure's console log, the test spins up a subprocess with PID 920
and that subprocess appears to complete
Moreover, at the end of the console log, there is
Could this be a bug with Process.WaitForExit() where the subprocess fails to signal to the parent process that it finished? In any case, it's not apparent to me that this is a tracing issue. It's odd that our logic to collect dumps from all child processes isn't generating a dump for PID 920, which the test is supposedly stuck on. Until we collect a crash dump from the child process that the test is stuck on, it'll be hard to figure out what is causing the test to hang.
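For reference, here is a minimal sketch of a bounded wait on a child process, which is one way a caller could avoid blocking forever if the exit is never observed. It is not taken from BinderTracingTest and says nothing about what the test actually does: `ChildProcessRunner` and `RunWithTimeout` are hypothetical names, and the 30-second value in the usage note is an arbitrary choice. It relies only on `Process.WaitForExit(int)` and `Process.Kill(bool)`, which exist in current .NET.

```csharp
using System;
using System.Diagnostics;

public static class ChildProcessRunner
{
    // Hypothetical helper (not part of the runtime test suite).
    // Starts a child process, echoes its output, and waits at most timeoutMs.
    // Returns the exit code, or null if the child did not exit in time.
    public static int? RunWithTimeout(string fileName, string arguments, int timeoutMs)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
        };

        using var process = new Process { StartInfo = startInfo };

        // Echo child output as it arrives so a log shows how far the child got.
        process.OutputDataReceived += (_, e) =>
        {
            if (e.Data != null) Console.WriteLine($"[child out] {e.Data}");
        };
        process.ErrorDataReceived += (_, e) =>
        {
            if (e.Data != null) Console.WriteLine($"[child err] {e.Data}");
        };

        process.Start();
        Console.WriteLine($"Started child PID {process.Id}");
        process.BeginOutputReadLine();
        process.BeginErrorReadLine();

        // Bounded wait: returns false instead of blocking forever.
        if (!process.WaitForExit(timeoutMs))
        {
            Console.WriteLine($"Child PID {process.Id} did not exit within {timeoutMs} ms; killing it.");
            process.Kill(entireProcessTree: true);
            process.WaitForExit(timeoutMs); // bounded again, in case the kill is slow
            return null;
        }

        // Note: the last lines of asynchronous output may still be in flight here.
        // The parameterless WaitForExit() also waits for redirected output to drain,
        // but it is unbounded, so it is deliberately not used in this sketch.
        return process.ExitCode;
    }
}
```

A call such as `ChildProcessRunner.RunWithTimeout("dotnet", "--version", 30_000)` would return `0` on a machine with the .NET SDK on the path. A `null` result marks the point at which a dump of the child would be worth collecting before it is killed.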
Tagging subscribers to this area: @dotnet/area-system-diagnostics-process
Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=735589
Build error leg or test failing: Loader/binding/tracing/BinderTracingTest.ResolutionFlow/BinderTracingTest.ResolutionFlow.cmd
Pull request: #104603
Error Message
Fill the error message using step by step known issues guidance.
Known issue validation
Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=735589
Error message validated: [BinderTracingTest.ResolutionFlow.* Timed Out]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 7/10/2024 11:13:33 AM UTC
Report
Summary