-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zdtm: Distinguish between fail and crash of dump #2376
zdtm: Distinguish between fail and crash of dump #2376
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## criu-dev #2376 +/- ##
============================================
- Coverage 70.73% 70.57% -0.16%
============================================
Files 136 135 -1
Lines 32940 33449 +509
============================================
+ Hits 23299 23606 +307
- Misses 9641 9843 +202 ☔ View full report in Codecov by Sentry. |
62ae140
to
4bfaf6e
Compare
It looks like we need to add a new fault injection there. See f1714cc as an example. This way we would be able to add a test for it to test/jenkins/criu-fault.sh with |
Adds a exit_signal static method to criu_cli, criu_config and criu_rpc used to detect a crash. Fixes: checkpoint-restore#350 Signed-off-by: Bhavik Sachdev <[email protected]>
4bfaf6e
to
2c9e91c
Compare
Hello @Snorch, |
Looks correct. But from zdtm.py point of view it may be not so simple, see how __criu_act retries action without fault, we likely don't want our fault to be retry-able by zdtm, so maybe just make it's id >=128 ?
Yeh, it's a good question. Let's just use one random test with |
1400985
to
2590899
Compare
I don't think giving the fault injection an id >= 128 is what we want.
I have added |
I may misunderstand something, correct me if I'm wrong, but isn't a test fail exactly the behavior we expect in case of criu crash? In other words: if criu fails to dump/restore
|
2590899
to
c787b51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Sorry, the misunderstanding was on my part. You are correct; the tests previously passed as they were re-run without the fault injection. Thank you for all your help @Snorch :) |
Hm, test still does not pass with This seem to do the right trick:
Also maybe someone else has better ideas? |
Signed-off-by: Bhavik Sachdev <[email protected]>
c787b51
to
d471052
Compare
A friendly reminder that this PR had no activity for 30 days. |
Can we remove the "stale-pr" label? |
Removed. |
This PR distinguishes
criu dump
crash fromcriu dump
fail in thezdtm.py
script. Now,zdtm
reportscriu dump
crash as a test failure.Very similiar to the work done in this PR (#2140)
But, to detect a rpc crash we see that if the response from the RPC is invalid, then we assume
the RPC crashed.
For detecting the crash for the CLI we can check the
Popen.returncode
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.returncode
To simulate the crash, I added a call to
abort()
beforeline 2104 criu/cr-dump.c
Fixes: #350