test failed with error after triggering a kernel panic #3284
Comments
AFAICT, that is the expected outcome: test (and guest) did not reboot, they crashed, the underlying SSH session was abruptly terminated and all tmt got out of it was an exit code 255. From tmt's point of view, this is a mere crash, and it does not know what else to do than report an error and move on.
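To make the exit-code reasoning concrete, here is a minimal sketch of how a wrapper could tell a dropped SSH session apart from an ordinary test failure. The function name `classify_exit_code` is hypothetical and is not part of tmt; the only fact it relies on is that `ssh` itself exits with 255 when the session fails.

```shell
#!/bin/sh
# Hypothetical helper, not part of tmt: map an exit code seen over SSH to a
# coarse outcome. ssh exits with 255 when the session itself fails; any other
# value is the exit status of the remote command. A remote command that exits
# with 255 on its own is indistinguishable from a lost connection here.
classify_exit_code() {
    case "$1" in
        0)   echo "pass" ;;
        255) echo "connection-lost" ;;
        *)   echo "fail" ;;
    esac
}

classify_exit_code 255
```

Seen this way, a kernel panic looks like any other lost connection, which is why it has to be announced to tmt through configuration rather than detected.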
Yes, because then the process runs under tmt control and tmt is aware that a reboot is expected: tmt gets your "reboot command". Frankly, you decided to kill the guest without telling tmt about it, so the error outcome is perfectly valid :) I'm not sure we can ever resolve this in some automagical way, with tmt being able to realize something like "aha, this is a kernel panic, guest is rebooting, I shall restart the test!". All ideas we have eventually boil down to letting tmt know about it so it can cooperate with your test. See e.g. https://tmt.readthedocs.io/en/stable/spec/tests.html#restart, I'd say it fits your use case:

    restart-on-exit-code:
      # this is the exit code tmt receives when SSH session - and the guest - die
      # suddenly due to a crash
      - 255
    # I'd set this to `false`, your test already issues the reboot
    restart-with-reboot: false

This should tell tmt to wait for the reboot to pass, then reconnect and restart the test.
Thanks for the clarification! Because a Beaker job can resume the test after a kernel panic automatically, I expected tmt to support panics as well. I just tested

    mkdir .fmf
    echo -n 1 > .fmf/version
    cat << 'EOF' > main.fmf
    /tests:
        /basic:
            restart-on-exit-code:
              - 255
            test: |
                echo 2 > /proc/sys/kernel/panic
                sync
                if [ "$TMT_REBOOT_COUNT" == 0 ]; then
                    # tmt-reboot -c "echo c > /proc/sysrq-trigger"
                    echo c > /proc/sysrq-trigger
                fi
                echo "Test passed"
    EOF
    if tmt run -a provision -h virtual; then
        echo "Test passed"
    else
        echo "Test failed"
    fi

The following logs with
Even without a reboot, tmt still needs to verify the guest is up and running. The reboot might be triggered beyond the control of tmt, and that is fine, we just need to be sure we restart the test on guest that's alive. Related to #3284
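The "verify the guest is up" step can be pictured as a bounded retry loop. The sketch below is only an illustration under that assumption, not tmt's implementation; `wait_for_guest` and its arguments are hypothetical names, and the probe is a stand-in for something like an `ssh ... true` call.

```shell
#!/bin/sh
# Hypothetical sketch, not tmt's implementation: retry a probe command until
# it succeeds or the attempts run out.
# usage: wait_for_guest ATTEMPTS DELAY_SECONDS PROBE [ARGS...]
wait_for_guest() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # The remaining arguments are the probe; success means the guest answers.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in probe that always succeeds, so the example is safe to run anywhere.
wait_for_guest 3 0 true && echo "guest is up"
```

Only once such a check passes does it make sense to restart the test, regardless of who triggered the reboot.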
They are relevant, but just a snippet of the full picture. To work correctly, your test needs to check … Plus, there is indeed one minor issue that may lead to errors, see #3291. Together with these two changes, I get the expected picture:
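One way to read the advice about checking a counter: because the panic is not issued through `tmt-reboot`, the test cannot rely on `TMT_REBOOT_COUNT` alone to know that it has already crashed the guest once. The sketch below assumes the relevant counter is tmt's restart counter, `TMT_TEST_RESTART_COUNT`; that name is an assumption here, not something the comment states. `trigger_panic` and `run_test` are hypothetical helpers, and `trigger_panic` only prints the command so the snippet is safe to run.

```shell
#!/bin/sh
# Assumption: tmt exports TMT_TEST_RESTART_COUNT to a restarted test.
# Default to 0 so the script also behaves sensibly outside tmt.
restart_count=${TMT_TEST_RESTART_COUNT:-0}

# Hypothetical stand-in: print the panic command instead of running it.
trigger_panic() {
    echo "would run: echo c > /proc/sysrq-trigger"
}

# Crash the guest only on the first invocation; after a restart, carry on.
run_test() {
    if [ "$1" -eq 0 ]; then
        trigger_panic
        return
    fi
    echo "Test passed"
}

run_test "$restart_count"
```

In a real test the `trigger_panic` body would be the actual sysrq write, and the `return` would never be reached because the guest goes down first.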
After triggering a kernel panic, the system reboots, but the test just fails with an error. I noticed that a workaround is to run the kernel panic trigger command via `tmt-reboot`. Note that if the test is written with beakerlib, a similar error occurs and something like

    # the errr could also be 00:00:28 errr /client-test/tests/client (on client) (beakerlib: State 'imcomplete') [1/1]

will also be printed. Here are the logs and the reproducer.
Logs
Reproducer