[BUG] empty kernel logs on MAX_WAIT_FW_LOADING failures #1112
Comments
Still a problem:
We had a surge of FW failing to load for various reasons:
We need to fix this, first (and small) part:
I just noticed #1036 by chance, possibly a duplicate.
Stop unconditionally imposing start_test() on everything sourcing lib.sh

This makes it possible to source lib.sh without running anything, which can be useful for debugging.

For now this should be a no-op for all tests except two files in `tools/` which were never really "tests" in the first place:
- tools/kmod/sof_remove.sh
- tools/sof-kernel-log-check.sh

In these, start_test() never made sense and never really ran, thanks to the (awkward) "is_subtest()" escape.

There are other tests where start_test() should not be invoked either, most notably test-case/verify-kernel-boot-log.sh in order to fix thesofproject#1112, but this is a huge commit already; one change at a time.

Signed-off-by: Marc Herbert <[email protected]>
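To illustrate the idea in the commit message above, here is a minimal sketch of a library that only defines functions when sourced. It is a simplified, hypothetical stand-in and not the actual lib.sh: the `STARTED` variable and the body of `start_test()` are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical, simplified stand-in for lib.sh: sourcing this file only
# defines functions and variables, it does not run any test setup.
STARTED=0

start_test()
{
    # Placeholder for audio-test setup, for instance waiting for the firmware.
    STARTED=1
    echo "start_test: audio test setup done"
}

# An audio test opts in explicitly after sourcing the library:
#     start_test
# A non-audio tool sources the library and simply never calls start_test.
```

With this layout, a debugging session can source the library and call individual helpers without triggering any setup.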
Provide a new boot_logs.txt file no matter what happens. Note that this file now includes user-space logs, not just kernel logs.

Also: fix the bug where the script times out and does not run when the SOF firmware is not loaded: no need to define NO_POLL_FW_LOADING anymore. The trick is to stop calling `start_test()`. verify-kernel-boot-log.sh is not an audio test!

Fixes sof-test issues thesofproject#1036 and thesofproject#1112; find more details there.

Signed-off-by: Marc Herbert <[email protected]>
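One possible way to get a log file "no matter what happens" is an EXIT trap, sketched below. This is only an illustration under stated assumptions: `save_boot_logs` and `run_check` are hypothetical names, the failing `exit 1` is a placeholder for the real verification, and the actual commit may do this differently. `journalctl --boot` prints both kernel and user-space messages of the current boot.

```shell
#!/bin/bash
# Sketch: save boot logs on every exit path, pass or fail.
# save_boot_logs and run_check are illustrative names, not sof-test code.

save_boot_logs()
{
    local out="$1"
    if command -v journalctl >/dev/null 2>&1; then
        # Kernel and user-space messages of the current boot.
        journalctl --boot --no-pager > "$out" 2>&1 || true
    else
        printf 'journalctl not available\n' > "$out"
    fi
}

run_check()
{
    local log_dir="$1"
    (
        # The EXIT trap fires when this subshell ends, whatever the reason.
        trap 'save_boot_logs "$log_dir/boot_logs.txt"' EXIT
        # Placeholder for the real boot log verification; here it fails.
        exit 1
    )
}
```

Because the trap is installed before any check runs, the log file is written even when the check exits early.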
Lucky us, we just got a complete, real-world sample case where the firmware did not load (topology mismatch): https://sof-ci.01.org/linuxpr/PR4995/build3008/devicetest/index.html The recent, The other, consecutive tests were not too bad: only 2 CI TIMEOUTS; all other tests timed out after MAX_WAIT_FW_LOADING. The lack of kernel logs in consecutive tests is expected after all: nothing new to show.
That's only because MTL still has an unreasonably long MAX_WAIT_FW_LOADING; otherwise there would not have been any CI timeout. Fixed in:
i915 does not hang MTL any more; this was fixed a long time ago. On the other hand, 70s is a very long time to wait when the firmware fails to load for some reason; this wastes precious cycles.

See a recent example in: https://sof-ci.01.org/linuxpr/PR4995/build3008/devicetest/index.html #1112 (comment)

Signed-off-by: Marc Herbert <[email protected]>
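For readers unfamiliar with the trade-off discussed here, the sketch below shows a bounded polling loop that gives up after a configurable number of seconds and saves kernel logs when it does. All names (`wait_for`, `fw_loaded`, `poll_firmware`, `MAX_WAIT`, `FW_MARKER`) are hypothetical and chosen for the example; the real firmware check and the real MAX_WAIT_FW_LOADING handling in sof-test are not reproduced.

```shell
#!/bin/bash
# Sketch: poll for a condition with a bounded timeout, dump logs on failure.
# wait_for, fw_loaded, poll_firmware, MAX_WAIT and FW_MARKER are made-up names.

# Usage: wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for()
{
    local timeout="$1"; shift
    local waited=0
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Stand-in check; a real one would query the driver state.
fw_loaded()
{
    [ -e "${FW_MARKER:-/nonexistent/fw_loaded}" ]
}

poll_firmware()
{
    local max_wait="${MAX_WAIT:-3}"
    if ! wait_for "$max_wait" fw_loaded; then
        echo "firmware not loaded after ${max_wait}s, collecting kernel logs" >&2
        # dmesg may need privileges; do not let that hide the real failure.
        dmesg > "${1:-dmesg.txt}" 2>&1 || true
        return 1
    fi
}
```

A shorter `MAX_WAIT` makes a failed load cost a few seconds per test instead of more than a minute, and the log dump on the failure path keeps the reason visible.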
#1196 merged, good enough.
Describe the bug
#1059 added the ability to wait for the firmware to be loaded, which solved a number of issues, including some i915 timeout issues.
However sof-test does not collect kernel logs when this fails. This is very problematic because the only way to know why the firmware is not loaded is to look at kernel logs.
To Reproduce
sof-test/tools/kmod/sof_remove.sh
=> dmesg.txt is empty
Expected behavior
Kernel logs are collected.
Detail Info
cc: