Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boot test and virtual heap testing #7688

Merged
merged 4 commits into from
Dec 8, 2023
Merged

Conversation

lyakh
Copy link
Collaborator

@lyakh lyakh commented May 25, 2023

No description provided.

@lyakh
Copy link
Collaborator Author

lyakh commented Jun 6, 2023

SOFCI TEST

1 similar comment
@lyakh
Copy link
Collaborator Author

lyakh commented Jun 6, 2023

SOFCI TEST

@lyakh lyakh force-pushed the testvmh branch 6 times, most recently from 2d2fc61 to 5a11244 Compare July 27, 2023 10:33
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dabekjakub is this test now passing with latest VMH PR ?

@lyakh lyakh changed the title [DNM] boot test and virtual heap testing boot test and virtual heap testing Nov 16, 2023
@lyakh lyakh marked this pull request as ready for review November 16, 2023 17:00
@marc-hb marc-hb requested a review from dabekjakub November 16, 2023 19:02
Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need yet another, brand new test framework? Genuine question. Can't Zephyr's ZTEST and twister help with this?

https://docs.zephyrproject.org/latest/develop/test/ztest.html

cc: @teburd , @dcpleung

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 17, 2023

Do we really need yet another, brand new test framework? Genuine question. Can't Zephyr's ZTEST and twister help with this?

https://docs.zephyrproject.org/latest/develop/test/ztest.html

cc: @teburd , @dcpleung

@marc-hb as far as I understand - no, not easily, ztest seems to be (mostly) used with twister when building a separate image just for testing, whereas what we want is a kind of a boot self-test, similar to what the kernel can do. I'd much prefer to use an existing framework and if ztest or another Zephyr API can do that - I'd love to know about it and to switch to it! Or maybe we want to add boot-time self-testing to Zephyr?

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 17, 2023

@dabekjakub is this test now passing with latest VMH PR ?

@lgirdwood no, I see a failure in the log, I'll double check whether I'm calling the API wrongly or if there's a problem in VMH.

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 17, 2023

@dabekjakub is this test now passing with latest VMH PR ?

@lgirdwood no, I see a failure in the log, I'll double check whether I'm calling the API wrongly or if there's a problem in VMH.

Unfortunately all https://sof-ci.01.org/sofpr/PR7688/build14963/devicetest/index.html test results are empty, but in my local test I got this:

[    0.135591] <inf> pipe: pipeline_new: pipeline new pipe_id 0 priority 0
[    0.135743] <wrn> vmhtest: vmh_test_single: Ignoring failure to allocate 1616 in non-contiguous mode
[    0.135750] <err> vmhtest: vmh_test_single: Failed to allocate 26
[    0.135946] <err> vmhtest: vmh_test: Non-contiguous test error -12
[    0.135951] <inf> boot_test: sof_boot_test: test 0xa0076534 returned -12

So either I'm testing wrongly or there's a problem with VMH, @dabekjakub ?

@teburd
Copy link
Contributor

teburd commented Nov 17, 2023

Do we really need yet another, brand new test framework? Genuine question. Can't Zephyr's ZTEST and twister help with this?
https://docs.zephyrproject.org/latest/develop/test/ztest.html
cc: @teburd , @dcpleung

@marc-hb as far as I understand - no, not easily, ztest seems to be (mostly) used with twister when building a separate image just for testing, whereas what we want is a kind of a boot self-test, similar to what the kernel can do. I'd much prefer to use an existing framework and if ztest or another Zephyr API can do that - I'd love to know about it and to switch to it! Or maybe we want to add boot-time self-testing to Zephyr?

ztest should be usable outside of twister, its a way of structuring test code and checks (asserts)

You can run any test in the zephyr/tests directory by building and flashing normally with west, nothing particularly special about twister doing this.

And you can decide when and how to run but by default it creates a main to do this for you.

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 20, 2023

ztest should be usable outside of twister, its a way of structuring test code and checks (asserts)

You can run any test in the zephyr/tests directory buy building and flashing normally with west, nothing particularly special about twister doing this.

And you can decide when and how to run but by default it creates a main to do this for you.

@teburd right, thanks. My understanding is that maybe it can be used but it hardly would bring anything in this configuration. What we need is to run multiple tests at SOF startup time. So it should be an otherwise normal SOF build, having run those tests it will proceed and function normally as any other SOF firmware. And it seems to me that we need to run them at a specific point in SOF run-time sequence, long after all Zephyr initialisation has completed. What advantages would we still gain if using ztest - if at all possible? We'd need to make a new launch point inside of SOF too, right?

@lgirdwood
Copy link
Member

This also need to be used by the SOF application so that after a normal FW boot we enter the Ztest logic and run thru all the UTs. The FW boot complete IPC messaging can then be used to indicate test results. The kernel would know about Ztest mode with an extended manifest flag.

@lgirdwood the way it's currently done - yes, this framework can be used from other SOF code paths too. Not sure what you mean by "all the UTs" - new or existing ones? You can write new ones or possibly adapt tests from the existing testbench / cmocka sets. As for the kernel knowing about that - so far it's decided by a Kconfig option and the kernel doesn't need to know anything about it. But sure, we can extend it in the future

Today we have very little UTs, so yes, we can add more and also add existing cmocka tests where appropriate (and if posible).

Kconfig needs to set a flag in the ext manifest so that kernel knows "this is a test"

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - ztest "hello world"
@lyakh dont see the extended manifest flag yet. How do we report failures when a test does not assert() or cause an exception ? e.g. will everything go out via trace

test 0: feature1 vmh.c:54 - pass
test 1: feature2 vmh.c:87 - failed - got 97 expected 78.
....
test: all test completed 89 passed 3 failed

zephyr/Kconfig Outdated
config SOF_BOOT_TEST
bool "enable SOF run-time testing"
depends on ZTEST
help
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we expand this more and tell the users that its for testing use only and should not be enabled for normal audio usage. i.e what this enables/disables and impact.

if (!h)
return -ENOMEM;

return vmh_free_heap(h);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh @dabekjakub this is a simple test today i.e. "hello world", but lets add to it and stress it when we have the ztest infra understood and merged (as I think we all have good test ideas baking....).

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 28, 2023

LGTM - ztest "hello world" @lyakh dont see the extended manifest flag yet.

@lgirdwood can we discuss and add manifest and IPC reporting as follow-up PRs? Manifest is needed to make sure that the kernel is expecting any eventual delays?

How do we report failures when a test does not assert() or cause an exception ? e.g. will everything go out via trace

test 0: feature1 vmh.c:54 - pass
test 1: feature2 vmh.c:87 - failed - got 97 expected 78.
....
test: all test completed 89 passed 3 failed

Yes, logging, unfortunately I don't see any CI device results in this PR, so copying from my local testing (possibly of a modified version):

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.201955] <inf> sof_boot_test: vmh_test: vmh init good
 PASS - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot succeeded

@lgirdwood
Copy link
Member

Yes, logging, unfortunately I don't see any CI device results in this PR, so copying from my local testing (possibly of a modified version):

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.201955] <inf> sof_boot_test: vmh_test: vmh init good
 PASS - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot succeeded

I think we need to list each UT executed and its result as a 1st step (the total pass/fail can come later). Th point is that all tests should be visible and result recorded.

@lyakh
Copy link
Collaborator Author

lyakh commented Nov 30, 2023

Yes, logging, unfortunately I don't see any CI device results in this PR, so copying from my local testing (possibly of a modified version):

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.201955] <inf> sof_boot_test: vmh_test: vmh init good
 PASS - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot succeeded

I think we need to list each UT executed and its result as a 1st step (the total pass/fail can come later). Th point is that all tests should be visible and result recorded.

@lgirdwood you mean list them in the log? The above output is Zephyr's ZTEST() output, onle the "vmh init good" line is mine and it's actually not in this PR, so in fact imagine, that it isn't there - that's my local modification. But otherwise we want to use Zephyr standard testing, including their standard reporting, right? Of course, we can add any IPC reporting on top, in follow up PRs

@kv2019i
Copy link
Collaborator

kv2019i commented Dec 1, 2023

@lyakh @lgirdwood I think the log output goes to logging subsystem as normal so it looks like this:
https://sof-ci.01.org/sofpr/PR7688/build423/devicetest/index.html?model=ADLP_RVP_NOCODEC-ipc4&testcase=check-playback-10sec

*** Booting Zephyr OS build zephyr-v3.5.0-2009-gefc32081893d ***
[    0.034310] <inf> main: sof_app_main: SOF on intel_adsp_cavs25
[    0.034321] <inf> main: sof_app_main: SOF initialized
[    0.034770] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x31400008
[    0.035353] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x3060004c
[    0.037053] <inf> ipc: ipc_cmd: rx	: 0x11000005|0x0
[    0.037066] <inf> pipe: pipeline_new: pipeline new pipe_id 0 priority 0
Running TESTSUITE sof_boot
===================================================================
TESTSUITE sof_boot succeeded
[    0.037521] <inf> ipc: ipc_cmd: rx	: 0x40000004|0x15

I'm not sure what happens if the test fails, that should probably be tested. We need some means to observe what happens if the test fails. @lyakh ?

@lyakh
Copy link
Collaborator Author

lyakh commented Dec 1, 2023

I'm not sure what happens if the test fails, that should probably be tested. We need some means to observe what happens if the test fails. @lyakh ?

@kv2019i That would be a separate PR to sof-test? But I'm not sure if we rely on parsing the log, if not, we need a notification IPC and a debugfs entry in the kernel?

@lyakh
Copy link
Collaborator Author

lyakh commented Dec 1, 2023

I'm not sure what happens if the test fails

@kv2019i a failure looks like

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.488501] <wrn> sof_boot_test: vmh_test_single: Ignoring failure to allocate 1616 in non-contiguous mode
[    0.488508] <err> sof_boot_test: vmh_test_single: Failed to allocate 26
[    0.488630] <err> sof_boot_test: vmh_test: Non-contiguous test error -12
[    0.488635] <err> sof_boot_test: sof_boot_vmh: vmh init failed: -12
 FAIL - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot failed.

@lgirdwood
Copy link
Member

Yes, logging, unfortunately I don't see any CI device results in this PR, so copying from my local testing (possibly of a modified version):

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.201955] <inf> sof_boot_test: vmh_test: vmh init good
 PASS - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot succeeded

I think we need to list each UT executed and its result as a 1st step (the total pass/fail can come later). Th point is that all tests should be visible and result recorded.

@lgirdwood you mean list them in the log? The above output is Zephyr's ZTEST() output, onle the "vmh init good" line is mine and it's actually not in this PR, so in fact imagine, that it isn't there - that's my local modification. But otherwise we want to use Zephyr standard testing, including their standard reporting, right? Of course, we can add any IPC reporting on top, in follow up PRs

Yes, pls use ZTEST() API, but this needs to at least print the line number and file name otherwise its less helpful than using original log().

@lyakh
Copy link
Collaborator Author

lyakh commented Dec 5, 2023

Yes, logging, unfortunately I don't see any CI device results in this PR, so copying from my local testing (possibly of a modified version):

Running TESTSUITE sof_boot
===================================================================
START - vmh
[    0.201955] <inf> sof_boot_test: vmh_test: vmh init good
 PASS - vmh in 0.001 seconds
===================================================================
TESTSUITE sof_boot succeeded

I think we need to list each UT executed and its result as a 1st step (the total pass/fail can come later). Th point is that all tests should be visible and result recorded.

@lgirdwood you mean list them in the log? The above output is Zephyr's ZTEST() output, onle the "vmh init good" line is mine and it's actually not in this PR, so in fact imagine, that it isn't there - that's my local modification. But otherwise we want to use Zephyr standard testing, including their standard reporting, right? Of course, we can add any IPC reporting on top, in follow up PRs

Yes, pls use ZTEST() API, but this needs to at least print the line number and file name otherwise its less helpful than using original log().

@lgirdwood as you see above there's "vmh" in ZTEST() prints:

START - vmh
 PASS - vmh in 0.001 seconds

that is the name of the test. We can agree to give more descriptive names, containing at least the file name, and if there are more than one tests in a file, then also an indication of which one this is. Would that be enough?

} else {
ztest_test_pass();
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact, thinking about this now, this code would probably be common for all SOF boot tests, so we can have a macro to be called in this case something like SOF_ZTEST_BOOT(vmh, vmh_test); I can try to do this now or we can try to switch to one like this later when we get at least a second such boot test

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lgirdwood
Copy link
Member

Yes, pls use ZTEST() API, but this needs to at least print the line number and file name otherwise its less helpful than using original log().

@lgirdwood as you see above there's "vmh" in ZTEST() prints:

START - vmh
 PASS - vmh in 0.001 seconds

that is the name of the test. We can agree to give more descriptive names, containing at least the file name, and if there are more than one tests in a file, then also an indication of which one this is. Would that be enough?

As long as all tests are uniquely named with easy discovery then I'm good.

@lgirdwood
Copy link
Member

@lyakh can you check CI. I assume this is disabled in default Kconfig as this would obviously fail a CI audio test.


if (CONFIG_ACE_VERSION_1_5)
zephyr_library_sources_ifdef(CONFIG_SOF_BOOT_TEST
test/vmh.c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add_subdirectory(test/)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

lyakh added 4 commits December 6, 2023 17:46
Add a ZTEST test-suite to run SOF boot-time self-tests at the time of
the first FW_GEN_MSG IPC.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
This adds an initial Virtual Memory Heap test.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
Add calls to vmh_alloc() and vmh_free() for testing.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
Test both cases - when spanning multiple blocks is allowed and when
it isn't.

Signed-off-by: Guennadi Liakhovetski <[email protected]>
@lyakh
Copy link
Collaborator Author

lyakh commented Dec 8, 2023

The only CI complaints are style - don't think using ret as a macro parameter can cause problems, and unrelated kernel error messages in main-ace

@kv2019i kv2019i merged commit dba33cc into thesofproject:main Dec 8, 2023
42 of 44 checks passed
@lyakh lyakh deleted the testvmh branch December 8, 2023 14:15
@marc-hb
Copy link
Collaborator

marc-hb commented Dec 13, 2023

From the moment this PR enabled this test, it has always failed on MTL

[    0.129560] <inf> ipc: ipc_cmd: rx	: 0x44000000|0x3060004c
[    0.405513] <inf> ipc: ipc_cmd: rx	: 0x11000005|0x0
[    0.405533] <inf> pipe: pipeline_new: pipeline new pipe_id 0 priority 0
Running TESTSUITE sof_boot
===================================================================
START - virtual_memory_heap
[    0.505716] <wrn> sof_boot_test: vmh_test_single: Ignoring failure to allocate 1616 in non-contiguous mode
[    0.505723] <err> sof_boot_test: vmh_test_single: Failed to allocate 26
[    0.505845] <err> sof_boot_test: vmh_test: Non-contiguous test error -12
[    0.505850] <err> sof_boot_test: sof_boot_virtual_memory_heap: virtual_memory_heap failed: -12
 FAIL - virtual_memory_heap in 0.001 seconds
===================================================================
TESTSUITE sof_boot failed.
[    0.606210] <inf> ipc: ipc_cmd: rx	: 0x40000004|0x15
[    0.606253] <inf> dma: dma_get: dma_get() ID 0 sref = 1 busy channels 0
[    0.607026] <inf> ipc: ipc_cmd: rx	: 0x40000006|0x10

(I think it always passed on cavs25)

We never noticed because we never paid attention to FW errors:

Now it's causing a DSP panic when combined with another change:

@marc-hb
Copy link
Collaborator

marc-hb commented Dec 14, 2023

Temporarily disabling this test in:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants