Stabilize execution of test.sh on dev Linux environments: certifier_tests_wrap.cc:178:11: fatal error: Python.h: No such file or directory #215
Comments
I tried running
I guess the best way to go here is to provide a docker container with all of the dependencies that can be used for running tests. The Dockerfile can be used as a reference for what packages and configurations are needed if you want to run this on your physical machine.
The Docker container is a good approach. Let's discuss this next week to put this work item in its priority queue.
…envs. This commit does some clean-up of scripts and Makefiles so that the newly-added test.sh can be used by other dev engineers on diff Linux physical machines.
- Fix few Makefiles to allow re-specifying LOCAL_LIB, to point to, say, /usr/local/lib (for SSL libs, e.g.)
- Hacky support to allow use of OpenSSL V1.x versions
- Skip a few pytests that seem to not run stably. Needs further investigation.
@rgerganov -- I investigated one of the points you raised above: waiting 120sec. for sockets to be closed; we should be able to fix this with SO_REUSEADDR? The sleep is coming in the pytest-case In
Empirically, testing shows that even with this flag set, on a dev machine the socket is not closed and made reusable fast enough when the test case is re-run repeatedly. Hence this sleep was added. This test file I know this is irritating, but I don't think this is functionally broken, or that the sleep is masking some broken functionality. I can investigate further, but debugging this is a bit difficult on CI machines.
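For reference, a minimal Python sketch of what setting SO_REUSEADDR on a listening socket looks like. The helper names `make_listener` and `reuse_enabled` are hypothetical and are not taken from the test file discussed above; the sketch only illustrates that the option has to be set before `bind()`, and it does not claim to remove the need for the sleep.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(); setting it later does not
    # affect the bind that already happened.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen(1)
    return sock


def reuse_enabled(sock: socket.socket) -> bool:
    """Report whether SO_REUSEADDR is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0


if __name__ == "__main__":
    first = make_listener()
    port = first.getsockname()[1]
    first.close()
    # Re-bind the same port right after closing the first listener.
    second = make_listener(port=port)
    print("rebound port", port, "reuse enabled:", reuse_enabled(second))
    second.close()
```

If the server side in the test already sets the flag this way and the re-bind still fails on repeated runs, the cause is likely elsewhere (for example, a process that has not yet exited), which would be consistent with the observation above.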
@yelvmw -- Here's the trace evidence showing why some of the Python tests in test_certifier_framework.py are hanging when run on the SGX machine. I think it's because in the shared library, the code for That I suspect is causing an issue as seen in the execution flow trace instrumentation:
Basically, the Python method invokes
I instrumented the last function, and it's hanging on L232 below:
To validate the theory that running SEV_SNP-specific code in this simulated enclave causes the hang, I conditionalized the definition of
This allows the tests to run cleanly without running into this hanging issue.
…envs. This commit does some clean-up of scripts and Makefiles so that the newly-added test.sh can be used by other dev engineers on diff Linux physical machines.
- Fix few Makefiles to allow re-specifying LOCAL_LIB, to point to, say, /usr/local/lib (for SSL libs, e.g.)
- Hacky support to allow use of OpenSSL V1.x versions
- Add NO_ENABLE_SEV=1 flag in certifier.mak, so that we can build shared library without #include'ing SEV_SNP-specific code. Including it seems to cause pytests to "hang", as they end up calling SEV_SNP code-paths.
- Use this new flag when building shared-libraries in the test-run_example-simple_app() test-case, in test.sh. This is used to execute pytests that need Certifier Service.
…envs. This commit does some clean-up of scripts and Makefiles so that the newly-added test.sh can be used by other dev engineers on diff Linux physical machines.
- Fix few Makefiles to allow re-specifying LOCAL_LIB, to point to, say, /usr/local/lib (for SSL libs, e.g.)
- Hacky support to allow use of OpenSSL V1.x versions
- Add NO_ENABLE_SEV=1 flag in certifier.mak, so that we can build shared library without #include'ing SEV_SNP-specific code. Including it seems to cause pytests to "hang", as they end up calling SEV_SNP code-paths.
- Use this new flag when building shared-libraries in the test-run_example-simple_app() test-case, in test.sh. This is used to execute pytests that need Certifier Service.
cleanup.
It's surprising to see OBJ_create hanging. It's an OpenSSL function. I checked the bugs section of the manual and only see the caveat of concurrent invocation. Is this the case?
…nvs. This commit does some clean-up of scripts and Makefiles so that the newly-added test.sh can be used by other dev engineers on diff Linux physical machines.
- Fix few Makefiles to allow re-specifying LOCAL_LIB, to point to, say, /usr/local/lib (for SSL libs, e.g.)
- Hacky support to allow use of OpenSSL V1.x versions
- Add NO_ENABLE_SEV=1 flag in certifier.mak, so that we can build shared library without #include'ing SEV_SNP-specific code. Including SEV_SNP-specific code seems to cause pytests to "hang", when running pytests on simulated-enclave, as the tests end up calling SEV_SNP code-paths.
- test.sh: Use this new flag when building shared-libraries in the test-run_example-simple_app() test-case. This is used to execute pytests that need Certifier Service on SEV-enabled h/w.
@yelvmw - About the concurrent use of However in the affected test case(s) [e.g., |
@rgerganov : I'm back to cleaning up some items you logged here: "test script deleting non-git files; this is not ok"
Firstly, I do need Otherwise, I've found that some workflow steps for some sample-apps may interfere, causing instability. I just need a way to "cleanup the entire dev env" at the start of each run of this script.
Second, the command does What if we exclude
Alternate Solution: Let me know if you think this is good enough: I took a look at the files that do need to be cleaned up by this command. Seems like if we can filter out the output from This kind of filtering is prone to being incomplete ... but it may be the only reliable way to do this clean-up w/o impacting the user's config / env-files. -- The idea is that we only Do you expect user project's config/env-files to live in our project's
Let me know your thoughts on this approach ... and whether we really need to be worried about config / env-files in the certifier root-dir.
This commit attempts to improve the behaviour of the rm_non_git_files sub-command of run_example.sh. The objective of this command is to clean up build outputs and other [stale] artifacts that may have been produced by a prior execution of this script. We may be, inadvertently, also deleting the user's config-/env-files that reside in the project's sub-tree. This commit changes the behaviour of rm_non_git_files to now only delete the files registered in the .gitignore list.
This commit attempts to improve the behaviour of the rm_non_git_files sub-command of run_example.sh. The objective of this command is to clean up build outputs and other [stale] artifacts that may have been produced by a prior execution of this script. We may be, inadvertently, also deleting the user's config-/env-files that reside in the project's sub-tree. This commit changes the behaviour of rm_non_git_files to now only delete the files registered in the .gitignore list. Also, "--dry-run rm_non_git_files" will now only print an info message without actually deleting any of the affected files.
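As an illustration of the "only delete what .gitignore lists" idea, here is a small Python sketch built on `git clean -X`, which removes only files ignored by Git. This is an assumption about one way to get that behaviour, not the actual implementation of rm_non_git_files in run_example.sh; the function names `ignored_files_clean_cmd` and `clean_ignored` are hypothetical.

```python
import subprocess
from typing import List


def ignored_files_clean_cmd(dry_run: bool = True) -> List[str]:
    """Build a `git clean` command that touches only git-ignored files.

    -d  also recurse into untracked directories
    -X  remove only files ignored by Git (not other untracked files)
    -n  dry run: list what would be removed; -f: actually remove
    """
    cmd = ["git", "clean", "-d", "-X"]
    cmd.append("-n" if dry_run else "-f")
    return cmd


def clean_ignored(repo_dir: str, dry_run: bool = True) -> str:
    """Run the command in repo_dir and return git's report (hypothetical helper)."""
    result = subprocess.run(
        ignored_files_clean_cmd(dry_run),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Default is a dry run, so nothing is deleted when trying this out.
    print(clean_ignored(".", dry_run=True))
```

With this approach, untracked files that are not matched by .gitignore (such as a user's own config or env files) are left alone, while files that are matched by an ignore pattern would still be removed.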
The newly added test.sh script is not working correctly on engineers' Linux dev env. There are some hard-coded dependencies in various build scripts that need to be corrected.
There are at least a couple of diff issues to resolve under this issue:
- PY_INCLUDE = -I /usr/include/python3.10/ needs to be fixed in all make files.
- LOCAL_LIB=/usr/local/lib64 from test.sh to sub-scripts / makefiles.
Here are some examples of failures:
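Regarding the hard-coded PY_INCLUDE item in the list above: one possible way to avoid pinning /usr/include/python3.10/ is to ask the Python interpreter itself where its headers live. The sketch below uses the standard-library `sysconfig` module; the function name `python_include_flag` is hypothetical, and how the result would be wired into the Makefiles is left open.

```python
import sysconfig


def python_include_flag() -> str:
    """Return a compiler include flag for the running interpreter's headers.

    Note: the directory is where Python.h is expected to be; it only
    contains the header if the Python development package is installed.
    """
    include_dir = sysconfig.get_paths()["include"]
    return f"-I{include_dir}"


if __name__ == "__main__":
    # Prints a flag such as -I<path to this interpreter's include dir>.
    print(python_include_flag())
```

The `python3-config --includes` command reports similar flags on systems where it is installed. Either way, a "Python.h: No such file or directory" error can still occur if the Python development headers are missing from the machine.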