[SYCL][E2E] Add functionality to split build and run of e2e tests #16016

ayylol · 2024-11-07T17:58:51Z

Adds functionality to split the compilation and execution of e2e tests across separate machines.

This functionality is enabled via the test-mode lit parameter. By default this is set to full, and tests are built and run on the same system, just like before. Setting test-mode to either run-only or build-only enables the test splitting.

Two new lit features have been added: run-mode which is enabled when either in run-only or full testing mode, and build-and-run-mode which is only enabled when in full testing mode.
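
As a rough illustration of the mapping described above, the sketch below shows how a test-mode value could translate into the two features. The helper name `features_for_test_mode` and the constant `VALID_TEST_MODES` are hypothetical and are not taken from lit.cfg.py; only the mode names and feature names come from this description.

```python
VALID_TEST_MODES = ("full", "run-only", "build-only")


def features_for_test_mode(test_mode="full"):
    """Return the lit features implied by a test-mode value.

    Hypothetical helper for illustration only; the real logic lives in
    lit.cfg.py and may be structured differently.
    """
    if test_mode not in VALID_TEST_MODES:
        raise ValueError(f"unknown test-mode: {test_mode!r}")

    features = set()
    # run-mode: enabled whenever tests are actually executed.
    if test_mode in ("full", "run-only"):
        features.add("run-mode")
    # build-and-run-mode: enabled only when build and run happen together.
    if test_mode == "full":
        features.add("build-and-run-mode")
    return features
```

With this model, build-only mode exposes neither feature, which is why `XFAIL: run-mode` has no effect there.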

In build-only mode, all tests that can be built on the system are built. When running a lit test in this mode, two key things change:

- All RUN: lines are executed unless they contain %{run}, %{run-unfiltered-devices} or %if run-mode.
- UNSUPPORTED and REQUIRES statements are ignored. Currently the only ways to mark a test as unsupported in build-only mode are to include build-and-run-mode in a REQUIRES statement or to use UNSUPPORTED: true.

In run-only mode tests are not built, only executed. Deciding whether a test is supported in this mode works the same as in full mode. To only execute the tests, all lit RUN: lines are ignored unless they contain %{run}, %{run-unfiltered-devices} or %if run-mode. Since tests are not built in this mode, a test can only pass if it was previously built in either build-only or full mode.
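
The RUN: line selection described for the two split modes can be sketched as a simple filter. This is a simplified model, not the code in format.py: the names `RUN_MARKERS`, `is_run_line` and `select_run_lines` are hypothetical, and the sketch treats each RUN: line as a whole rather than expanding `%if` substitutions within a line.

```python
# Substrings that mark a RUN: line as an execution step (from the description).
RUN_MARKERS = ("%{run}", "%{run-unfiltered-devices}", "%if run-mode")


def is_run_line(line):
    """True if the RUN: line executes the test rather than building it."""
    return any(marker in line for marker in RUN_MARKERS)


def select_run_lines(run_lines, test_mode):
    """Pick the RUN: lines that would be kept in the given test mode."""
    if test_mode == "full":
        return list(run_lines)
    if test_mode == "build-only":
        return [line for line in run_lines if not is_run_line(line)]
    if test_mode == "run-only":
        return [line for line in run_lines if is_run_line(line)]
    raise ValueError(f"unknown test-mode: {test_mode!r}")


lines = ["%{build} -o %t.out", "%{run} %t.out"]
print(select_run_lines(lines, "build-only"))  # keeps only the build line
print(select_run_lines(lines, "run-only"))    # keeps only the run line
```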

Some notes/current limitations:

- Currently only the spir64 triple is supported in build-only mode.
- If a test builds but fails at run time, it must be marked as XFAIL: run-mode so that it does not XPASS in build-only mode.
- The build-only mode can be run manually on CI with the SYCL E2E action by selecting the "Linux, build" runner and adding --param test-mode=build-only to LIT_OPTS. This PR does not include the CI changes needed to use the run-only mode.

ayylol · 2024-11-07T18:02:30Z

@uditagarwal97 @sarnex @aelovikov-intel

Closed #15728 and reopened it in this PR so that it is no longer a sycl-devops-pr, which runs unnecessary checks.

Going to leave this marked as draft until you guys review and are OK with the changes in format.py and lit.cfg.py, then mark it as open so that it pings the test owners afterwards.

https://github.com/intel/llvm/actions/runs/11725963631 is the latest dry-run of build only

sarnex · 2024-11-12T15:41:56Z

@ayylol Sorry just to double check, is this ready for review from us CI people or is it still in progress? Thx.

ayylol · 2024-11-12T15:49:25Z

> @ayylol Sorry just to double check, is this ready for review from us CI people or is it still in progress? Thx.

@sarnex Yep, it's ready for review. I'm just leaving it as draft in case any changes to the underlying logic made while addressing reviews lead to more/different test changes.

sarnex

some basic comments

sarnex · 2024-11-12T15:52:24Z

sycl/test-e2e/AmdNvidiaJIT/kernel_and_bundle.cpp

@@ -1,5 +1,6 @@
 // UNSUPPORTED: windows
 // REQUIRES: cuda || hip
+// REQUIRES: unsplit-mode


nit: we might want something more descriptive here if possible, e.g. UNSUPPORTED: split-test-build-and-run or something like that

Will address this comment last to leave it open to bike-shedding. Would something like unsplit-test-mode or unsplit-testing work? Should the other features (build-mode, run-mode) follow suit, or are they clear enough?

I don't think I'd want to put the word "split" in this particular feature, since that is exactly what it isn't.

Yeah, of course, I don't want to start bikeshedding. REQUIRES: unsplit-test-mode is fine with me.

Won't it be clearer to have something like only-build-test, only-run-test, build-and-run-test instead of build-mode, run-mode, and unsplit-test-mode?

> Won't it be clearer to have something like only-build-test, only-run-test, build-and-run-test instead of build-mode, run-mode, and unsplit-test-mode?

The run-mode feature is useful because it gives us common behaviour between the run-only and unsplit modes. If we didn't have it, we would have to mark tests that xfail at run time as XFAIL on both only-run-test and build-and-run-test.

I could not think of a valid use for the build-mode feature: if a test xfails at build time we would always also want to mark it as xfail at run time, since there would be no executable. A test in the e2e folder that uses REQUIRES: build-mode also seems a bit suspicious.

I added something like build-only/run-only/unsplit as a variable within the Python scripts, but not as a feature, since their usefulness in that context is unclear to me. Instead I'm using them to clean up if statements in the Python scripts where we want to check which split mode we are in.

sycl/test-e2e/DeviceArchitecture/device_architecture_comparison_on_device_aot.cpp

sycl/test-e2e/InvokeSimd/Spec/ImplicitSubgroup/tuple.cpp

sarnex · 2024-11-12T15:56:14Z

sycl/test-e2e/Matrix/SG32/joint_matrix_colA_rowB_colC.cpp

@@ -11,7 +11,7 @@
 // RUN: %{build} -o %t.out
 // RUN: %{run} %t.out

-// XFAIL:*
+// XFAIL: run-mode


Sorry if I asked this before, but I wonder if we can implement this inside the Python scripts instead of requiring test changes.

These changes are to differentiate between tests that fail when compiling and tests that fail when running. These tests are able to build, so marking them as XFAIL: run-mode makes sure they aren't reported as XPASS when building only.
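
To make the XPASS concern concrete, here is a deliberately simplified model of how an XFAIL annotation interacts with the available features. The function `lit_outcome` is hypothetical; real lit evaluates full boolean expressions in XFAIL lines, whereas this sketch only handles plain feature names and `*`.

```python
def lit_outcome(commands_passed, xfail_features, available_features):
    """Simplified result for a test given its XFAIL markup.

    commands_passed: whether every executed RUN: line succeeded.
    xfail_features:  feature names listed after XFAIL: ('*' matches always).
    """
    expected_to_fail = any(
        feature == "*" or feature in available_features
        for feature in xfail_features
    )
    if expected_to_fail:
        return "XPASS" if commands_passed else "XFAIL"
    return "PASS" if commands_passed else "FAIL"


# A test that builds fine but fails at run time, seen in build-only mode:
# only the build commands execute, so they pass, and run-mode is not available.
build_only_features = set()
print(lit_outcome(True, ["*"], build_only_features))         # XPASS
print(lit_outcome(True, ["run-mode"], build_only_features))  # PASS
```

In full mode the same test fails its run step while run-mode is available, so `XFAIL: run-mode` still yields XFAIL there.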

I want (very strongly) to start with explicit test markup before we introduce any heuristics in the lit logic, especially if it really is a heuristic and not some bullet-proof, correct logic.

I guess I'm just worried this feature will slow down/confuse developers writing tests. I'm willing to proceed with this as-is and see what happens, but we should be ready to revert/fix the problem fast if some issue comes up.

> I guess I'm just worried this feature will slow down/confuse developers writing tests. I'm willing to proceed with this as-is and see what happens, but we should be ready to revert/fix the problem fast if some issue comes up.

For now, since there is no automatic running on CI enabled, we'll just have to monitor the failures manually.

However, once this is integrated into the post/pre-commit, I think it should be pretty straightforward to figure out the appropriate action given the CI's behaviour:

- If a test passes the build stage but fails the run stage, we have a runfail, and if this is expected we would mark it as XFAIL/UNSUPPORTED on run-mode.
- If it fails on both the build and run stages, then we have a compfail and we would mark it as XFAIL/UNSUPPORTED * (or we could mark it as XFAIL for a feature that is non-device-specific, like linux).
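
The triage rule in the two cases above can be written down as a small decision function. This is only a sketch of the reasoning in this comment; `suggested_markup` is a hypothetical name and nothing like it is part of the PR. It returns the XFAIL form, with UNSUPPORTED being the alternative mentioned above.

```python
def suggested_markup(build_ok, run_ok):
    """Suggest test markup from the build-stage and run-stage results.

    Returns None when no markup is needed. A failed build implies a failed
    run, since there is no executable to run.
    """
    if not build_ok:
        # compfail: fails on both stages.
        return "XFAIL: *"
    if not run_ok:
        # runfail: builds, but fails when executed.
        return "XFAIL: run-mode"
    return None


print(suggested_markup(build_ok=True, run_ok=False))   # XFAIL: run-mode
print(suggested_markup(build_ok=False, run_ok=False))  # XFAIL: *
```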

sarnex · 2024-11-12T15:58:48Z

sycl/test-e2e/format.py

-            (backend, _) = sycl_device.split(":")
-            triples.add(get_triple(test, backend))
+        if "run-mode" not in test.config.available_features:
+            if "unsplit-mode" in test.requires or "TEMPORARY_DISABLED" in test.requires:


Maybe I'm reading the lit.cfg.py change wrong, but isn't run-mode && !unsplit-mode impossible? If it's intended to be an error check, I'm not sure how the user could ever configure lit to be in that state.

This bit is for build-only mode, so that only the tests that are marked as REQUIRES: unsplit-mode or REQUIRES: TEMPORARY_DISABLED are reported as unsupported.

I guess it might be useful if I added a feature for run-only and build-only to avoid confusion in if statements like this

Ah sorry, I missed that we were checking the test requires, thx.

I added the variable test.config.split_mode to hopefully make these types of if statements less confusing to read (replaced "run-mode" not in available_features with split_mode == "build-only").
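
A minimal sketch of the check discussed in this thread, assuming the renamed feature unsplit-test-mode and the split_mode variable mentioned above. The function `is_unsupported` and the constant `HONORED_IN_BUILD_ONLY` are hypothetical, and REQUIRES entries are treated as plain feature names rather than the boolean expressions lit actually supports.

```python
# REQUIRES entries that still make a test unsupported in build-only mode.
HONORED_IN_BUILD_ONLY = ("unsplit-test-mode", "TEMPORARY_DISABLED")


def is_unsupported(requires, available_features, split_mode):
    """Decide whether a test is unsupported, given its REQUIRES entries."""
    if split_mode == "build-only":
        # Device-specific requirements are ignored when only building.
        return any(req in HONORED_IN_BUILD_ONLY for req in requires)
    # Otherwise every required feature must be available.
    return any(req not in available_features for req in requires)


print(is_unsupported(["cuda"], set(), "build-only"))               # False
print(is_unsupported(["unsplit-test-mode"], set(), "build-only"))  # True
```

Comparing against split_mode directly avoids inferring the mode from the absence of run-mode, which is what caused the confusion in the first place.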

uditagarwal97

lit.cfg.py and format.py changes LGTM too. Just one minor comment.

sycl/test-e2e/lit.cfg.py

ayylol · 2024-11-15T19:51:37Z

Closed the following PRs to include their changes in this PR: #15960, #15950, #15891, #15787, #15789

Doing this all in one PR so that all the changes needed to cleanly run the build-only mode on CI are included in one single commit.

YuriPlyakhin

Joint Matrix changes LGTM

againull · 2024-11-20T18:11:16Z

@intel/bindless-images-reviewers @intel/syclcompat-lib-reviewers @intel/unified-runtime-reviewers Friendly ping.

sycl/test-e2e/DeviceArchitecture/device_architecture_comparison_on_device_aot.cpp

ayylol requested review from sarnex, uditagarwal97 and aelovikov-intel November 7, 2024 17:58

ayylol mentioned this pull request Nov 7, 2024

[SYCL][E2E] Add functionality to split build and run of e2e tests #15728 (Closed)

Add functionality to split build and run of e2e tests

7432d1c

ayylol added 5 commits November 7, 2024 10:54

Add REQUIRES: unsplit-mode to tests that failed building

be37e41

Add XFAIL: run-mode to xfail tests that are able to build

be6805f

Add XFAIL: * to tests that fail building but are marked as XFAIL in different ways

10d2c4c

Add %if run-mode to lines that should not run in build-mode

4eb0d31

Add %{run} to tests that did not use it (added a run-line to one test as well)

0ed6008

ayylol force-pushed the e2e-split branch from eb7e7ac to 0ed6008 November 7, 2024 19:00

Remove added RUN line from test

dd77b6d

sarnex reviewed Nov 12, 2024

ayylol added 3 commits November 12, 2024 10:33

Merge branch 'sycl' into e2e-split

364994d

Change XFAIL for DeviceArchitecture test

29048da

Replace unsplit-mode with unsplit-test-mode

b8e6cfc

Add test.config.split_mode variable to clean up if statements

0f19c31

uditagarwal97 reviewed Nov 15, 2024

sycl/test-e2e/lit.cfg.py (outdated, resolved)

Address reviewer comments

5fcb771

ayylol marked this pull request as ready for review November 15, 2024 19:51

ayylol requested review from a team as code owners November 15, 2024 19:51

ayylol requested a review from uditagarwal97 November 15, 2024 19:51

sarnex approved these changes Nov 15, 2024

YuriPlyakhin approved these changes Nov 15, 2024

Merge branch 'sycl' into e2e-split

f3fc08f

Alcpz approved these changes Nov 21, 2024

aarongreig approved these changes Nov 21, 2024

Merge branch 'sycl' into e2e-split

f1b98c3

ProGTX approved these changes Nov 22, 2024

againull merged commit f279a8a into sycl Nov 22, 2024
14 checks passed

ayylol deleted the e2e-split branch November 22, 2024 15:06

uditagarwal97 reviewed Nov 22, 2024

sycl/test-e2e/DeviceArchitecture/device_architecture_comparison_on_device_aot.cpp (resolved)

0x12CC mentioned this pull request Nov 28, 2024

Structural pattern matching in lit.cfg.py #16211 (Open)
