[SYCL][Graph] Implementation of whole graph update #365

fabiomestre · 2024-03-31T21:08:43Z

No description provided.

Per OSSF (https://securityscorecards.dev/viewer/?uri=github.com/intel/llvm), all workflows should have a default top-level permission set. We set it, as recommended, to `permissions: contents: read`, and then added additional privileges within individual jobs where needed. These changes were generated by the recommended OSSF tool. This PR changes only the workflows created/owned by the intel/llvm repo. A separate PR will address the issues found in the workflows inherited from llvm/llvm-project.

…ported on Native CPU (intel#13109) Similarly to what is done for `nvptx` in intel#13015, Native CPU maps `private` and `generic` to the same address spaces, so we need to avoid getting multiple definitions for the libclc builtins that use `generic`.

Previously we were hard-coding an -O2 optimization level for the 'signbit' builtin for all versions of GCC. Despite this workaround, I found locally that I was unable to build with GCC versions 12.2, 12.3, and 13.2. Reducing the optimization level to -O1 allowed me to progress. This seems to follow the bug report already linked, which had test cases at -O2 which were also failing. With this in mind, we can also restrict the GCC versions we apply the workaround to, so that more modern compilers should "just work" without us having to do anything. That should save someone having to investigate a performance report a year or so down the line...

This commit fixes the problem of missing build dependencies between libclc source files and their various includes. We would like to do this with compiler-generated dependency files because then the dependencies are accurate and there are no false positives, which would lead to unnecessary rebuilds. This is how regular C/C++ dependencies are usually tracked by CMake. Note that this variable is an internal API so is not guaranteed to work, but then again *all* of CMake's support for new languages (which we use for CLC/LL languages) is an internal API. On balance this change is probably worth it due to how minimally invasive it is. The alternative would be to either:

1. list/glob all possible files in the directory as dependencies, which would lead to false positives.
2. rewrite the library generation as a loop over all files, calling `add_custom_command` for each, which can produce a dependency file (by tweaking our clang command line) that can also be fed back to the same command via the `DEPFILE` argument. This would be a much larger change and is not as "neat".

…13045) XPTI has unit tests that time the cost of each individual framework action, but an E2E timing test isn't available. This PR adds a new sample collector that shows how data can be pulled from the SYCL runtime using XPTI and provides timing information for the callback handler costs/event. Allows:

1. Zero cost application with XPTI_TRACE_ENABLE=0
2. Zero cost callback handlers when run in calibration mode
3. Full E2E test when run with "--format none", which gives the average cost of callback handlers/event

--------- Signed-off-by: Vasanth Tovinkere <[email protected]>

…oading (intel#13083) Based on discussions with various stakeholders, we concluded that spirv32/spirv64 are the best-suited strings for target architectures when a user wants to generate JIT code for Intel backends using the DPCPP compiler. This PR adds changes to allow the DPCPP compiler to accept spirv32/spirv64 as valid target architecture strings. spir/spir64 are also valid target architecture strings, but will be deprecated in a future commit. This change will help us align with the SPIR-V backend behavior and ensure smoother SYCL upstreaming. Currently, only JIT triples using spirv32/spirv64 are supported. AOT triples using spirv32/spirv64 will be added soon. Thanks --------- Signed-off-by: Sudarsanam, Arvind <[email protected]>

Replace check for cv-unqualified object types with a check for cv-unqualified trivial types to be in line with the `sycl_ext_oneapi_private_alloca` extension specification:

> `ElementType` must be a cv-unqualified trivial type

--------- Signed-off-by: Victor Perez <[email protected]>
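A minimal sketch of what a "cv-unqualified trivial type" check can look like with standard C++17 type traits. The trait name `is_cv_unqualified_trivial_v` and the example structs are hypothetical and only illustrate the constraint quoted from the extension specification; this is not the code the commit adds.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper (not from the PR): true when T is a trivial type
// that carries no const/volatile qualifier.
template <typename T>
inline constexpr bool is_cv_unqualified_trivial_v =
    std::is_trivial_v<T> && std::is_same_v<T, std::remove_cv_t<T>>;

// Example types, assumed for illustration only.
struct Pod { int x; float y; };               // trivial
struct NonTrivial { NonTrivial() {} int x; }; // user-provided ctor: not trivial

static_assert(is_cv_unqualified_trivial_v<int>);
static_assert(is_cv_unqualified_trivial_v<Pod>);
static_assert(!is_cv_unqualified_trivial_v<const int>);  // cv-qualified
static_assert(!is_cv_unqualified_trivial_v<NonTrivial>); // object type, but not trivial
```

`NonTrivial` is an object type, so a check for cv-unqualified object types alone would accept it, while the trivial-type check rejects it.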

Implements the `get_backend_info()` functions for our SYCL implementation, based on the SYCL 2020 spec (https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html; search for "get_backend_info()" there to find the spec for these functions). There are six groups of variations of this function, namely `sycl::platform::get_backend_info()`, `sycl::context::get_backend_info()`, `sycl::device::get_backend_info()`, `sycl::queue::get_backend_info()`, `sycl::event::get_backend_info()`, and `sycl::kernel::get_backend_info()`. One known concern: sycl::platform, sycl::context and sycl::kernel may have multiple associated devices, but according to the spec the return type of `sycl::xxx::get_backend_info<info::device::version>()` should be std::string (i.e. a single device version), so I'm just returning the version of the first associated device in the list. Is this OK? --------- Signed-off-by: Hu, Peisen <[email protected]>

* Update the test to initialize the input vectors with 0s to match the `bindless_helpers::fill_rand` requirement of a non-empty vector.
* Rename the function `initVector` to `init_vector`.
* Move `init_vector`, `equal_vec` and `operator<<` into the header `bindless_helpers.hpp`.

…able to return result into a different matrix (intel#13151) Currently, CUDA code that uses this pattern cannot be migrated to SYCL joint matrix: `for (int i = 0; i < c_frag.num_elements; i++) { c_frag.x[i] = alpha * acc_frag.x[i] + beta * c_frag.x[i]; }` The added overload addresses this. The spec API is added in intel#13153.
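The element-wise pattern above can be modelled in plain C++ as follows. This is only a sketch of the idea of writing the result into a separate destination: `Fragment` and `apply_into` are hypothetical names, and neither is the SYCL joint matrix API nor the overload this commit adds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a matrix fragment; not the joint_matrix type.
template <typename T, std::size_t N>
struct Fragment {
  std::array<T, N> x{};
};

// Hypothetical helper: computes op(a[i], b[i]) for every element and stores
// the result in dst, which may be a different fragment than both inputs
// (or may alias one of them, as c_frag does in the CUDA loop).
template <typename T, std::size_t N, typename Op>
void apply_into(const Fragment<T, N> &a, const Fragment<T, N> &b,
                Fragment<T, N> &dst, Op op) {
  for (std::size_t i = 0; i < N; ++i)
    dst.x[i] = op(a.x[i], b.x[i]);
}
```

With `alpha` and `beta` captured in a lambda, a call such as `apply_into(acc_frag, c_frag, c_frag, [=](float acc, float c) { return alpha * acc + beta * c; })` reproduces the CUDA loop in this model.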

After intel@370aa2a, the `grf_size` control values changed to `128` and `256` instead of values like "small" and "large":

> 2) Adds two new kernel properties `sycl::ext::intel::experimental::grf_size` and `sycl::ext::intel::experimental::grf_size_automatic`, as per the spec. `grf_size` adds the `sycl-grf-size` metadata with a value of the template parameter **(`128` or `256`)**. `grf_size_automatic` adds the `sycl-grf-size` metadata with a value of `0`.

The user is expected to specify the value like this:

`syclex::properties kernel_properties{intelex::grf_size<128>};`
`syclex::properties kernel_properties{intelex::grf_size<256>};`

uditagarwal97 and others added 29 commits March 27, 2024 16:12

[SYCL] Augment sycl-ls.test to increase code coverage. (intel#13165)

ca784f5

[SYCL][ESIMD][E2E] Remove setenv call from lsc_usm_atomic_cachehint.c…

caa6df8

…pp (intel#13171) All it's doing is setting doubleGRF, just do that using the first-class API. Manually tested this on Win. --------- Signed-off-by: Sarnie, Nick <[email protected]>

[SYCL][ESIMD][E2E] Re-enable fp_call_from_func.cpp (intel#13180)

13ea567

Tested manually on Win/Lin with many runs, doesn't hang anymore. Closes: intel#8815 Signed-off-by: Sarnie, Nick <[email protected]>

[SYCL][GRAPH] Fix minor Coverity performance issue (intel#13179)

d7bdb68

[SYCL][E2E] Switch some of bindless_images/* tests to use <sycl/detai…

db6a05d

…l/core.hpp> (intel#13131)

[UR] Refactor Device Initialisation (intel#12762)

f64a32a

oneapi-src/unified-runtime#1363

[SYCL] Fix error handling in non-blocking pipe operations (intel#13166)

a1c1e04

When a non-blocking pipe operation fails, CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST is expected. The runtime needs to handle that case instead of throwing the exception.
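A rough sketch of the behaviour described above: the expected failure status of a non-blocking pipe operation is reported as an ordinary "did not succeed" result, while any other error still throws. The function name and the surrounding structure are hypothetical and are not the runtime's actual code; the numeric value -14 is the OpenCL definition of `CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST`.

```cpp
#include <cassert>
#include <stdexcept>

// Status codes as defined by OpenCL.
constexpr int kSuccess = 0;                             // CL_SUCCESS
constexpr int kExecStatusErrorForEventsInWaitList = -14; // CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST

// Hypothetical helper: converts the status of a non-blocking pipe operation
// into a success flag. The expected failure status is not treated as an
// exceptional condition; anything else still throws.
inline bool nonblocking_pipe_succeeded(int status) {
  if (status == kSuccess)
    return true;
  if (status == kExecStatusErrorForEventsInWaitList)
    return false; // expected failure of a non-blocking pipe operation
  throw std::runtime_error("unexpected error in pipe operation");
}
```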

[CI] Fix bad OSSF recomendations (intel#13187)

bf93fbd

The OSSF tool's recommended default settings do not work for us, so don't use them. It suggested `permissions: contents: read` as the default, but that broke most of our workflows. Use the GitHub-recommended `permissions: read-all` instead.

[SYCL] persistent cache fix - directory creation and reporting improv…

f6e73e8

…ements (intel#13019) We have a report of persistent cache failures. They were traced to the directory creation, so I switched `OSUtil::makeDir` to use C++17 std::filesystem routines. Also improved trace reporting.
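For illustration, directory creation with C++17 `std::filesystem` can be sketched as below. The helper `make_dir` is a hypothetical stand-in and not the actual `OSUtil::makeDir` implementation; it only shows the non-throwing `create_directories` overload, which also creates missing parent directories.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical stand-in for a makeDir-style helper (not the PR's code).
// Returns true if the directory exists when the call returns.
inline bool make_dir(const std::string &path) {
  namespace fs = std::filesystem;
  std::error_code ec;
  // Creates the directory and any missing parents; reports failure via ec
  // instead of throwing. An already existing directory is not an error.
  fs::create_directories(path, ec);
  if (ec)
    return false;
  // Confirm the path really is a directory (e.g. not a regular file).
  return fs::is_directory(path, ec) && !ec;
}
```

Checking the result with `is_directory` rather than relying on the return value of `create_directories` keeps the helper idempotent, since `create_directories` returns `false` when the directory already exists.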

[ESIMD][NFC][DOC] Add 'restrictions' section to gather/scatter() doc (i…

2469975

…ntel#13196) Signed-off-by: Klochkov, Vyacheslav N <[email protected]>

[SYCL][ESIMD] Remove no-fast-math-option from test (intel#13167)

7d77f84

[SYCL][NATIVECPU] Update OCK tag (intel#13188)

ba5feec

Updates the git tag for the oneAPI Construction Kit.

[ESIMD][NFC][DOC] Add 'restriction' section to atomic_update() doc (i…

d4045be

…ntel#13202) Signed-off-by: Klochkov, Vyacheslav N <[email protected]>

[SYCL][NFC] Apply clang-format to bitreverse test (intel#13095)

d6e4a42

Apply clang-format to llvm.bitreverse lowering testcase --------- Signed-off-by: Lu, John <[email protected]>

[CI] Add IGC dev as new dependency (intel#13184)

2f03ef8

This is the first PR in preparation for enabling dev IGC testing for some of the SYCL tests. Ref: intel#11552. Tested: https://github.com/intel/llvm/actions/runs/8461815185/job/23182202059

[SYCL][Doc] Correct range-rounding link (intel#13139)

9bfb172

Fixes a broken link found while reading https://intel.github.io/llvm-docs/EnvironmentVariables.html

[SYCL][L0] Update SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR description (intel…

cefbadd

…#12088) The description is no longer correct. Default values have changed. Ref: https://github.com/oneapi-src/unified-runtime/blob/main/source/common/umf_pools/disjoint_pool_config_parser.cpp#L27

[SYCL][Graph] Implementation of whole graph update

6e98293

fabiomestre closed this Mar 31, 2024
