forked from intel/llvm
[SYCL][Graph] Implementation of whole graph update #365
Closed
…pp (intel#13171) All it's doing is setting doubleGRF, so just do that using the first-class API. Manually tested this on Windows. --------- Signed-off-by: Sarnie, Nick <[email protected]>
Tested manually on Win/Lin with many runs, doesn't hang anymore. Closes: intel#8815 Signed-off-by: Sarnie, Nick <[email protected]>
Per OSSF (https://securityscorecards.dev/viewer/?uri=github.com/intel/llvm), all workflows should have a default top-level permission set. We set it as recommended to `permissions: contents: read`; then, within the actual jobs, we added additional privileges where needed. These changes were generated by the recommended OSSF tool. This PR changes only the workflows created/owned by the intel/llvm repo. A separate PR will address the issues found in the workflows inherited from llvm/llvm-project.
…ported on Native CPU (intel#13109) Similarly to what is done for `nvptx` in intel#13015, Native CPU maps `private` and `generic` to the same address space, so we need to avoid getting multiple definitions for the libclc builtins that use `generic`.
Previously we were hard-coding an -O2 optimization level for the 'signbit' builtin for all versions of GCC. Despite this workaround, I found locally that I was unable to build with GCC versions 12.2, 12.3, and 13.2. Reducing the optimization level to -O1 allowed me to progress. This seems to follow the bug report already linked, which had test cases at -O2 which were also failing. With this in mind, we can also restrict the GCC versions we apply the workaround to, so that more modern compilers should "just work" without us having to do anything. That should save someone having to investigate a performance report a year or so down the line...
This commit fixes the problem of missing build dependencies between libclc source files and their various includes. We would like to do this with compiler-generated dependency files because then the dependencies are accurate and there are no false positives, which would lead to unnecessary rebuilds. This is how regular C/C++ dependencies are usually tracked by CMake. Note that this variable is an internal API so is not guaranteed to work, but then again *all* of CMake's support for new languages (which we use for CLC/LL languages) is an internal API. On balance this change is probably worth it due to how minimally invasive it is. The alternative would be to either:
1. list/glob all possible files in the directory as dependencies, which would lead to false positives.
2. rewrite the library generation as a loop over all files, calling `add_custom_command`, which can produce a dependency file (by tweaking our clang command line) that can also be fed back to the same command via the `DEPFILE` argument. This would be a much larger change and is not as "neat".
When a non-blocking pipe operation fails, CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST is expected. The runtime needs to handle that case instead of throwing the exception.
The OSSF tool sucks; don't use its recommended default settings. It suggested `permissions: contents: read` as the default, but that broke most of our workflows. Instead, use the GitHub-recommended `permissions: read-all`.
…13045) XPTI has unit tests that time the cost of each individual framework action, but an E2E timing test isn't available. This PR adds a new sample collector that shows how data can be pulled from the SYCL runtime using XPTI and provides timing information for the callback handler costs/event. Allows:
1. Zero cost application with XPTI_TRACE_ENABLE=0
2. Zero cost callback handlers when run in calibration mode
3. Full E2E test when run with "--format none", which gives the average cost of callback handlers/event
--------- Signed-off-by: Vasanth Tovinkere <[email protected]>
…ements (intel#13019) We have a report of persistent cache failures. Traced to the directory creation so I switched it to use C++17 std::filesystem routines for `OSUtil::makeDir`. Also improved trace reporting.
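As a rough illustration of the kind of change this commit describes, here is a minimal sketch of directory creation with C++17 `std::filesystem`. The helper name `make_dir_sketch` and its return convention are assumptions made for this example; this is not the actual `OSUtil::makeDir` code from the runtime.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper for illustration only; not the runtime's OSUtil::makeDir.
// Creates `dir` together with any missing parent directories and returns
// true if the directory exists when the call returns.
inline bool make_dir_sketch(const std::string &dir) {
  if (dir.empty())
    return false;

  std::error_code ec;
  // The error_code overload reports failures through `ec` instead of throwing.
  std::filesystem::create_directories(dir, ec);
  if (ec)
    return false;

  // create_directories returns false when nothing new was created (for
  // example, the path already existed), so check for existence explicitly.
  return std::filesystem::is_directory(dir, ec) && !ec;
}
```

Using the non-throwing overloads keeps the failure path explicit, which makes it easier to attach trace output at the point where creation fails.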
…ntel#13196) Signed-off-by: Klochkov, Vyacheslav N <[email protected]>
…oading (intel#13083) Based on discussions with various stakeholders, we concluded that spirv32/spirv64 are the best-suited strings for target architectures when a user wants to generate JIT code for Intel backends using the DPCPP compiler. This PR adds changes to allow the DPCPP compiler to accept spirv32/spirv64 as valid target architecture strings. spir/spir64 are also valid target architecture strings, but will be deprecated in a future commit. This change will help us to align with the SPIR-V backend behavior and ensure smoother SYCL upstreaming. Currently, only JIT triples using spirv32/spirv64 are supported. AOT triples using spirv32/spirv64 will be added soon. Thanks --------- Signed-off-by: Sudarsanam, Arvind <[email protected]>
Updates the git tag for the oneAPI Construction Kit.
Replace check for cv-unqualified object types with a check for cv-unqualified trivial types to be in line with the `sycl_ext_oneapi_private_alloca` extension specification: > `ElementType` must be a cv-unqualified trivial type --------- Signed-off-by: Victor Perez <[email protected]>
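The difference between the two checks can be sketched with standard type traits. The trait name `is_cv_unqualified_trivial_v` and the sample types below are hypothetical, introduced only for this example; the actual check in the compiler may be written differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait for illustration: true only for trivial types that
// carry no const or volatile qualifier.
template <typename T>
inline constexpr bool is_cv_unqualified_trivial_v =
    std::is_trivial_v<T> && !std::is_const_v<T> && !std::is_volatile_v<T>;

// Sample types (assumptions for this sketch).
struct Pod {
  int x;
  float y;
};

struct NonTrivial {
  NonTrivial() {} // user-provided constructor makes the type non-trivial
  int x;
};

static_assert(is_cv_unqualified_trivial_v<int>);
static_assert(is_cv_unqualified_trivial_v<Pod>);
static_assert(!is_cv_unqualified_trivial_v<const int>);
static_assert(!is_cv_unqualified_trivial_v<volatile Pod>);

// NonTrivial is an object type, so an object-type check would accept it,
// while the trivial-type check rejects it.
static_assert(std::is_object_v<NonTrivial>);
static_assert(!is_cv_unqualified_trivial_v<NonTrivial>);
```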
…ntel#13202) Signed-off-by: Klochkov, Vyacheslav N <[email protected]>
This implements the `get_backend_info()` functions for our SYCL implementation, based on the SYCL 2020 spec (https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html; search for "get_backend_info()" there for the specification of these functions). There are six groups of variations of this function, namely `sycl::platform::get_backend_info()`, `sycl::context::get_backend_info()`, `sycl::device::get_backend_info()`, `sycl::queue::get_backend_info()`, `sycl::event::get_backend_info()`, and `sycl::kernel::get_backend_info()`. One known concern: it seems that `sycl::platform`, `sycl::context` and `sycl::kernel` may have multiple associated devices, but according to the spec the return type of `sycl::xxx::get_backend_info<info::device::version>()` should be `std::string` (i.e. a single device version), so I'm just returning the version of the first associated device in the list. Is this OK? --------- Signed-off-by: Hu, Peisen <[email protected]>
* Update the test to initialize the input vectors with 0s to match the `bindless_helpers::fill_rand` requirement of a non-empty vector.
* Change the name of function `initVector` to `init_vector`.
* Move `init_vector`, `equal_vec` and `operator<<` into header `bindless_helpers.hpp`.
…able to return result into a different matrix (intel#13151) Currently, CUDA code that uses this pattern: `for (int i = 0; i < c_frag.num_elements; i++) { c_frag.x[i] = alpha * acc_frag.x[i] + beta * c_frag.x[i]; }` cannot be migrated to SYCL joint matrix. The added overload addresses this. The spec API is added in intel#13153.
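The shape of the pattern, an element-wise operation that reads one fragment and writes into another, can be sketched in plain C++. The names `apply_into` and `scale_and_accumulate` are hypothetical, and `std::array` merely stands in for a matrix fragment; this is not the SYCL joint matrix API or the overload added by this commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration: visits matching elements of a
// read-only source and a writable destination, so the result can land in
// a different container than the one being read.
template <typename T, std::size_t N, typename F>
void apply_into(const std::array<T, N> &src, std::array<T, N> &dst, F &&f) {
  for (std::size_t i = 0; i < N; ++i)
    f(src[i], dst[i]);
}

// Mirrors the CUDA loop from the commit message:
//   c[i] = alpha * acc[i] + beta * c[i]
// `c` is taken by value, so the caller's copy is left unchanged.
inline std::array<float, 4> scale_and_accumulate(const std::array<float, 4> &acc,
                                                 std::array<float, 4> c,
                                                 float alpha, float beta) {
  apply_into(acc, c, [=](const float &a, float &out) {
    out = alpha * a + beta * out;
  });
  return c;
}
```

With a single in-place callback there is no way to read `acc` while writing `c`, which is why the two-operand form is needed for this pattern.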
After intel@370aa2a, the grf_size control values changed to `128` and `256` instead of values like "small" and "large":
> 2) Adds two new kernel properties
> `sycl::ext::intel::experimental::grf_size` and
> `sycl::ext::intel::experimental::grf_size_automatic`, as per the spec.
> `grf_size` adds the `sycl-grf-size` metadata with a value of the
> template parameter **(`128` or `256`)**. `grf_size_automatic` adds the
> `sycl-grf-size` metadata with a value of `0`.

The user is expected to specify the value like this:
`syclex::properties kernel_properties{intelex::grf_size<128>};`
`syclex::properties kernel_properties{intelex::grf_size<256>};`
Apply clang-format to llvm.bitreverse lowering testcase --------- Signed-off-by: Lu, John <[email protected]>
This is the first PR in preparation for enabling the dev IGC test for some of the SYCL tests. Ref: intel#11552 Tested https://github.com/intel/llvm/actions/runs/8461815185/job/23182202059
Fix a broken link found when reading https://intel.github.io/llvm-docs/EnvironmentVariables.html
…#12088) The description is no longer correct. Default values have changed. Ref: https://github.com/oneapi-src/unified-runtime/blob/main/source/common/umf_pools/disjoint_pool_config_parser.cpp#L27
No description provided.