Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add e2e test for queue::fill with a range of pattern sizes #15991

Open
wants to merge 3 commits into
base: sycl
Choose a base branch
from

Conversation

rafbiels
Copy link
Contributor

@rafbiels rafbiels commented Nov 5, 2024

Update the UR tag to include oneapi-src/unified-runtime#2273 fixing queue::fill for the CUDA and HIP backends. It was previously producing incorrect outputs for any pattern size other than 1, 2, or a multiple of 4 bytes. A new optimisation is also added which speeds up the fill greatly if the pattern equals to the first word repeated throughout (e.g. all zeros). See the UR PR for more details.

The UR tag update was collected in #16040 so now this PR only adds an e2e test as stated below.

Add a new e2e test to validate queue::fill outputs for any pattern size between 1 and 32 bytes. This test fails for CUDA and HIP before the UR change and passes with this PR. Other backends already worked correctly.

sycl/test-e2e/USM/fill_any_size.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/USM/fill_any_size.cpp Outdated Show resolved Hide resolved
sycl/test-e2e/USM/fill_any_size.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@maarquitos14 maarquitos14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@callumfare
Copy link
Contributor

Hi @rafbiels - I'd like to merge the UR side of this soon. Could you rebase this PR and ensure the CI is run? If it passes I'll merge the UR PR and update the tag here.

@rafbiels
Copy link
Contributor Author

Hi @callumfare, I rebased both the UR side (onto the current tag used in intel/llvm) and here (onto HEAD of sycl branch) to keep the branches consistent. Feel free to launch the CI as I don't have the permission to do so.

@callumfare
Copy link
Contributor

@rafbiels Thanks, I ran the CI but it looks like the new test failed on Windows

@rafbiels
Copy link
Contributor Author

rafbiels commented Nov 15, 2024

@rafbiels Thanks, I ran the CI but it looks like the new test failed on Windows

Looks like this is the only workflow compiling LIT tests with -Werror and the warnings are just printf type format mismatch after updating variables to size_t (as requested in review). I'll push a fix shortly!

I'm not quite sure where this configuration difference comes from. Running llvm-lit with default config on Linux doesn't add -Werror, but that's for another discussion.

I'm wrong, it's much more trivial (and annoying) than that. Both platforms use -Werror but on Linux sizeof(unsigned long) == sizeof(size_t) (so no warning) but on Windows that's not the case! 🤦 Anyway, fix coming in shortly.

@rafbiels
Copy link
Contributor Author

The failure should be fixed now, please restart the CI when possible

@rafbiels
Copy link
Contributor Author

@callumfare looks like the UR change was collected into intel/llvm as part of #16040, but I would still like the new e2e test here to be merged. Shall I rebase and update this PR to just add the test now (no UR tag change)?

…e::fill

Update the UR tag to fix queue::fill for the CUDA and HIP backends, which was
previously producing incorrect outputs for any pattern size other than 1, 2,
or a multiple of 4 bytes. A new optimisation is also added which speeds up
the fill greatly if the pattern equals to the first word repeated throughout
(e.g. all zeros).

Add a new e2e test to validate queue::fill outputs for any pattern size
between 1 and 32 bytes. This test fails for CUDA and HIP before the UR change
and passes with this PR. Other backends already worked correctly.
@rafbiels rafbiels changed the title [UR][CUDA][HIP] Fix incorrect outputs and improve performance of queue::fill Add e2e test for queue::fill with a range of pattern sizes Nov 22, 2024
@rafbiels
Copy link
Contributor Author

@intel/llvm-gatekeepers could I have a re-run of the CI here and potentially a re-approval please?

The code hasn't changed since the last approval. The only difference is that this PR is no longer updating the unified-runtime tag because the corresponding UR change was already merged elsewhere. It's just the e2e test now.

@rafbiels
Copy link
Contributor Author

rafbiels commented Nov 29, 2024

I'm looking into the failure on 12th-gen CPU with OpenCL:
https://github.com/intel/llvm/actions/runs/11971631535/job/33695502538?pr=15991
At the moment this doesn't look to be an issue with the test itself, but rather a backend issue exposed by the new test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants