[SYCL] Add force range rounding option and introduce new compiler flag #12715
CUDA should fail because of additional checks in |
This commit adds a new range rounding preference, force: when the compile flag is set to this value, only the range-rounded parallel_for kernel is generated. This can make binaries smaller, as SYCL range kernels are no longer duplicated across range-rounded and unrounded versions. I have also added the flag -fsycl-range-rounding, which can take the values on, force or disable. This flag aims to supersede the -fsycl-disable-range-rounding flag. I have also extended existing tests to check the functionality of the new flag and refactored the range rounding sycl-e2e test.
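For readers unfamiliar with the two kernel versions mentioned above, here is a minimal host-side sketch in plain C++ (no SYCL) of the difference between an unrounded and a range-rounded launch. The helper names `round_up`, `launch_unrounded` and `launch_rounded`, and the rounding factor of 32 used in the usage note, are assumptions made for illustration; this is not the code in handler.hpp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of factor.
std::size_t round_up(std::size_t n, std::size_t factor) {
    assert(factor != 0 && "rounding factor must be non-zero");
    return ((n + factor - 1) / factor) * factor;
}

// "Unrounded" version: one invocation per index in the user's range [0, n).
// Returns the number of work-items launched.
template <typename Kernel>
std::size_t launch_unrounded(std::size_t n, Kernel kernel) {
    for (std::size_t i = 0; i < n; ++i)
        kernel(i);
    return n;
}

// "Range-rounded" version: the launch range is padded up to a multiple of
// factor, and a bounds check keeps the padding work-items from running the
// user's kernel body. Returns the (padded) number of work-items launched.
template <typename Kernel>
std::size_t launch_rounded(std::size_t n, std::size_t factor, Kernel kernel) {
    const std::size_t padded = round_up(n, factor);
    for (std::size_t i = 0; i < padded; ++i)
        if (i < n) // guard: skip indices outside the original range
            kernel(i);
    return padded;
}
```

With a range of 1000 and an assumed factor of 32, the rounded launch covers 1024 indices but still runs the kernel body only 1000 times. Generating both versions for every range kernel is the duplication that force is meant to avoid.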
Add some description of how the -fsycl-range-rounding flag should be used.
- Change if else to switch in integration header emission and init preprocessor
- Change comment in handler.hpp
- Change comments and use static_assert with message in test-e2e
- Change enum to have no defined int values
- Wrap long line in LangOptions.def
Makes -fsycl-range-rounding= accessible to driver calls, not just cc1 invocations.
Range rounding disable is tested in another test.
Range rounding is disabled for -O0 but now a user preference for range rounding can override this.
Make sure that if -fsycl-range-rounding=force is used, there is no emission of the unrounded range kernel at -O0 and -Od
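The interaction between the flag value and the optimization level described in the commits above can be modelled roughly as follows. This is a sketch of the described behaviour, not the compiler's implementation: the names `RangeRounding`, `EmittedKernels`, `parse_range_rounding`, `effective_preference` and `emitted_kernels` are hypothetical, and the assumption that on is the default when optimizations are enabled is inferred from the description rather than stated in it.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical model of the three values accepted by -fsycl-range-rounding=.
enum class RangeRounding { On, Disable, Force };

// Which parallel_for kernel versions end up in the binary.
struct EmittedKernels {
    bool rounded;
    bool unrounded;
};

// Map the flag's value string to a preference; unknown strings give nullopt.
std::optional<RangeRounding> parse_range_rounding(std::string_view value) {
    if (value == "on") return RangeRounding::On;
    if (value == "disable") return RangeRounding::Disable;
    if (value == "force") return RangeRounding::Force;
    return std::nullopt;
}

// An explicit user preference wins. Without one, range rounding is disabled
// when optimizations are off (-O0 / -Od) and assumed to be on otherwise.
RangeRounding effective_preference(std::optional<RangeRounding> user,
                                   bool optimizations_disabled) {
    if (user) return *user;
    return optimizations_disabled ? RangeRounding::Disable : RangeRounding::On;
}

// On keeps both versions, Disable keeps only the unrounded kernel,
// Force keeps only the rounded kernel.
EmittedKernels emitted_kernels(RangeRounding pref) {
    switch (pref) {
    case RangeRounding::On:      return {true, true};
    case RangeRounding::Disable: return {false, true};
    case RangeRounding::Force:   return {true, false};
    }
    assert(false && "unhandled RangeRounding value");
    return {false, false};
}
```

Under this model, force at -O0 yields only the rounded kernel, which is the case the last commit above checks.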
d77cc74 to d0b0d75
Ping @mdtoguchi for review, thanks
Thanks. Looks good!
LGTM
FE Changes LGTM
Ping @intel/llvm-gatekeepers this can be merged
Adds a new preference for range rounding, force, such that if the compile flag is used, only the range rounded parallel_for kernel will be generated. This can make binaries smaller as there is no duplication of SYCL range kernels across range rounded and unrounded versions.
I have also added the flag -fsycl-range-rounding, which can take the values on, force or disable. This flag aims to supersede the -fsycl-disable-range-rounding flag.
I have also extended existing tests to check the functionality of the new flag and refactored the range rounding sycl-e2e test. Also added a brief description of the flag's behaviour in doc.