From d981d372b062bb9e561cb9f46fd7d0b697ba1b66 Mon Sep 17 00:00:00 2001 From: Victor Perez Date: Wed, 7 Feb 2024 10:09:53 +0000 Subject: [PATCH] [SYCL][Fusion][Doc] Document reductions support Document reduction strategies supported by kernel fusion and how users should use reductions in their code. Signed-off-by: Victor Perez --- sycl/doc/design/KernelFusionJIT.md | 34 +++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/sycl/doc/design/KernelFusionJIT.md b/sycl/doc/design/KernelFusionJIT.md index 224a1984d2902..8878e11ac6132 100644 --- a/sycl/doc/design/KernelFusionJIT.md +++ b/sycl/doc/design/KernelFusionJIT.md @@ -313,12 +313,44 @@ computing the private memory size. As range rounding only applies to basic kernels (parametrized by a `sycl::range`), local internalization is not affected by the range rounding transformation. +### Reductions + +Kernel fusion of reductions is partially supported. In order to preserve the +legality of the fused kernel, i.e., the fact that fused kernel must perform the +same work as the graph of kernels to be fused, only the fusion of following +reduction strategies at the time of writing is supported: + +- `group_reduce_and_last_wg_detection` +- `local_atomic_and_atomic_cross_wg` +- `range_basic` +- `group_reduce_and_atomic_cross_wg` +- `local_mem_tree_and_atomic_cross_wg` + +Other strategies require implicit inter-work-group synchronization, not +supported in kernel fusion. + +This way, users should not use `sycl::reduction` directly when performing kernel +fusion in their code, as an unsupported algorithm might be chosen. They should +instead use `sycl::detail::reduction_parallel_for`, forcing a supported fusion +strategy. Reductions implementation in +[`sycl/reduction.hpp`](../../include/sycl/reduction.hpp) might give users an +insight into which kind of reductions to use for their purposes: + +```c++ +q.submit([&](sycl::handler &cgh) { + sycl::accessor in(dataBuf, cgh, sycl::read_only); + sycl::reduction sum(sumBuf, cgh, sycl::plus<>{}); + // Force supported 'group_reduce_and_last_wg_detection' strategy + sycl::detail::reduction_parallel_for(...); +}); +``` + ### Unsupported SYCL constructs The following SYCL API constructs are currently not officially supported for kernel fusion and should be considered untested/unsupported: -- Reductions - `sycl::stream` - Specialization constants and `sycl::kernel_handler` - Images (`sycl::unsampled_image` and `sycl::sampled_image`)