Enable dpnp build on AMD GPU #2302
Array API standard conformance tests for dpnp=0.18.0dev1=py312he4f9c94_23 ran successfully.
View rendered docs @ https://intelpython.github.io/dpnp/pull/2302/index.html
scripts/build_locally.py
Outdated
if not arch:
    raise ValueError("--arch is required when --target=hip")
cmake_args += [
    "-DDPNP_TARGET_HIP=ON",
Why do we need to define two variables? Can they be combined into a single one, as in dpctl: -DDPNP_TARGET_HIP={arch}?
Additionally, --target=cuda is the current dpnp approach, but:
- dpctl and dpnp should consider supporting specific CUDA architectures
- --target=hip means that there is no way to build simultaneously for HIP and CUDA (which is very much an edge case, but should be considered)
For these reasons, I think it is most sensible to move away from the universal --target= approach to --target-cuda= and --target-hip=, or something to that effect.
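A minimal sketch of what the per-backend options proposed above could look like with argparse. The function names build_parser and enabled_backends are hypothetical and are not taken from build_locally.py; only the option names --target-cuda and --target-hip come from this discussion.

```python
import argparse


def build_parser():
    """Parser with one option per backend instead of a single --target."""
    parser = argparse.ArgumentParser(description="per-backend target sketch")
    # Each option carries the architecture for its backend.
    # None means that the backend was not requested.
    parser.add_argument("--target-cuda", default=None, help="e.g. sm_80")
    parser.add_argument("--target-hip", default=None, help="e.g. gfx90a")
    return parser


def enabled_backends(args):
    """Return the backends requested on the command line, in a fixed order."""
    backends = []
    if args.target_cuda is not None:
        backends.append("cuda")
    if args.target_hip is not None:
        backends.append("hip")
    return backends


# Both backends can be requested at once, which a single --target cannot express.
args = build_parser().parse_args(["--target-cuda=sm_80", "--target-hip=gfx90a"])
print(enabled_backends(args))  # ['cuda', 'hip']
```

Keeping the options independent means the HIP + CUDA edge case needs no special syntax; whether the build itself supports both at once is a separate question.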
@ndgrigorian that is a great suggestion.
I have added support for --target-hip, and I am going to add --target-cuda to replace --target in the next PR.
Thanks
doc/quick_start_guide.rst
Outdated
.. code-block:: bash

    python scripts/build_locally.py --target-hip=gfx90a
In general, it might be unclear what gfx90a means here. It would be great to clarify.
done
scripts/build_locally.py
Outdated
@@ -104,6 +105,15 @@ def run(
# Always builds using oneMKL interfaces for the cuda target
onemkl_interfaces = True

if target_hip is not None:
    if target_hip == "default":
We need special handling for python scripts/build_locally.py --target-hip= (an empty value).
Currently it behaves the same as python scripts/build_locally.py, which was not intended, I guess.
The same comment applies to python scripts/build_locally.py --target= (an empty value).
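The empty-value case raised above can be illustrated with a small sketch. It assumes the option is parsed with argparse, in which case --target-hip= yields an empty string rather than None, so a plain truthiness check would treat it like a missing option. The helper validate_hip_arch and its error wording are hypothetical, not code from this PR.

```python
import argparse


def validate_hip_arch(value):
    """Return a cleaned architecture string, or None if HIP was not requested.

    An empty value is rejected instead of being silently ignored.
    """
    if value is None:
        return None  # option not given at all
    arch = value.strip()
    if not arch:
        raise ValueError(
            "--target-hip requires an explicit architecture, e.g. gfx90a"
        )
    return arch


parser = argparse.ArgumentParser()
parser.add_argument("--target-hip", default=None)

# "--target-hip=" parses to an empty string, which is distinct from None.
empty = parser.parse_args(["--target-hip="]).target_hip
print(repr(empty))

try:
    validate_hip_arch(empty)
except ValueError as exc:
    print(f"error: {exc}")
```

Distinguishing None from an empty string is the key point: the former means the option was never passed, the latter means it was passed without an architecture.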
scripts/build_locally.py
Outdated
@@ -104,6 +105,15 @@ def run(
# Always builds using oneMKL interfaces for the cuda target
onemkl_interfaces = True

if target_hip is not None:
    if target_hip == "default":
It is a bit unclear what use case is assumed here. Is it only about python scripts/build_locally.py --target-hip="default"?
If so, I believe the error message below needs to be rephrased to something like: No default HIP architecture is supported. It must be specified explicitly.
I have changed the logic here by removing the check for default
@@ -75,27 +75,64 @@ option(DPNP_USE_ONEMKL_INTERFACES
    "Build DPNP with oneMKL Interfaces"
    OFF
)
set(HIP_TARGETS "" CACHE STRING "HIP architecture for target")
I assume there is no support for multiple values. Suggested change:
- set(HIP_TARGETS "" CACHE STRING "HIP architecture for target")
+ set(HIP_TARGET "" CACHE STRING "HIP architecture for target")
At some point, the docs clearly stated that only one architecture was supported at a time, but now this isn't as clear and should be tested.
Also, there is new information in the extension guide:
"The compiler driver also offers alias targets for each target+architecture pair to make the command line shorter and easier to understand for humans. Thanks to the aliases, the -Xsycl-target-backend flags no longer need to be specified."
It shows that the command
icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
-Xsycl-target-backend=spir64_gen '-device pvc' \
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
-Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
-o sycl-app sycl-app.cpp
is equivalent to
icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx1030,nvidia_gpu_sm_80 \
-o sycl-app sycl-app.cpp
so maybe both dpctl and dpnp can be simplified by removing the need for -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=[X] completely.
List of aliases: https://intel.github.io/llvm/UsersManual.html
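As a rough illustration of the simplification suggested above, the alias-based flag could be assembled from per-backend architectures as follows. The helper sycl_targets_flag is hypothetical; the alias prefixes (intel_gpu_, amd_gpu_, nvidia_gpu_) are taken from the example command quoted in this comment, and whether an alias exists for a given architecture should be checked against the linked list.

```python
def sycl_targets_flag(intel_device=None, hip_arch=None, cuda_arch=None):
    """Build a single -fsycl-targets flag from alias targets.

    With aliases, no -Xsycl-target-backend/--offload-arch pair is needed.
    """
    targets = []
    if intel_device:
        targets.append(f"intel_gpu_{intel_device}")
    if hip_arch:
        targets.append(f"amd_gpu_{hip_arch}")
    if cuda_arch:
        targets.append(f"nvidia_gpu_{cuda_arch}")
    if not targets:
        raise ValueError("at least one target must be specified")
    return "-fsycl-targets=" + ",".join(targets)


# Reproduces the alias form of the command quoted above.
print(sycl_targets_flag(intel_device="pvc", hip_arch="gfx1030", cuda_arch="sm_80"))
```

The build scripts would then only need to pass this one flag instead of tracking a target triple and a separate architecture option per backend.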
The alias list seems to say that only one alias is supported at a time, so probably only one architecture can be targeted at once. That would be my guess.
CMakeLists.txt
Outdated
set(_dpnp_sycl_targets ${DPNP_SYCL_TARGETS})
set(_dpnp_sycl_targets ${DPNP_SYCL_TARGETS})

if (NOT "x${HIP_TARGETS}" STREQUAL "x")
Why is this applicable only to the HIP target? What is the use case? Should it also be supported for the CUDA target?
Done
if (NOT "x${HIP_TARGETS}" STREQUAL "x")
    set(_dpnp_amd_targets ${HIP_TARGETS})
    set(_use_onemkl_interfaces_hip ON)
Do we need something similar to the above here?
set(_dpnp_sycl_targets "amdgcn-amd-amdhsa,${_dpnp_sycl_targets}")
I think that if we set DPNP_SYCL_TARGETS via --cmake_opts, we expect it to already contain the right target, e.g. amdgcn-amd-amdhsa or nvptx64-nvidia-cuda.
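For comparison, the alternative raised in the question above (prepending the AMD triple automatically) could look roughly like this. It is written in Python only to illustrate the logic; the real change would live in CMake, and the helper with_hip_target is hypothetical.

```python
from typing import Optional

AMD_TRIPLE = "amdgcn-amd-amdhsa"


def with_hip_target(sycl_targets: str, hip_arch: Optional[str]) -> str:
    """Prepend the AMD triple when a HIP architecture is requested.

    User-supplied targets are kept, and the triple is never duplicated.
    """
    targets = [t for t in sycl_targets.split(",") if t]
    if hip_arch and AMD_TRIPLE not in targets:
        targets.insert(0, AMD_TRIPLE)
    return ",".join(targets)


print(with_hip_target("spir64", "gfx90a"))
print(with_hip_target("amdgcn-amd-amdhsa", "gfx90a"))
```

The trade-off is between trusting the user-provided target list as-is and guarding against a HIP architecture being set without the matching triple.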
This PR updates the CMakeLists files and build_locally.py to enable building dpnp for AMD targets.
To build dpnp on AMD:
To find the architecture, use