Enable dpnp build on AMD GPU #2302

Open: wants to merge 19 commits into base: master

Conversation

Collaborator

@vlad-perevezentsev vlad-perevezentsev commented Feb 10, 2025

This PR updates CMakeLists files and build_locally.py to enable building dpnp for AMD targets.

To build dpnp on AMD:

python scripts/build_locally.py --target-hip=gfx90a

To find the architecture, use

rocminfo | grep 'Name: *gfx.*'
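
The same lookup can be scripted. This is a minimal sketch, assuming rocminfo prints agent lines of the form "Name: gfx90a"; the helper name find_gfx_archs and the sample text are illustrative and not part of dpnp:

```python
import re


def find_gfx_archs(rocminfo_output: str) -> list[str]:
    """Return unique gfx architecture names, in order of first appearance."""
    # Mirrors the grep pattern 'Name: *gfx.*' but captures only the gfx token.
    matches = re.findall(r"Name:\s*(gfx\w+)", rocminfo_output)
    return list(dict.fromkeys(matches))


# Illustrative sample text, not captured from a real machine.
sample = """
  Name:                    AMD EPYC 7763
  Name:                    gfx90a
  Name:                    amdgcn-amd-amdhsa--gfx90a:sramecc+:xnack-
"""
print(find_gfx_archs(sample))
```

The returned name is what would be passed to --target-hip.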
  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you filing the PR as a draft?

Contributor

github-actions bot commented Feb 10, 2025

Array API standard conformance tests for dpnp=0.18.0dev1=py312he4f9c94_23 ran successfully.
Passed: 1222
Failed: 0
Skipped: 9

Contributor

View rendered docs @ https://intelpython.github.io/dpnp/pull/2302/index.html

if not arch:
    raise ValueError("--arch is required when --target=hip")
cmake_args += [
    "-DDPNP_TARGET_HIP=ON",
Contributor

@antonwolfy antonwolfy Feb 17, 2025

Why do we need to define two variables? Can they be combined into a single one, as in dpctl: -DDPNP_TARGET_HIP={arch}?
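
A minimal sketch of the single-variable form, assuming the architecture string itself is passed as the value of DPNP_TARGET_HIP (the suggestion in this comment, not necessarily what the PR ended up doing); hip_cmake_args is a hypothetical helper:

```python
def hip_cmake_args(arch: str) -> list[str]:
    """Build the CMake arguments for a HIP build from one architecture string."""
    if not arch:
        # Assumption: there is no sensible default architecture to fall back to.
        raise ValueError("a HIP architecture such as gfx90a must be specified")
    # One variable carries both "HIP is enabled" and "which architecture".
    return [f"-DDPNP_TARGET_HIP={arch}"]


print(hip_cmake_args("gfx90a"))
```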

Collaborator

@ndgrigorian ndgrigorian Feb 21, 2025

Additionally, --target=cuda is the current dpnp approach, but:

  1. dpctl and dpnp should consider supporting targeting specific CUDA architectures
  2. --target=hip means that there is no way to build simultaneously for HIP and CUDA (which is very, very much an edge case, but should be considered)

For these reasons, I think it is most sensible to move away from the universal --target= approach to --target-cuda= and --target-hip=, or something to that effect.
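
A rough sketch of what separate per-backend options could look like with argparse. The option names follow the suggestion above; the dest names and help strings are assumptions, not the actual build_locally.py code:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of per-backend target flags")
    # Each backend has its own option, so both can appear in one invocation.
    parser.add_argument("--target-cuda", dest="target_cuda", default=None,
                        help="CUDA architecture, e.g. sm_80")
    parser.add_argument("--target-hip", dest="target_hip", default=None,
                        help="HIP architecture, e.g. gfx90a")
    return parser


args = build_parser().parse_args(["--target-hip=gfx90a", "--target-cuda=sm_80"])
print(args.target_hip, args.target_cuda)
```

With a single --target= option the two values would compete for one slot; with separate options each stays None unless it is requested.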

Collaborator Author

@ndgrigorian, that is a great suggestion.
I have added support for --target-hip, and I am going to replace --target with --target-cuda in the next PR.
Thanks

@antonwolfy antonwolfy added this to the 0.18.0 release milestone Feb 26, 2025
Collaborator

coveralls commented Mar 18, 2025

Coverage Status

coverage: 72.271%. remained the same
when pulling 5e2cc3d on enable_amd_build
into 2966ae6 on master.


.. code-block:: bash

python scripts/build_locally.py --target-hip=gfx90a
Contributor

In general it might be unclear what gfx90a means here. It'd be great to clarify.

Collaborator Author

done

@@ -104,6 +105,15 @@ def run(
# Always builds using oneMKL interfaces for the cuda target
onemkl_interfaces = True

if target_hip is not None:
    if target_hip == "default":
Contributor

We need special handling for python scripts/build_locally.py --target-hip= (an empty value).
Currently it behaves the same as python scripts/build_locally.py, which I guess was not intended.
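
One possible way to catch the empty value, sketched with an argparse type callback. The function name hip_arch and the error wording are assumptions, not code from the PR:

```python
import argparse


def hip_arch(value: str) -> str:
    """argparse type callback that rejects an empty architecture string."""
    arch = value.strip()
    if not arch:
        raise argparse.ArgumentTypeError(
            "no default HIP architecture is supported; specify one explicitly, e.g. gfx90a"
        )
    return arch


parser = argparse.ArgumentParser()
parser.add_argument("--target-hip", dest="target_hip", type=hip_arch, default=None)

print(parser.parse_args(["--target-hip=gfx90a"]).target_hip)
```

With such a callback, --target-hip= should be reported as a usage error instead of silently turning into a default build.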

Contributor

The same comment is applicable to python scripts/build_locally.py --target=.

@@ -104,6 +105,15 @@ def run(
# Always builds using oneMKL interfaces for the cuda target
onemkl_interfaces = True

if target_hip is not None:
    if target_hip == "default":
Contributor

It is a bit unclear what use case is assumed here. Is it only about python scripts/build_locally.py --target-hip="default"?
If so, I believe the error message below needs to be rephrased to something like: No default HIP architecture is supported. It must be specified explicitly.

Collaborator Author

I have changed the logic here by removing the check for default

@@ -75,27 +75,64 @@ option(DPNP_USE_ONEMKL_INTERFACES
"Build DPNP with oneMKL Interfaces"
OFF
)
set(HIP_TARGETS "" CACHE STRING "HIP architecture for target")
Contributor

I assume there is no support for multiple values:

Suggested change
- set(HIP_TARGETS "" CACHE STRING "HIP architecture for target")
+ set(HIP_TARGET "" CACHE STRING "HIP architecture for target")

Collaborator

@ndgrigorian ndgrigorian Apr 10, 2025

At some point, the docs clearly stated that only one architecture was supported at a time, but now it isn't as clear and should be tested.

Also, there is new information in the extension guide:

The compiler driver also offers alias targets for each target+architecture pair to make the command line shorter and easier to understand for humans. Thanks to the aliases, the -Xsycl-target-backend flags no longer need to be specified.

It shows that the command

icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
        -Xsycl-target-backend=spir64_gen '-device pvc' \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
        -o sycl-app sycl-app.cpp

is equivalent to

icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx1030,nvidia_gpu_sm_80 \
        -o sycl-app sycl-app.cpp

so maybe both dpctl and dpnp can simplify by removing the need for -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=[X] completely

list of aliases:
https://intel.github.io/llvm/UsersManual.html
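
To illustrate the simplification, a small sketch that composes the alias form of -fsycl-targets from architecture names. The alias prefixes intel_gpu_, amd_gpu_ and nvidia_gpu_ are taken from the command above; the helper sycl_target_aliases is hypothetical and does not exist in dpctl or dpnp:

```python
def sycl_target_aliases(intel=None, amd=None, nvidia=None) -> str:
    """Build an -fsycl-targets value from per-vendor architecture names."""
    aliases = []
    if intel:
        aliases.append(f"intel_gpu_{intel}")
    if amd:
        aliases.append(f"amd_gpu_{amd}")
    if nvidia:
        aliases.append(f"nvidia_gpu_{nvidia}")
    if not aliases:
        raise ValueError("at least one architecture must be given")
    # A single flag replaces the separate -Xsycl-target-backend options.
    return "-fsycl-targets=" + ",".join(aliases)


print(sycl_target_aliases(intel="pvc", amd="gfx1030", nvidia="sm_80"))
```

For the example above this yields the same -fsycl-targets value as the shortened icpx command.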

Collaborator

The alias list seems to say that only one alias is supported at a time, so probably only one architecture at once is possible? That would be my guess.

CMakeLists.txt Outdated
set(_dpnp_sycl_targets ${DPNP_SYCL_TARGETS})
set(_dpnp_sycl_targets ${DPNP_SYCL_TARGETS})

if (NOT "x${HIP_TARGETS}" STREQUAL "x")
Contributor

Why is that applicable only to the HIP target? What is the use case? Should it be supported for the CUDA target as well?

Collaborator Author

Done


if (NOT "x${HIP_TARGETS}" STREQUAL "x")
    set(_dpnp_amd_targets ${HIP_TARGETS})
    set(_use_onemkl_interfaces_hip ON)
Contributor

Do we need something similar to the above here?

        set(_dpnp_sycl_targets "amdgcn-amd-amdhsa,${_dpnp_sycl_targets}")

Collaborator Author

I think that if DPNP_SYCL_TARGETS is set via --cmake_opts, we expect it to already contain the right target, e.g. amdgcn-amd-amdhsa or nvptx64-nvidia-cuda.

4 participants