Add has_rocm for OpenMPI (#821)
avik-pal authored Jun 23, 2024
1 parent 5e6557d commit 71acbb7
Showing 5 changed files with 64 additions and 5 deletions.
3 changes: 2 additions & 1 deletion docs/src/knownissues.md
@@ -180,7 +180,7 @@ Make sure to:
```
- Then in Julia, upon loading MPI and CUDA modules, you can check
- CUDA version: `CUDA.versioninfo()`
- - If MPI has CUDA: `MPI.has_cuda()`
+ - If MPI has CUDA: [`MPI.has_cuda()`](@ref)
- If you are using the correct MPI library: `MPI.libmpi`
After that, it may be preferred to run the Julia MPI script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/11)) launching it from a shell script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/4)).
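The checklist above can be run as a short interactive session; a minimal sketch, assuming MPI.jl and CUDA.jl are installed and an MPI library is available:

```julia
using MPI, CUDA

MPI.Init()            # with Open MPI, required before querying
CUDA.versioninfo()    # CUDA driver/toolkit versions visible to Julia
@show MPI.has_cuda()  # does the MPI build report CUDA support?
@show MPI.libmpi      # path of the MPI library actually loaded
```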
@@ -197,6 +197,7 @@ Make sure to:
```
- Then in Julia, upon loading MPI and CUDA modules, you can check
- AMDGPU version: `AMDGPU.versioninfo()`
- If MPI has ROCm: [`MPI.has_rocm()`](@ref)
- If you are using the correct MPI implementation: `MPI.identify_implementation()`
After that, [this script](https://gist.github.com/luraess/c228ec08629737888a18c6a1e397643c) can be used to verify if ROCm-aware MPI is functional (modified after the CUDA-aware version from [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/11)). It may be preferred to run the Julia ROCm-aware MPI script launching it from a shell script (as suggested [here](https://discourse.julialang.org/t/cuda-aware-mpi-works-on-system-but-not-for-julia/75060/4)).
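As with the CUDA case, the checks above can be scripted; a minimal sketch, assuming MPI.jl and AMDGPU.jl are installed:

```julia
using MPI, AMDGPU

MPI.Init()                           # with Open MPI, required before querying
AMDGPU.versioninfo()                 # ROCm stack visible to Julia
@show MPI.has_rocm()                 # true only for Open MPI >= 5 (or env override)
@show MPI.identify_implementation()  # (implementation name, version)
```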
2 changes: 2 additions & 0 deletions docs/src/reference/library.md
Expand Up @@ -14,5 +14,7 @@ MPI.MPI_LIBRARY_VERSION_STRING
```@docs
MPI.versioninfo
MPI.has_cuda
MPI.has_rocm
MPI.has_gpu
MPI.identify_implementation
```
3 changes: 2 additions & 1 deletion docs/src/usage.md
@@ -98,7 +98,8 @@ should confirm your MPI implementation to have the ROCm support (AMDGPU) enabled
[alltoall\_test\_rocm\_multigpu.jl](https://gist.github.com/luraess/a47931d7fb668bd4348a2c730d5489f4) should confirm
your ROCm-aware MPI implementation to use multiple AMD GPUs (one GPU per rank).

- The status of ROCm (AMDGPU) support cannot currently be queried.
+ If using OpenMPI, the status of ROCm support can be checked via the
+ [`MPI.has_rocm()`](@ref) function.

## Writing MPI tests

47 changes: 45 additions & 2 deletions src/environment.jl
@@ -320,21 +320,23 @@ Wtime() = API.MPI_Wtime()
Check if the MPI implementation is known to have CUDA support. Currently only Open MPI
provides a mechanism to check, so it will return `false` with other implementations
- (unless overriden).
+ (unless overridden). For "IBMSpectrumMPI" it will return `true`.
This can be overridden by setting the `JULIA_MPI_HAS_CUDA` environment variable to `true`
or `false`.
!!! note
For OpenMPI or OpenMPI-based implementations you first need to call [`Init()`](@ref).
See also [`MPI.has_rocm`](@ref) for ROCm support.
"""
function has_cuda()
flag = get(ENV, "JULIA_MPI_HAS_CUDA", nothing)
if flag === nothing
# Only Open MPI provides a function to check CUDA support
@static if MPI_LIBRARY == "OpenMPI"
# int MPIX_Query_cuda_support(void)
- return 0 != ccall((:MPIX_Query_cuda_support, libmpi), Cint, ())
+ return @ccall libmpi.MPIX_Query_cuda_support()::Bool
elseif MPI_LIBRARY == "IBMSpectrumMPI"
return true
else
@@ -344,3 +346,44 @@ function has_cuda()
return parse(Bool, flag)
end
end
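Since `has_cuda` reads the environment variable on every call, the override path can be exercised directly, without initializing MPI; a small sketch, setting `JULIA_MPI_HAS_CUDA` purely for illustration:

```julia
using MPI

# The override takes precedence over the MPIX_Query_cuda_support query,
# so Init() is not needed on this code path.
ENV["JULIA_MPI_HAS_CUDA"] = "false"
@assert MPI.has_cuda() == false
ENV["JULIA_MPI_HAS_CUDA"] = "true"
@assert MPI.has_cuda() == true
delete!(ENV, "JULIA_MPI_HAS_CUDA")  # restore the default behaviour
```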

"""
MPI.has_rocm()
Check if the MPI implementation is known to have ROCm support. Currently only Open MPI
provides a mechanism to check, so it will return `false` with other implementations
(unless overridden).
This can be overridden by setting the `JULIA_MPI_HAS_ROCM` environment variable to `true`
or `false`.
See also [`MPI.has_cuda`](@ref) for CUDA support.
"""
function has_rocm()
flag = get(ENV, "JULIA_MPI_HAS_ROCM", nothing)
if flag === nothing
# Only Open MPI provides a function to check ROCm support
@static if MPI_LIBRARY == "OpenMPI" && MPI_LIBRARY_VERSION ≥ v"5"
# int MPIX_Query_rocm_support(void)
return @ccall libmpi.MPIX_Query_rocm_support()::Bool
else
return false
end
else
return parse(Bool, flag)
end
end

"""
MPI.has_gpu()
Check if the MPI implementation is known to have GPU support. Currently this checks for the
following GPUs:
1. CUDA: via [`MPI.has_cuda`](@ref)
2. ROCm: via [`MPI.has_rocm`](@ref)
See also [`MPI.has_cuda`](@ref) and [`MPI.has_rocm`](@ref) for more fine-grained
checks.
"""
has_gpu() = has_cuda() || has_rocm()
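The `has_gpu` definition above is a plain disjunction of the two backend checks; combined with the override variables it behaves as follows (a sketch, using the env overrides purely for illustration):

```julia
using MPI

ENV["JULIA_MPI_HAS_CUDA"] = "false"
ENV["JULIA_MPI_HAS_ROCM"] = "false"
@assert MPI.has_gpu() == false   # neither backend reported

ENV["JULIA_MPI_HAS_ROCM"] = "true"
@assert MPI.has_gpu() == true    # one backend is enough

foreach(k -> delete!(ENV, k), ("JULIA_MPI_HAS_CUDA", "JULIA_MPI_HAS_ROCM"))
```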
14 changes: 13 additions & 1 deletion test/test_basic.jl
Expand Up @@ -8,10 +8,22 @@ MPI.Init()

@test MPI.has_cuda() isa Bool

- if get(ENV,"JULIA_MPI_TEST_ARRAYTYPE","") == "CuArray"
+ if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "CuArray"
@test MPI.has_cuda()
end

@test MPI.has_rocm() isa Bool

if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "ROCArray"
@test MPI.has_rocm()
end

@test MPI.has_gpu() isa Bool

if get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "CuArray" || get(ENV, "JULIA_MPI_TEST_ARRAYTYPE", "") == "ROCArray"
@test MPI.has_gpu()
end

@test !MPI.Finalized()
MPI.Finalize()
@test MPI.Finalized()
