Add tests and example for multi GPU usage #1909

Open
bernhardmgruber opened this issue Jan 25, 2023 · 2 comments
Comments

@bernhardmgruber
Member

bernhardmgruber commented Jan 25, 2023

While inspecting alpaka's implementation and thinking about zero-copying as part of #1820, I wondered whether alpaka actually supports copying buffers between two GPUs, e.g. on the CUDA backend. Searching alpaka for API calls like cudaMemcpyPeer and cudaDeviceEnablePeerAccess only points me to the documentation here, which says that cudaDeviceEnablePeerAccess is "automatically done when required", but the API is never called inside the alpaka codebase. So I wondered whether CUDA does that automatically as part of cudaMemcpy when the source and destination are on different GPUs, and whether that behavior is always available or requires some minimum CUDA version or compute architecture. Does anyone know?

Independently, we should have tests and at least one example that show such a scenario. This concerns all backends, not just CUDA.
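As a starting point, here is a minimal sketch of what such a test could check on the CUDA backend. It uses the raw CUDA runtime API rather than alpaka, and the helper name `copyAcrossDevices` is hypothetical. It assumes unified virtual addressing (64-bit process, devices that support it), in which case the CUDA programming guide says the regular copy functions can be used between two devices. It deliberately never calls cudaDeviceEnablePeerAccess, so it only probes whether a plain cudaMemcpy between two GPUs round-trips the data.

```cuda
#include <cuda_runtime.h>

#include <cstddef>
#include <vector>

// Hypothetical helper, not part of alpaka.
// Returns 0 on success, 1 on a CUDA error or data mismatch,
// and -1 if the check is skipped because fewer than two devices are visible.
int copyAcrossDevices(std::size_t n)
{
    int deviceCount = 0;
    if(cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount < 2)
        return -1;

    std::size_t const bytes = n * sizeof(int);
    std::vector<int> src(n), dst(n, -1);
    for(std::size_t i = 0; i < n; ++i)
        src[i] = static_cast<int>(i);

    void* buf0 = nullptr; // allocated on device 0
    void* buf1 = nullptr; // allocated on device 1

    // Fill the buffer on device 0 from the host.
    bool ok = cudaSetDevice(0) == cudaSuccess;
    ok = ok && cudaMalloc(&buf0, bytes) == cudaSuccess;
    ok = ok && cudaMemcpy(buf0, src.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    ok = ok && cudaSetDevice(1) == cudaSuccess;
    ok = ok && cudaMalloc(&buf1, bytes) == cudaSuccess;

    // Device 0 -> device 1 with the plain copy call.
    // Assumption under test: no cudaDeviceEnablePeerAccess is needed for this.
    ok = ok && cudaMemcpy(buf1, buf0, bytes, cudaMemcpyDefault) == cudaSuccess;

    // Read the result back from device 1 for comparison.
    ok = ok && cudaMemcpy(dst.data(), buf1, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // Release both buffers; cudaFree(nullptr) is a no-op.
    cudaSetDevice(1);
    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);

    return (ok && src == dst) ? 0 : 1;
}
```

A test could call `copyAcrossDevices(1 << 20)` and treat `-1` as "skipped" on machines with a single GPU. An alpaka version would do the same with two device objects and the library's own buffer copy, and would have to be repeated per backend.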

@fwyzard
Contributor

fwyzard commented Jan 25, 2023

By the way, cudaMemcpy and cudaMemcpyAsync are not able to copy memory allocated by cudaMallocAsync across different devices. The comment from NVIDIA is to use cudaMemcpyPeer or cudaMemcpyPeerAsync.

See https://github.com/fwyzard/nvidia_bug_3446335 for a reproducer.
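For illustration only, and separate from the reproducer linked above, this is roughly what the suggested pattern looks like: memory comes from cudaMallocAsync on each device and the cross-device copy goes through cudaMemcpyPeerAsync. The function name `peerCopyStreamOrdered` is made up for this sketch, and it assumes CUDA 11.2 or newer with memory pool support on both devices.

```cuda
#include <cuda_runtime.h>

#include <cstddef>
#include <vector>

// Illustrative helper with a made-up name.
// Returns 0 on success, 1 on a CUDA error or data mismatch,
// and -1 if skipped (fewer than two devices, or no memory pool support).
int peerCopyStreamOrdered(std::size_t n)
{
    int deviceCount = 0;
    if(cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount < 2)
        return -1;

    // cudaMallocAsync requires memory pool support on the device.
    for(int dev = 0; dev < 2; ++dev)
    {
        int pools = 0;
        if(cudaDeviceGetAttribute(&pools, cudaDevAttrMemoryPoolsSupported, dev) != cudaSuccess || pools == 0)
            return -1;
    }

    std::size_t const bytes = n * sizeof(int);
    std::vector<int> src(n), dst(n, -1);
    for(std::size_t i = 0; i < n; ++i)
        src[i] = static_cast<int>(i);

    cudaStream_t stream0 = nullptr;
    cudaStream_t stream1 = nullptr;
    void* buf0 = nullptr; // stream-ordered allocation on device 0
    void* buf1 = nullptr; // stream-ordered allocation on device 1

    // Allocate and fill the source buffer on device 0, then wait for it.
    bool ok = cudaSetDevice(0) == cudaSuccess;
    ok = ok && cudaStreamCreate(&stream0) == cudaSuccess;
    ok = ok && cudaMallocAsync(&buf0, bytes, stream0) == cudaSuccess;
    ok = ok && cudaMemcpyAsync(buf0, src.data(), bytes, cudaMemcpyHostToDevice, stream0) == cudaSuccess;
    ok = ok && cudaStreamSynchronize(stream0) == cudaSuccess;

    // Allocate the destination buffer on device 1.
    ok = ok && cudaSetDevice(1) == cudaSuccess;
    ok = ok && cudaStreamCreate(&stream1) == cudaSuccess;
    ok = ok && cudaMallocAsync(&buf1, bytes, stream1) == cudaSuccess;

    // Explicit peer copy: (dst, dstDevice, src, srcDevice, count, stream).
    ok = ok && cudaMemcpyPeerAsync(buf1, 1, buf0, 0, bytes, stream1) == cudaSuccess;
    ok = ok && cudaMemcpyAsync(dst.data(), buf1, bytes, cudaMemcpyDeviceToHost, stream1) == cudaSuccess;
    ok = ok && cudaStreamSynchronize(stream1) == cudaSuccess;

    // Release whatever was created, in stream order.
    if(buf1 != nullptr)
        cudaFreeAsync(buf1, stream1);
    if(stream1 != nullptr)
    {
        cudaStreamSynchronize(stream1);
        cudaStreamDestroy(stream1);
    }
    cudaSetDevice(0);
    if(buf0 != nullptr)
        cudaFreeAsync(buf0, stream0);
    if(stream0 != nullptr)
    {
        cudaStreamSynchronize(stream0);
        cudaStreamDestroy(stream0);
    }

    return (ok && src == dst) ? 0 : 1;
}
```

Swapping the cudaMemcpyPeerAsync line for a plain cudaMemcpyAsync with cudaMemcpyDefault would turn this into a check for the failure described above. Whether that variant fails may depend on the CUDA and driver version, so I would not hard-code an expectation for it.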

@psychocoderHPC
Member

I removed the explicit peer copies in the past (#1400) because cudaMemcpy* does this automatically, so there was no need to fiddle around with peer copies anymore. Looks like I forgot to remove this from the documentation.
