Stack/Pool Allocator for GPU, only allocate on device #456

Open
MichaelSt98 opened this issue Dec 2, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@MichaelSt98
Collaborator

Is your feature request related to a problem? Please describe.

No response

Describe the solution you'd like

Device-only allocation for the pool allocator/stack.

Describe alternatives you've considered

No response

Additional context

Currently the stack is created and allocated on the device like this:

REAL(KIND=JPRB), ALLOCATABLE :: ZSTACK(:, :)
ISTSZ = ...
ALLOCATE (ZSTACK(ISTSZ, NGPBLKS))
!$acc data create( ZSTACK )

...

!$acc end data
DEALLOCATE (ZSTACK)

Thus the stack is allocated on both the host and the device, although we only need it on the device.

To allocate it only on the device, we could:

  • use declare device_resident
REAL(KIND=JPRB), ALLOCATABLE :: ZSTACK(:, :)
!$acc declare device_resident(ZSTACK)
ISTSZ = ...
allocate(ZSTACK(ISTSZ, NGPBLKS))

...

deallocate(ZSTACK)

However, it is not clear whether NVIDIA adheres to the standard here, see StackOverflow: "(NVHPC aka PGI) treats device_resident as a create?!"

  • use CUDA Fortran: REAL(KIND=JPRB), ALLOCATABLE, DEVICE :: ZSTACK(:, :)
    • and sacrifice portability?!
  • use the OpenACC runtime function acc_malloc()
    • which complicates things, as ZSTACK would need to be a c_ptr ...
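To make the acc_malloc() option more concrete, here is a minimal, untested sketch of how it might look. Everything beyond acc_malloc, ZSTACK, ISTSZ, NGPBLKS and JPRB is an assumption for illustration: the module and subroutine names are hypothetical, the JPRB kind and the sizes are placeholders, and the explicit bind(C) interfaces stand in for whatever Fortran bindings the OpenACC runtime in use provides. As far as I understand the OpenACC specification, deviceptr in Fortran only accepts dummy arguments without the pointer or allocatable attribute, which is why the device kernel sits in a separate subroutine.

```fortran
module stack_device_sketch
  use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_f_pointer
  implicit none

  ! Assumption: JPRB is a double precision kind; placeholder definition.
  integer, parameter :: jprb = selected_real_kind(13, 300)

  ! Explicit interfaces to the OpenACC C runtime routines
  ! (void* acc_malloc(size_t) and void acc_free(void*)).
  interface
    function acc_malloc(bytes) bind(C, name="acc_malloc") result(p)
      import :: c_ptr, c_size_t
      integer(c_size_t), value :: bytes
      type(c_ptr) :: p
    end function acc_malloc
    subroutine acc_free(p) bind(C, name="acc_free")
      import :: c_ptr
      type(c_ptr), value :: p
    end subroutine acc_free
  end interface

contains

  ! Hypothetical kernel: zstack is a plain dummy argument so that it can
  ! appear in a deviceptr clause.
  subroutine init_stack(zstack, istsz, ngpblks)
    integer, intent(in) :: istsz, ngpblks
    real(kind=jprb) :: zstack(istsz, ngpblks)
    integer :: i, j

    !$acc parallel loop collapse(2) deviceptr(zstack)
    do j = 1, ngpblks
      do i = 1, istsz
        zstack(i, j) = 0.0_jprb
      end do
    end do
  end subroutine init_stack

end module stack_device_sketch

program stack_device_demo
  use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_f_pointer
  use stack_device_sketch
  implicit none

  type(c_ptr) :: zstack_dev
  real(kind=jprb), pointer, contiguous :: zstack(:, :)
  integer :: istsz, ngpblks
  integer(c_size_t) :: nbytes

  istsz = 1024      ! placeholder sizes, not taken from the issue
  ngpblks = 8

  ! Allocate on the device only; there is no host copy of the stack.
  nbytes = int(istsz, c_size_t) * int(ngpblks, c_size_t) &
         & * int(storage_size(1.0_jprb) / 8, c_size_t)
  zstack_dev = acc_malloc(nbytes)

  ! Give the device address a Fortran array shape. The resulting pointer
  ! must never be dereferenced on the host.
  call c_f_pointer(zstack_dev, zstack, [istsz, ngpblks])

  call init_stack(zstack, istsz, ngpblks)

  call acc_free(zstack_dev)
end program stack_device_demo
```

This also illustrates the complication noted above: the stack has to be carried around as a c_ptr (or a pointer that is only valid on the device), and every routine touching it needs a deviceptr clause instead of relying on the usual present-table lookup.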

Organisation

No response

@MichaelSt98 MichaelSt98 added the enhancement New feature or request label Dec 2, 2024