Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable GPU exection of atm_divergence_damping_3d via OpenACC #1237

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions src/core_atmosphere/dynamics/mpas_atm_time_integration.F
Original file line number Diff line number Diff line change
Expand Up @@ -2599,6 +2599,13 @@ subroutine atm_divergence_damping_3d( state, diag, mesh, configs, dts, edgeStart
rdts = 1.0_RKIND / dts
coef_divdamp = 2.0_RKIND * smdiv * config_len_disp * rdts

MPAS_ACC_TIMER_START('atm_divergence_damping_3d [ACC_data_xfer]')
!$acc enter data copyin(ru_p, rtheta_pp, rtheta_pp_old, specZoneMaskEdge, &
!$acc theta_m, cellsOnEdge)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specZoneMaskEdge and cellsOnEdge are invariant fields (the won't change after init). I believe your PR should make sure these variables are handled in atm_dynamics_init and shouldn't copyin or delete these fields in atm_divergence_damping_3d.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that the convention we've been following so far? I assumed the routine to be ported has localized data movement clauses. I might also need to rework my other PRs if this is the case. Thanks for pointing it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to follow up.. Talked to @mgduda, and got this cleared up. Will make changes to these invariant fields. Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that the convention we've been following so far?

I've been following that pattern, it was set in #1176.

I've been doing things similarly to what may happen here - do all the data movement that is needed locally for the function being ported, and then separate out those invariant fields and handle them in the atm_dynamics_{init,finalize} routines. Typically these invariant fields are part of the mesh var_struct as described in the src/core_atmosphere/Registry.xml.

MPAS_ACC_TIMER_STOP('atm_divergence_damping_3d [ACC_data_xfer]')

!$acc parallel async
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async could be dangerous here, I don't believe acc exit data copyout(ru_p) will wait for this kernel to finish before beginning to copy.

I think it would be best to remove this async since we don't need it (yet) and it could create issues for the kernels that come next and depend on ru_p

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove the async. thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this async, but wanted to note that there's another one in atm_advance_scalars_work. Is that fine for now?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this async, but wanted to note that there's another one in atm_advance_scalars_work. Is that fine for now?

I think we're good to leave that async clause for now, it has a matching wait directive/clause later on.

!$acc loop gang worker
do iEdge=edgeStart,edgeEnd ! MGD do we really just need edges touching owned cells?

cell1 = cellsOnEdge(1,iEdge)
Expand All @@ -2608,6 +2615,7 @@ subroutine atm_divergence_damping_3d( state, diag, mesh, configs, dts, edgeStart
if (cell1 <= nCellsSolve .or. cell2 <= nCellsSolve ) then

!DIR$ IVDEP
!$acc loop vector
do k=1,nVertLevels

!! unscaled 3d divergence damping
Expand All @@ -2625,6 +2633,13 @@ subroutine atm_divergence_damping_3d( state, diag, mesh, configs, dts, edgeStart
end do
end if ! edges for block-owned cells
end do ! end loop over edges
!$acc end parallel

MPAS_ACC_TIMER_START('atm_divergence_damping_3d [ACC_data_xfer]')
!$acc exit data copyout(ru_p) delete(rtheta_pp, rtheta_pp_old, &
!$acc specZoneMaskEdge, theta_m, cellsOnEdge)
MPAS_ACC_TIMER_STOP('atm_divergence_damping_3d [ACC_data_xfer]')


end subroutine atm_divergence_damping_3d

Expand Down