
Reduce memory usage during GWDO stats estimation #1235

Open
abishekg7 wants to merge 3 commits into develop from atmosphere/reduce_gwd_mem_usage

Conversation

abishekg7 (Collaborator)

A copy of #1228 with the branch renamed. This PR attempts to reduce the memory footprint of the computation of Gravity Wave Drag Orography (GWDO) statistics in the pre-processing step.

The previous approach relied on each MPI rank reading all of the topography and land use tiles, and hence would run out of memory before we could fully subscribe all cores in a node. In the new approach, we read in only one tile at a time, and only when we encounter a pixel whose data is not already available in a linked list of tiles. The get_box subroutine is modified to call get_tile_from_box_point to check whether the current pixel in the box is present in the linked list; if not, it reads both the topography and land use data for that tile and appends the tile to the head of the list.

This PR also changes box, box_landuse, dxm, nx, and ny to be local variables instead of module variables. This slightly improves readability and brings advantages such as thread safety.
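
For readers unfamiliar with the tile cache being described, here is a minimal, hypothetical Fortran sketch of the idea: a linked list of tiles, searched per pixel, with a read-and-prepend on a cache miss. The module, type, procedure, and field names (gwd_tile_cache, tile_type, find_tile) and the tile dimensions are illustrative assumptions and do not match the actual code in mpas_init_atm_gwd.F.

```fortran
module gwd_tile_cache

   implicit none
   private
   public :: tile_type, find_tile

   ! Assumed tile dimensions for illustration only
   integer, parameter :: tile_nx = 1200, tile_ny = 1200

   type :: tile_type
      integer :: start_lon_idx = -1     ! global index of the tile's first column
      integer :: start_lat_idx = -1     ! global index of the tile's first row
      real, dimension(tile_nx, tile_ny) :: topo
      integer, dimension(tile_nx, tile_ny) :: landuse
      type(tile_type), pointer :: next => null()
   end type tile_type

contains

   ! Return the tile containing global pixel (i, j); read and prepend a new
   ! tile to the list head only if no cached tile already covers that pixel.
   function find_tile(head, i, j) result(tile)
      type(tile_type), pointer :: head
      integer, intent(in) :: i, j
      type(tile_type), pointer :: tile

      tile => head
      do while (associated(tile))
         if (i >= tile%start_lon_idx .and. i < tile%start_lon_idx + tile_nx .and. &
             j >= tile%start_lat_idx .and. j < tile%start_lat_idx + tile_ny) then
            return    ! cache hit: pixel data already available
         end if
         tile => tile%next
      end do

      ! Cache miss: read both topography and land-use data for the tile that
      ! contains (i, j), then push the new tile onto the head of the list
      allocate(tile)
      tile%start_lon_idx = ((i - 1) / tile_nx) * tile_nx + 1
      tile%start_lat_idx = ((j - 1) / tile_ny) * tile_ny + 1
      ! ... read this tile's topo and landuse arrays from disk here ...
      tile%next => head
      head => tile
   end function find_tile

end module gwd_tile_cache
```

In the actual PR, the get_box/get_tile_from_box_point pair plays this role, with the real tile I/O in place of the placeholder comment; the sketch is only meant to show why memory use drops to one tile per cached entry rather than the entire global dataset per rank.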

@abishekg7 (Collaborator, Author)

Tested on Eris (GNU) with a 24-km grid, and on Derecho (GNU) with 15-km and 24-km grids. The Derecho NVHPC build and run are successful for the 15-km grid.

@abishekg7 (Collaborator, Author)

Tested on a 5-km grid as well.

@mgduda (Contributor) left a comment


I've done some testing with a 120-km global mesh and all seems good. I'll do further testing with a variable-resolution regional mesh. Meanwhile, I've left a few comments that mostly concern style.

[20 review comment threads on src/core_init_atmosphere/mpas_init_atm_gwd.F, most marked outdated/resolved]
@abishekg7 (Collaborator, Author)

Thanks for the additional review @mgduda! I have addressed your comments with the last two commits.

@mgduda self-requested a review on December 19, 2024 at 00:16
@mgduda (Contributor) left a comment


Other than an update to a routine name in a comment block (as noted), I think we're in good shape. After addressing the routine name issue, feel free to rework the commit history for this PR so that it has one or a few commits with a clear and distinct purpose.

Additionally -- though it can be put off to a future PR -- it could be worth looking for simple ways of optimizing the get_box and/or get_tile_from_box_point routines.

[1 review comment thread on src/core_init_atmosphere/mpas_init_atm_gwd.F, marked outdated/resolved]
…on of GWDO statistics

This commit introduces code changes to improve the memory footprint of the computation of
Gravity Wave Drag Orography statistics in the init_atmosphere core. The previous approach
involved each MPI rank reading the entirety of the topography and land use tiles, which
could prevent fully subscribing all cores in a node. The new approach reads in only one
tile at a time, and only when it encounters a pixel whose data is not already available in
a linked list. The new algorithm is able to fully subscribe all ranks in a node, leading to
better parallel performance.

The get_box subroutine is modified to call get_tile_from_box_point to check whether the
current pixel in the box is present in the linked list; if not, it reads both the topography
and land use data for that tile and appends the tile to the head of the list. This commit
also changes box, box_landuse, dxm, nx, and ny to be local variables instead of module
variables, which improves readability and brings advantages such as thread safety.

This commit also removes the extraneous conversion of the cell latitude from radians to
degrees and back to radians prior to the estimation of the zonal box size. The result is
numerically more correct, but it produces marginal differences from the previous approach.
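
As an aside, the self-contained sketch below illustrates the kind of round trip removed here. The variable names, the assumed sphere radius and points-per-degree constants, and the exact expression for the zonal box size are assumptions for illustration and may differ from the actual code.

```fortran
program zonal_box_size
   implicit none
   integer, parameter :: RKIND = selected_real_kind(12)
   real(kind=RKIND), parameter :: Pi = 3.141592653589793_RKIND
   real(kind=RKIND), parameter :: Re = 6371229.0_RKIND       ! assumed sphere radius (m)
   real(kind=RKIND), parameter :: pts_per_deg = 120.0_RKIND  ! assumed 30-arc-second source data
   real(kind=RKIND) :: lat, lat_deg, dx_old, dx_new

   lat = 0.7_RKIND    ! cell latitude in radians

   ! Before: round-trip the latitude through degrees before estimating the
   ! zonal box size; the extra conversion adds needless rounding error.
   lat_deg = lat * 180.0_RKIND / Pi
   dx_old = 2.0_RKIND * Pi * Re * cos(lat_deg * Pi / 180.0_RKIND) / (360.0_RKIND * pts_per_deg)

   ! After: use the latitude in radians directly.
   dx_new = 2.0_RKIND * Pi * Re * cos(lat) / (360.0_RKIND * pts_per_deg)

   write(*,*) 'difference (m): ', dx_new - dx_old
end program zonal_box_size
```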
…last matching tile

This commit introduces an optimization for the lookup of tiles in the subroutine
get_tile_from_box_point by passing the most recent successful tile lookup to the next
iteration of the search. These changes substantially improve the single-core performance of
the compute_gwd_fields subroutine and also improve the parallel performance of this
computation.
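
The sketch below, continuing the illustrative gwd_tile_cache module from the earlier sketch, shows one way such a "last matching tile" hint can work; the procedure names are hypothetical and the actual get_tile_from_box_point interface may differ.

```fortran
   ! Continuation of the illustrative gwd_tile_cache module; names are hypothetical.

   logical function pixel_in_tile(tile, i, j)
      type(tile_type), pointer :: tile
      integer, intent(in) :: i, j
      pixel_in_tile = (i >= tile%start_lon_idx .and. i < tile%start_lon_idx + tile_nx .and. &
                       j >= tile%start_lat_idx .and. j < tile%start_lat_idx + tile_ny)
   end function pixel_in_tile

   ! Check the most recently matched tile first: neighbouring pixels in a box
   ! almost always fall in the same tile, so this avoids walking the whole list.
   function find_tile_with_hint(head, hint, i, j) result(tile)
      type(tile_type), pointer :: head
      type(tile_type), pointer :: hint        ! last successful lookup, updated on return
      integer, intent(in) :: i, j
      type(tile_type), pointer :: tile

      if (associated(hint)) then
         if (pixel_in_tile(hint, i, j)) then
            tile => hint
            return
         end if
      end if

      tile => find_tile(head, i, j)   ! full list search, reading the tile on a miss
      hint => tile
   end function find_tile_with_hint
```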
@abishekg7 force-pushed the atmosphere/reduce_gwd_mem_usage branch from 650a60c to 7ad8bf6 on January 11, 2025 at 17:50