adding a function to cleanup cached communicators to avoid mem leak
kab163 committed Feb 3, 2025
1 parent 02df2c1 commit 7341a97
Showing 5 changed files with 38 additions and 2 deletions.
6 changes: 6 additions & 0 deletions .gitlab/jobs/lassen.yml
@@ -31,6 +31,12 @@ ibm_clang_14_0_5_mpi_shmem:
    SPEC: "~shared +tools tests=basic +ipc_shmem +mpi %clang@=14.0.5.ibm.gcc.8.3.1 ^spectrum-mpi"
  extends: .job_on_lassen

ibm_clang_14_0_5_mpi_shmem_memleak:
  variables:
    SPEC: "~shared +asan +tools tests=basic +ipc_shmem +mpi %clang@=14.0.5.ibm.gcc.8.3.1 cxxflags==-fsanitize=address ^spectrum-mpi"
    ASAN_OPTIONS: "detect_leaks=1"
  extends: .job_on_lassen

ibm_clang_14_0_5_mpi:
  variables:
    SPEC: "~shared +fortran +tools +mpi tests=basic %clang@=14.0.5.ibm.gcc.8.3.1 ^spectrum-mpi"
7 changes: 7 additions & 0 deletions docs/sphinx/cookbook/shared_memory.rst
@@ -64,6 +64,7 @@ which set it apart from other Umpire allocators.
2. If you want to see how much memory is available for a shared memory allocator, use the ``getActualSize()`` function.
3. File descriptors are used for the shared memory. These files will be under ``/dev/shm``.
4. Although Umpire does not need to have MPI enabled in order to provide IPC Shared Memory, if users wish to associate shared memory with MPI communicators, Umpire will need to be built with MPI enabled.
5. Memory pools are unlikely to be useful on top of a shared memory allocator. A shared memory allocator is already pool-like: the size you give it at creation is the "chunk" of memory you have to work with, and the allocator manages that chunk for you. Therefore, we *do not* recommend using pools on top of shared memory allocators.

There are a few helper functions provided in the ``Umpire.hpp`` header that will be useful when working with
Shared Memory allocators. For example, you can grab the MPI communicator for a particular Shared Memory allocator with:
@@ -73,6 +74,12 @@ Shared Memory allocators. For example, you can grab the MPI communicator for a p
MPI_Comm shared_allocator_comm = umpire::get_communicator_for_allocator(node_allocator, MPI_COMM_WORLD);
Note that the ``node_allocator`` is the Shared Memory allocator we created above.

.. warning::
   If you use the ``umpire::get_communicator_for_allocator(...)`` helper function, you MUST
   also call ``umpire::cleanup_cached_communicators()`` before you call ``MPI_Finalize()``
   to avoid memory leaks.

Additionally, we can double check that an allocator has the ``SHARED`` memory resource by asserting:

.. code-block:: cpp
1 change: 1 addition & 0 deletions examples/cookbook/recipe_shared_memory.cpp
@@ -106,6 +106,7 @@ int main(int ac, char** av)
  named_node_allocator.deallocate(ptr2);

  if (use_mpi) {
    umpire::cleanup_cached_communicators(); // Frees the shared_allocator_comm created above
    MPI_Finalize();
  }
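
The ordering requirement in this recipe (cached communicators are released before `MPI_Finalize()`) could also be enforced with scope guards, since C++ destroys local objects in reverse order of construction. The following is only a sketch under stated assumptions: `ScopeExit` and `run_with_guards` are hypothetical names that are not part of Umpire, and the two lambdas merely record their call order where real code would call `MPI_Finalize()` and `umpire::cleanup_cached_communicators()`.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard: runs the stored callable on destruction.
class ScopeExit {
 public:
  explicit ScopeExit(std::function<void()> fn) : m_fn(std::move(fn)) {}
  ~ScopeExit()
  {
    if (m_fn) {
      m_fn();
    }
  }
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;

 private:
  std::function<void()> m_fn;
};

// Hypothetical driver. In real code the first lambda would call
// MPI_Finalize() and the second umpire::cleanup_cached_communicators();
// here they only record the order in which they ran.
std::vector<std::string> run_with_guards()
{
  std::vector<std::string> calls;
  {
    // Constructed first, therefore destroyed last.
    ScopeExit finalize_guard([&calls] { calls.push_back("finalize"); });
    // Constructed second, therefore destroyed first.
    ScopeExit cleanup_guard([&calls] { calls.push_back("cleanup"); });

    calls.push_back("work");
  }  // guards run here: cleanup, then finalize
  return calls;
}
```

Declaring the finalize guard before the cleanup guard is what produces the required order; swapping the two declarations would run the cleanup after finalization.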

19 changes: 17 additions & 2 deletions src/umpire/Umpire.cpp
@@ -255,10 +255,16 @@ void* find_pointer_from_name(Allocator allocator, const std::string& name)
}

#if defined(UMPIRE_ENABLE_MPI)
MPI_Comm get_communicator_for_allocator(Allocator a, MPI_Comm comm)
{
namespace {
static std::map<int, MPI_Comm> cached_communicators{};

std::map<int, MPI_Comm>& get_cached_communicators() {
  return cached_communicators;
}
} // namespace

MPI_Comm get_communicator_for_allocator(Allocator a, MPI_Comm comm)
{
  MPI_Comm c;
  auto scope = a.getAllocationStrategy()->getTraits().scope;
  int id = a.getId();
@@ -277,6 +283,15 @@ MPI_Comm get_communicator_for_allocator(Allocator a, MPI_Comm comm)

  return c;
}

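void cleanup_cached_communicators()
{
  std::map<int, MPI_Comm>& comms = get_cached_communicators();

  // Iterate by reference so MPI_Comm_free resets the stored handle to
  // MPI_COMM_NULL instead of resetting a temporary copy.
  for (auto& c : comms) {
    MPI_Comm_free(&c.second);
  }

  // Drop the freed entries so a later lookup cannot return a stale handle.
  comms.clear();
}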
#endif

void register_external_allocation(void* ptr, util::AllocationRecord record)
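
A note on the loop in `cleanup_cached_communicators()`: `MPI_Comm_free` takes a pointer to the handle and sets that handle to `MPI_COMM_NULL`, so whether the loop variable is a copy or a reference decides whether the cached map entry is actually reset. The sketch below illustrates the difference without requiring MPI. `FakeComm`, `FAKE_COMM_NULL`, `fake_comm_free`, `free_by_value`, and `free_by_reference` are hypothetical stand-ins introduced only for this illustration, not Umpire or MPI names.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for MPI_Comm, MPI_COMM_NULL, and MPI_Comm_free so
// the sketch runs without MPI. Like MPI_Comm_free, fake_comm_free resets the
// handle it is given to the null value.
using FakeComm = int;
constexpr FakeComm FAKE_COMM_NULL = -1;

void fake_comm_free(FakeComm* comm)
{
  *comm = FAKE_COMM_NULL;
}

// Iterating by value: each `c` is a copy of a map entry, so only the copy is
// reset and the cache keeps the old handle value.
std::map<int, FakeComm> free_by_value(std::map<int, FakeComm> cache)
{
  for (auto c : cache) {
    fake_comm_free(&c.second);
  }
  return cache;
}

// Iterating by reference: the handle stored in the cache itself is reset.
std::map<int, FakeComm> free_by_reference(std::map<int, FakeComm> cache)
{
  for (auto& c : cache) {
    fake_comm_free(&c.second);
  }
  return cache;
}
```

With real MPI handles, the by-value form still releases the underlying communicator, but the cache keeps a handle that no longer refers to a valid communicator, which matters if the cleanup function is called twice or the cache is consulted again afterwards.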
7 changes: 7 additions & 0 deletions src/umpire/Umpire.hpp
@@ -189,7 +189,14 @@ umpire::MemoryResourceTraits get_default_resource_traits(const std::string& name
void* find_pointer_from_name(Allocator allocator, const std::string& name);

#if defined(UMPIRE_ENABLE_MPI)
/*!
 * \brief Return the MPI communicator for a shared memory allocator.
 *
 * NOTE: Using this function will REQUIRE users to call the
 * cleanup_cached_communicators() function to avoid memory leaks.
 */
MPI_Comm get_communicator_for_allocator(Allocator a, MPI_Comm comm);

/*!
 * \brief Free the MPI communicators cached by get_communicator_for_allocator().
 *
 * Call this before MPI_Finalize().
 */
void cleanup_cached_communicators();
#endif

void register_external_allocation(void* ptr, util::AllocationRecord record);