This is the bwap-lib module, a library for dynamic page placement in NUMA nodes.
The library will move pages between the NUMA nodes during your program's execution in order to speed it up.
A simple explanation of the library is as follows:
1. Place application pages according to a weighted interleaving strategy by default (optimizing for bandwidth).
2. Analyze the program's memory mapping via the `/proc/self/maps` file. This includes the .data and BSS segments, as well as dynamic memory mappings.
3. Use hardware performance counters to monitor the average resource stall rate.
4. Move some pages from remote (non-worker) NUMA nodes into local (worker) NUMA nodes. This reduces the available bandwidth but also places pages closer to where they are requested, which may improve performance if memory access latency is the bottleneck.
5. If we see a drop in the average resource stall rate, we go back to step 4. A lower stall rate means the CPU spends less time idle waiting for a resource. We assume that the loss in memory bandwidth is compensated by a lower access latency.
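The feedback loop in the last two steps can be sketched as follows. This is a minimal illustration, not the library's actual code: `tune_placement`, `TunerResult`, and the two callbacks are hypothetical names, and the stall-rate measurement and page migration are passed in as functions so the loop can be read (and tried) without NUMA hardware.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch of the tuning loop; none of these names come from bwap-lib.
struct TunerResult {
  int steps_taken;         // migration steps that lowered the stall rate
  double best_stall_rate;  // lowest average stall rate observed
};

// measure_stall_rate: returns the current average resource stall rate.
// migrate_step: moves one batch of pages from remote to worker nodes and
//               returns false when there is nothing left to move.
inline TunerResult tune_placement(
    const std::function<double()>& measure_stall_rate,
    const std::function<bool()>& migrate_step,
    int max_steps = 100) {
  assert(max_steps >= 0);
  // Baseline, measured under the default weighted interleaving.
  double best = measure_stall_rate();
  int steps = 0;
  while (steps < max_steps) {
    if (!migrate_step()) break;  // no remote pages left to migrate
    const double current = measure_stall_rate();
    // Stop as soon as the stall rate no longer drops. The last batch is not
    // rolled back in this sketch.
    if (current >= best) break;
    best = current;
    ++steps;
  }
  return {steps, best};
}
```

With measured stall rates of 10, 8, 6 and then 7, this loop keeps two migration steps and stops at the third, because 7 is not lower than 6.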
See the figure below for the architecture of BWAP, where the DWP tuner represents the bwap-lib module.
- `cmake` -- version 3.5 or newer
- A modern C++ compiler
  - We have used `gcc` 8 during our testing
  - `gcc` from version 6 compiles the program, but binaries haven't been tested
  - `clang` from version 6 compiles the program, but binaries haven't been tested
- `libnuma-dev` -- for the `numa.h` and `libnuma.h` headers
- `likwid` library for performance monitoring: https://github.com/RRZE-HPC/likwid
- `libboost-all-dev` -- boost library
- `cmake .` to generate a Makefile
- `make` to build the library and tests
You can opt to use the library with or without modifying your program.
Preload the library to run alongside your program via `LD_PRELOAD`:

`LD_PRELOAD=/path/to/libunstickymem.so ./myProgram`
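Preloading works without changes to the program because the dynamic loader maps the library before the program starts, and a shared library can run its own startup code at that point. A minimal sketch of one common way to do this, using the GCC/Clang `constructor` function attribute, is shown below. Whether bwap-lib uses exactly this mechanism is an assumption, and all names in the block are hypothetical.

```cpp
#include <cstdio>

namespace sketch {

// Becomes true once the startup hook below has run.
static bool g_started = false;

// GCC/Clang extension: this function runs when the object is loaded,
// before main() of the host program is entered.
__attribute__((constructor)) static void on_library_load() {
  g_started = true;
  std::fprintf(stderr, "[sketch] loaded; a tuner thread could be started here\n");
}

}  // namespace sketch
```

Built as a shared object and passed through `LD_PRELOAD`, code like this would print its message before the host program's `main()` runs.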
- Include the library header in your program: `#include <unstickymem/unstickymem.h>`
- Call at least one function (otherwise `gcc` won't bother to actually link the library into your executable)
  - See the available functions in `unstickymem/unstickymem.h`
- Compile your program
You can make the library generally available to any user in the system.
`make install` installs the library and required header files in your system.
Run `make uninstall` to undo the effects of `make install`.
There are a few options that can change the behavior of the library.
These are specified via environment variables; check them in `unstickymem.ini`. More information on this is coming soon.
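As an illustration of how an environment-variable option can be read with a default, here is a minimal sketch. The helper `env_option_or` and the variable name used below are hypothetical; the real option names are the ones listed in `unstickymem.ini`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper, not part of bwap-lib: read an integer option from the
// environment, falling back to a default when the variable is unset, empty,
// or not a plain decimal integer.
inline long env_option_or(const char* name, long fallback) {
  assert(name != nullptr);
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') return fallback;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  return (*end == '\0') ? value : fallback;  // reject trailing garbage
}
```

For example, `env_option_or("SKETCH_EXAMPLE_OPTION", 2)` yields 2 unless the (made-up) variable `SKETCH_EXAMPLE_OPTION` is set to a valid integer.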
- We are using the `CMake` build system for this library.
- `src` contains all source files.
- `include` contains all header files. Each library uses its own subfolder in order to reduce collisions when installed in a system (following the Google C++ Style Guide).
- A few example programs are included in the `test` subfolder.
- The higher-level logic is found in `unstickymem.cpp`.
- The logic to view/parse/modify the process memory map is in `MemoryMap.cpp` & `MemorySegment.cpp`.
- The logic to deal with hardware performance counters is in `PerformanceCounters.cpp`.
- Utility functions to simplify page placement and migration are in `PagePlacement.cpp`.
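To give an idea of what parsing the process memory map involves, the sketch below reads one line of `/proc/self/maps` (address range, permissions, offset, device, inode, optional path) into a small record. `MapEntry` and `parse_maps_line` are hypothetical names for illustration only and do not reflect the actual classes in `MemoryMap.cpp` or `MemorySegment.cpp`.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical record for one line of /proc/self/maps (not bwap-lib's type).
struct MapEntry {
  std::uint64_t start = 0;
  std::uint64_t end = 0;
  std::string perms;
  std::string path;  // empty for anonymous mappings

  std::uint64_t size() const { return end - start; }
  bool writable() const { return perms.size() > 1 && perms[1] == 'w'; }
};

// Parses a line such as
//   "00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/dbus-daemon"
// Returns false if the line does not match the expected layout.
inline bool parse_maps_line(const std::string& line, MapEntry& out) {
  std::istringstream in(line);
  std::string range, offset, dev, inode;
  if (!(in >> range >> out.perms >> offset >> dev >> inode)) return false;
  const std::size_t dash = range.find('-');
  if (dash == std::string::npos) return false;
  try {
    // Addresses are hexadecimal without a 0x prefix.
    out.start = std::stoull(range.substr(0, dash), nullptr, 16);
    out.end = std::stoull(range.substr(dash + 1), nullptr, 16);
  } catch (const std::exception&) {
    return false;
  }
  out.path.clear();
  std::getline(in >> std::ws, out.path);  // the path column may be absent
  return out.start <= out.end;
}
```

A caller would typically open `/proc/self/maps` with `std::ifstream`, feed each line to `parse_maps_line`, and keep the writable entries as candidates for page migration; that selection policy is an assumption of this sketch.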
If you find a bug or would like a feature added, please file a new Issue!
Check for issues with the `help-wanted` tag -- these are usually ideal as first issues, or mark places where development has been hampered.
For more information and results, see our original paper: Bandwidth-Aware Page Placement in NUMA (https://arxiv.org/abs/2003.03304), accepted at the 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2020.