diff --git a/Profiling.html b/Profiling.html new file mode 100644 index 000000000..7fcd91bc0 --- /dev/null +++ b/Profiling.html @@ -0,0 +1,206 @@ + + +
+ + + + +
+ IPPL (Independent Parallel Particle Layer)
+
+ IPPL
+ |
+
In certain applications, you might want to use profiling tools for debugging and testing. Since IPPL uses Kokkos as a backend, you can leverage Kokkos' built-in profiling tools.
+This guide explains how to use Kokkos' profiling tools, using the MemoryEvents tool as an example.
+MemoryEvents tracks a timeline of allocation and deallocation events in Kokkos Memory Spaces. For each event it records the time, pointer, size, memory space name, and allocation name. This is particularly useful when debugging memory usage, because it shows where memory is being allocated.
+Additionally, the tool provides a timeline of memory usage for each individual Kokkos Memory Space.
+The tool is located at: https://github.com/kokkos/kokkos-tools/tree/develop/profiling/memory-events
+First, clone the Kokkos tools repository, which contains a variety of profiling tools:
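A minimal sketch of the clone step. The repository URL is derived from the tool link above; the function name `clone_kokkos_tools` and the default target directory `kokkos-tools` are illustrative choices, not part of Kokkos Tools. The command is wrapped in a function so that the snippet can be sourced without immediately touching the network.

```shell
# Hypothetical helper: clone the Kokkos Tools repository.
# $1 (optional) is the target directory; defaults to "kokkos-tools".
clone_kokkos_tools() {
    local dest="${1:-kokkos-tools}"
    git clone https://github.com/kokkos/kokkos-tools.git "$dest"
}
```

Calling `clone_kokkos_tools` with no arguments creates a `kokkos-tools` directory in the current working directory.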
Navigate into the repository and build the tools using CMake:
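One possible way to script the build, assuming a CMake version that supports the `-S`/`-B` options (3.13 or newer) and a checkout located in `kokkos-tools`. The function name `build_kokkos_tools` and the `build` subdirectory are illustrative. Instead of changing into the repository, the sketch passes the source and build directories to CMake explicitly, then searches the build tree for the MemoryEvents library, because the exact file name and location can depend on how the tools were built.

```shell
# Hypothetical helper: configure and build Kokkos Tools out of source.
# $1 (optional) is the path to the checkout; defaults to "kokkos-tools".
build_kokkos_tools() {
    local src="${1:-kokkos-tools}"
    # Configure into <src>/build, then compile.
    cmake -S "$src" -B "$src/build" || return 1
    cmake --build "$src/build" || return 1
    # Print the path of the built MemoryEvents library, whatever its exact name.
    find "$src/build" -name '*kp_memory_events*'
}
```

The directory printed by the last command is the one to use as the tool directory when setting up the environment.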
+Before running your application, export the Kokkos Tools environment variable to point to the kp_memory_events.so
tool:
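A sketch of the export, assuming a recent Kokkos release that reads the `KOKKOS_TOOLS_LIBS` variable (older releases used `KOKKOS_PROFILE_LIBRARY` for the same purpose). `{PATH_TO_TOOL_DIRECTORY}` is a literal placeholder here, not shell syntax.

```shell
# Point Kokkos at the MemoryEvents tool library.
# {PATH_TO_TOOL_DIRECTORY} is a placeholder and must be edited before use.
export KOKKOS_TOOLS_LIBS="{PATH_TO_TOOL_DIRECTORY}/kp_memory_events.so"
```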
Replace {PATH_TO_TOOL_DIRECTORY}
with the actual path where the tool is located.
Execute your application normally. The MemoryEvents tool will automatically collect data during execution. For example:
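As an illustration, the sketch below wraps an arbitrary command so that the tool is enabled for that command only. The helper name `profile_run` is hypothetical, and `KOKKOS_TOOLS_LIBS` is assumed to be the variable your Kokkos version reads. The application and its arguments are passed through unchanged.

```shell
# Hypothetical helper: run one command with the MemoryEvents tool enabled.
# $1 is the tool directory; the remaining arguments are the command to run.
profile_run() {
    local tool_dir="$1"
    shift
    # The assignment applies to this one command only.
    KOKKOS_TOOLS_LIBS="${tool_dir}/kp_memory_events.so" "$@"
}
```

For instance, `profile_run {PATH_TO_TOOL_DIRECTORY} ./LandauDamping` followed by the usual application arguments would run the application with the tool loaded.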
+The MemoryEvents tool will generate the following files:
+HOSTNAME-PROCESSID.mem_events: Lists memory events.
+HOSTNAME-PROCESSID-MEMSPACE.memspace_usage: Provides a utilization timeline for each active memory space.
+Here’s an example of how to run the profiling with a SLURM system using sbatch
:
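A minimal sketch of such a submission. The snippet writes a small job script named `profile_job.sh` (a hypothetical file name) and prints the command to submit it. It assumes that `KOKKOS_TOOLS_LIBS` is the variable your Kokkos version reads, that the `LandauDamping` executable sits in the submission directory, and that `{PATH_TO_TOOL_DIRECTORY}` is replaced by hand. Application arguments are forwarded from the `sbatch` command line rather than hard-coded.

```shell
# Write a job script that enables the MemoryEvents tool and launches the application.
cat > profile_job.sh <<'EOF'
#!/bin/bash
# Placeholder path: replace {PATH_TO_TOOL_DIRECTORY} with the real tool directory.
export KOKKOS_TOOLS_LIBS="{PATH_TO_TOOL_DIRECTORY}/kp_memory_events.so"
# Arguments given after the script name on the sbatch command line end up in "$@".
srun ./LandauDamping "$@"
EOF
chmod +x profile_job.sh

echo "Submit with: sbatch -n 2 profile_job.sh"
```

Because the variable is exported inside the job script, every task started by `srun` inherits it, and each process writes its own set of output files.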
In this example:
+sbatch -n 2 requests 2 tasks (use -N to request a number of nodes instead).
+The job runs the LandauDamping application.
+This guide provides the basic steps for integrating Kokkos profiling tools into your IPPL-based projects. You can adjust the commands as needed depending on your specific application and environment.
+Consider the following code:
+This will produce the following output:
+HOSTNAME-PROCESSID.mem_events
+HOSTNAME-PROCESSID-Cuda.memspace_usage
+HOSTNAME-PROCESSID-CudaUVM.memspace_usage
+HOSTNAME-PROCESSID-CudaHostPinned.memspace_usage
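To make the naming pattern concrete, here is a small sketch that lists the usage files in a directory together with the memory space each one belongs to. The helper name `list_memspace_usage` is hypothetical, and the parsing assumes that the memory space name itself contains no `-` character, which holds for the names shown above.

```shell
# Hypothetical helper: print "<memory space> <file>" for each .memspace_usage file.
# $1 (optional) is the directory to scan; defaults to the current directory.
list_memspace_usage() {
    local dir="${1:-.}"
    local f base
    for f in "$dir"/*.memspace_usage; do
        # Skip the unexpanded pattern when no file matches.
        if [ -e "$f" ]; then
            base=$(basename "$f" .memspace_usage)
            # The memory space is the text after the last '-' in the file name.
            printf '%s %s\n' "${base##*-}" "$f"
        fi
    done
}
```

Run in the directory containing the files above, this would print one line each for `Cuda`, `CudaUVM`, and `CudaHostPinned`.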
+