Add Blog Post on Autotuning for Kokkos with APEX (#130)
* Create blog-post-09.md: Kokkos-autotuning with APEX

* Update blog-post-09.md: updating with blog edits for brevity

* Update blog-post-09.md: shorten

* adding blog post auto-tuning plot

* add apex-logo for blog post 9

* Update blog-post-09.md

* Update blog-post-09.md

* Add files via upload

* Update blog-post-09.md - APEX - Kokkos Logo

* Putting in APEX-tuning image

* blog-post-09.md: fix plot name reference and the release date

* Add files via upload

* Add files via upload

* Update blog-post-09.md: fix filenames for images

* fixing images; delete unneeded

* blog-post-09.md: fixing corrections of wording given PR review from dalg24

* blog-post-09.md: change title to be more meaningful and long-lasting, as per suggestion in dalg24's PR review

* removing package-lock.json, which was added erroneously

* package-lock.json: ensuring no changes have been made to this file (copied from main branch on kokkos owner's repo).

* blog-post-09.md: fixing hyperreference for Wiki Post

* Update blog-post-09.md: make last sentence broader

Remove mention of Kokkos Tool APEX connector and replace with Auto-tuning capabilities. Note that the title spelling of Auto-Tuning should actually be Auto-tuning, and this is also fixed.

* Update content/blog/blog-post-09.md

Co-authored-by: Daniel Arndt <[email protected]>

* Update content/blog/blog-post-09.md

Co-authored-by: Daniel Arndt <[email protected]>

---------

Co-authored-by: Daniel Arndt <[email protected]>
vlkale and masterleinad authored Jan 8, 2025
1 parent f804d49 commit f0d1863
Showing 3 changed files with 37 additions and 0 deletions.
Binary file added assets/img/blog/2024-autotuning/APEX-tuning.png
Binary file added assets/img/blog/2024-autotuning/apex-kokkos.jpg
37 changes: 37 additions & 0 deletions content/blog/blog-post-09.md
@@ -0,0 +1,37 @@
---
authors: ["kokkos-team"]
title: "Kokkos 4.5 Release Introduces New Auto-tuning Features"
date: 2024-12-11
tags: ["blog"]
thumbnail: img/blog/2024-autotuning/apex-kokkos.jpg
---

# Motivation

By default, internal Kokkos execution space parameters are hand-tuned, empirically or heuristically, to fixed values that provide "one-size-fits-most" performance. The goal is to minimize the effect of abstraction overhead and to approximate the performance of an optimized, lower-level backend implementation. Can these parameters instead be tuned automatically for a particular application and architecture, so that programmers can easily pursue further performance opportunities? Auto-tuning capabilities for Kokkos programmers [5], released in Kokkos 4.5 [4], offer an answer.

# How it Works

Kokkos includes a Tuning API that can be used to construct a tuning context around a computational kernel, declare input variables that define the context state, declare output variables to be tuned, and request values for those output variables when the kernel is executed. The Kokkos Tools infrastructure [1] provides integrated support for using this API while a Kokkos application executes, i.e., online rather than offline [2]. APEX uses Kokkos Tools and the Tuning API together to tune, at runtime, the parameters of Kokkos kernels running in any execution space / policy combination. We note that this Kokkos auto-tuning capability from APEX allows for (a) using its tuning heuristics to switch between Kokkos execution spaces (e.g., choosing between Serial and OpenMP depending on the problem size) or execution policies and (b) auto-tuning any arbitrary parameter within an application that uses Kokkos: solver choices, algorithmic parameters, tolerances, etc.
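
The flow described above (open a context, request a value for the tuned variable, run the kernel, report the result back) can be sketched with a toy tuner. This is a minimal, self-contained illustration and not the Kokkos Tuning API or APEX's search strategy: `ToyTuner`, `tune_online`, and `example_cost` are hypothetical names, the candidate values are made up, and the cost function is a deterministic stand-in for a timed kernel. The sketch assumes a simple strategy that tries every candidate once and then keeps the cheapest one.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-in for an online tuner; NOT the Kokkos Tuning API or APEX.
// It hands out each candidate once (exploration), then keeps returning the
// cheapest candidate seen so far (exploitation).
class ToyTuner {
 public:
  explicit ToyTuner(std::vector<int> candidates)
      : candidates_(std::move(candidates)),
        costs_(candidates_.size(), std::numeric_limits<double>::infinity()) {
    assert(!candidates_.empty());
  }

  // Plays the role of "request output variables" when a tuning context opens.
  int request_value() {
    current_ = (next_ < candidates_.size()) ? next_++ : best_;
    return candidates_[current_];
  }

  // Plays the role of closing the context: record the cost of the value just used.
  void report_cost(double cost) {
    if (cost < costs_[current_]) costs_[current_] = cost;
    if (costs_[current_] < costs_[best_]) best_ = current_;
  }

  int best_value() const { return candidates_[best_]; }

 private:
  std::vector<int> candidates_;
  std::vector<double> costs_;  // lowest cost observed per candidate
  std::size_t next_ = 0;       // next unexplored candidate
  std::size_t current_ = 0;    // candidate handed out most recently
  std::size_t best_ = 0;       // cheapest candidate so far
};

// Runs `iterations` simulated kernel launches and returns the value the tuner
// settles on. `cost_model` replaces a real timing measurement.
int tune_online(const std::vector<int>& candidates,
                const std::function<double(int)>& cost_model,
                int iterations) {
  ToyTuner tuner(candidates);
  for (int i = 0; i < iterations; ++i) {
    const int value = tuner.request_value();  // open context, get value to try
    const double cost = cost_model(value);    // "run the kernel" with that value
    tuner.report_cost(cost);                  // close context, feed cost back
  }
  return tuner.best_value();
}

// Hypothetical cost curve with its minimum at 128; not measured data.
double example_cost(int value) {
  const double d = static_cast<double>(value - 128);
  return 1.0 + d * d / 1024.0;
}
```

With this hypothetical cost curve, `tune_online({32, 64, 128, 256, 512}, example_cost, 20)` settles on 128 after trying each candidate once; the remaining launches reuse that value, which is how tuning during execution can repay its exploration overhead.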

# Outcomes

Our experiments have shown that, in most cases, a run with active tuning is still faster than the default, untuned configuration despite the overhead of exploring the search space. Figure 1, below, shows how the Kokkos Tools APEX auto-tuning connector adjusts the occupancy for a Kokkos parallel_for in a Kokkos benchmark [3] via APEX’s auto-tuning capabilities. From the figure, we see that the tuning parameter value converges to the best-performing value halfway through the Kokkos application’s execution.

{{< image src="img/blog/2024-autotuning/APEX-tuning.png" style="float: center; height=10">}}

For an in-depth example of how to use the Kokkos Tools runtime auto-tuning API with the APEX performance measurement and runtime adaptation tool, see this [Wiki Post](https://github.com/UO-OACISS/apex/wiki/Kokkos-Runtime-Auto-Tuning-with-APEX).

The Kokkos team welcomes users to try the Kokkos Tools APEX auto-tuning capabilities and to share feedback on their auto-tuning needs. The team is actively working on new auto-tuning features, including a new flag for Kokkos executables, ML-guided auto-tuning, per-MPI-process auto-tuning, and the use of feedback from performance monitoring software such as LDMS.

# References

[1] Kokkos Tools library: [https://github.com/kokkos/kokkos-tools](https://github.com/kokkos/kokkos-tools)

[2] GPTune for Kokkos Albany: [https://linkinghub.elsevier.com/retrieve/pii/S0377042723001668](https://linkinghub.elsevier.com/retrieve/pii/S0377042723001668)

[3] Kokkos Occupancy Tuning Benchmark: [https://github.com/khuck/apex-kokkos-tuning/blob/main/tests/occupancy.cpp](https://github.com/khuck/apex-kokkos-tuning/blob/main/tests/occupancy.cpp)

[4] Kokkos 4.5 Release Briefing: [https://github.com/kokkos/kokkos-tutorials/blob/main/Other/ReleaseBriefings/release-45.pdf](https://github.com/kokkos/kokkos-tutorials/blob/main/Other/ReleaseBriefings/release-45.pdf)

[5] Autonomic Performance Environment for eXascale (APEX): [https://github.com/UO-OACISS/apex](https://github.com/UO-OACISS/apex)
