Add section to docs about false sharing
bennibolm committed Jan 30, 2024
1 parent f4e6e49 commit 193fa9f
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions docs/src/performance.md
@@ -267,3 +267,13 @@ requires. It can thus be seen as a proxy for "energy used" and, as an extension,
timing result, you need to set the analysis interval such that the
`AnalysisCallback` is invoked at least once during the course of the simulation and
discard the first PID value.

## Performance issues due to false sharing
False sharing is a known performance issue on systems with distributed caches. It also occurred for
the implementation of a thread-parallel bounds checking routine for the subcell IDP limiting
in [PR #1736](https://github.com/trixi-framework/Trixi.jl/pull/1736).
After some [experimentation and discussion](https://github.com/trixi-framework/Trixi.jl/pull/1736#discussion_r1423881895)
it turned out that initializing a vector of length `n * Threads.nthreads()` and only using every
n-th entry instead of a vector of length `Threads.nthreads()` fixes the problem.
Since cache lines on common processors are at most 128 bytes, we use `n = 128B / sizeof(uEltype)`,
so that the entries used by different threads never share a cache line.
Now, the bounds checking routine of the IDP limiting scales as hoped.
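
The padding idea described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the names `stride_size` and `padded_maximum` are hypothetical, and the sketch assumes a plain `Vector` of real numbers as input. Each parallel chunk writes only to its own slot, and consecutive slots are `n` entries (128 bytes) apart.

```julia
using Base.Threads: @threads, nthreads

# Hypothetical helper: number of entries of type T that span 128 bytes.
stride_size(::Type{T}) where {T} = div(128, sizeof(T))

# Hypothetical example: thread-parallel maximum with padded per-chunk slots.
function padded_maximum(data::Vector{T}) where {T <: Real}
    n = stride_size(T)
    nchunks = nthreads()
    # Length `n * nthreads()` instead of `nthreads()`; only every n-th entry is used.
    partial = fill(typemin(T), n * nchunks)
    @threads for c in 1:nchunks
        idx = n * (c - 1) + 1  # slot of chunk `c`, 128 bytes away from its neighbors
        for i in c:nchunks:length(data)
            partial[idx] = max(partial[idx], data[i])
        end
    end
    # The unused padding entries hold `typemin(T)` and do not affect the result.
    return maximum(partial)
end
```

For `Float64`, `stride_size` gives 16 entries per slot, for `Float32` it gives 32. The extra memory is small (128 bytes per thread), which is usually a cheap price for avoiding cache-line contention.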
