Skip to content

Commit

Permalink
Remove section in documentation
Browse files Browse the repository at this point in the history
  • Loading branch information
bennibolm committed Mar 21, 2024
1 parent de66a75 commit bcaad2b
Showing 1 changed file with 0 additions and 11 deletions.
11 changes: 0 additions & 11 deletions docs/src/performance.md
Original file line number Diff line number Diff line change
Expand Up @@ -282,14 +282,3 @@ requires. It can thus be seen as a proxy for "energy used" and, as an extension,
timing result, you need to set the analysis interval such that the
`AnalysisCallback` is invoked at least once during the course of the simulation and
discard the first PID value.

## Performance issues with multi-threaded reductions
[False sharing](https://en.wikipedia.org/wiki/False_sharing) is a known performance issue
for systems with distributed caches. It also occurred for the implementation of a thread
parallel bounds checking routine for the subcell IDP limiting
in [PR #1736](https://github.com/trixi-framework/Trixi.jl/pull/1736).
After some [testing and discussion](https://github.com/trixi-framework/Trixi.jl/pull/1736#discussion_r1423881895),
it turned out that initializing a vector of length `n * Threads.nthreads()` and only using every
n-th entry instead of a vector of length `Threads.nthreads()` fixes the problem.
Since there are no processors with caches over 128B, we use `n = 128B / size(uEltype)`.
Now, the bounds checking routine of the IDP limiting scales as hoped.

0 comments on commit bcaad2b

Please sign in to comment.