Version 2.14.03

@chuckyount chuckyount released this 27 Sep 03:53
· 1075 commits to master since this release
245e61b

Adds a mini-block hierarchy level below blocks and above sub-blocks.
This separates the unit of work for OpenMP threads from the cache-block size:

  • Blocks, as before, are the units of work for top-level OpenMP threads. Blocks are evaluated in parallel within each region.
  • Mini-blocks are evaluated sequentially within each block and are typically sized for L2 caches.
    By default, mini-blocks are the same size as blocks, so most users will see no difference.
    Temporal blocking can be applied to both blocks and mini-blocks. By default, using '-bt' sets both.
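
The hierarchy described above can be illustrated with a minimal 1-D sketch. This is not YASK code: the names `tiles`, `eval_block`, and `eval_region` are hypothetical, a Python thread pool stands in for the top-level OpenMP threads, and the mini-block default is modeled on the behavior stated above.

```python
from concurrent.futures import ThreadPoolExecutor


def tiles(begin, end, size):
    """Split [begin, end) into consecutive (start, stop) tiles of at most `size` points."""
    return [(s, min(s + size, end)) for s in range(begin, end, size)]


def eval_block(block, mini_block_size):
    """Evaluate one block: a single worker visits its mini-blocks sequentially."""
    start, stop = block
    visited = []
    for mini_block in tiles(start, stop, mini_block_size):
        # A real kernel would update the stencil points of this mini-block here.
        visited.append(mini_block)
    return visited


def eval_region(begin, end, block_size, mini_block_size=None):
    """Evaluate a 1-D region: blocks in parallel, mini-blocks in order within each block."""
    if mini_block_size is None:
        # Assumed default, following the notes: mini-blocks are the same size as blocks.
        mini_block_size = block_size
    blocks = tiles(begin, end, block_size)
    with ThreadPoolExecutor() as pool:
        # Each block is one unit of work; map() returns results in block order.
        return list(pool.map(lambda b: eval_block(b, mini_block_size), blocks))
```

With the default, `eval_region(0, 8, 4)` gives one mini-block per block, which matches the old behavior. Passing a smaller mini-block size, for example `eval_region(0, 8, 4, 2)`, keeps the same two parallel units of work but walks each in two cache-sized pieces.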

Also removes loop-grouping parameters because they have not shown performance gains and are confusing to users.