Cooperative groups: add a boundscheck to avoid confusing inexact errors. (#2631)

Selecting an inactive lane results in fns(mask, lane) returning -1.
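
To illustrate the failure mode described above, here is a host-side sketch in plain Julia. It is not CUDA.jl's implementation: `fns_emulated` is a hypothetical helper, written under the assumption that the device intrinsic finds the n-th set bit of `mask` counting from `base` (inclusive) and yields -1 when no such bit exists. Only nonzero offsets are modeled.

```julia
# Hypothetical CPU emulation of "find n-th set bit" (assumed semantics, see above).
# Positive offsets search upward from `base`, negative offsets search downward.
function fns_emulated(mask::UInt32, base::Integer, offset::Integer)
    offset == 0 && throw(ArgumentError("offset 0 is not modeled in this sketch"))
    step = offset > 0 ? 1 : -1
    remaining = abs(offset)
    pos = base
    while 0 <= pos <= 31
        if isodd(mask >> pos)            # bit `pos` is set, i.e. the lane is active
            remaining -= 1
            remaining == 0 && return Int32(pos)
        end
        pos += step
    end
    return Int32(-1)                     # no such set bit: the "not found" sentinel
end

mask = 0x00000005                        # lanes 0 and 2 (0-based) are active

lane_ok  = fns_emulated(mask, 0, 2) + 1  # second active lane  -> 3 (1-based)
lane_bad = fns_emulated(mask, 0, 3) + 1  # no third active lane -> 0

println(lane_ok)   # prints 3
println(lane_bad)  # prints 0
```

Under these assumptions, adding 1 to the sentinel gives lane 0, which is exactly the value a `lane > 0` check rejects before the lane index reaches the shuffle.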
maleadt authored Jan 25, 2025
1 parent 159345f commit b2ee7e7
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions src/device/intrinsics/cooperative_groups.jl
@@ -429,6 +429,7 @@ function shfl(cg::coalesced_group, elem, src_rank)
     else
         CUDA.fns(cg.mask, 0, src_rank) + 1i32
     end
+    @boundscheck lane > 0 || throw(BoundsError(cg, src_rank))

     shfl_sync(cg.mask, elem, lane)
 end
@@ -439,6 +440,7 @@ function shfl_down(cg::coalesced_group, elem, delta)
     end

     lane = CUDA.fns(cg.mask, laneid() - 1i32, delta + 1i32) + 1i32
+    @boundscheck lane > 0 || throw(BoundsError(cg, laneid()+delta))
     if lane > 32
         lane = laneid()
     end
@@ -452,6 +454,7 @@ function shfl_up(cg::coalesced_group, elem, delta)
     end

     lane = CUDA.fns(cg.mask, laneid() - 1i32, -(delta + 1i32)) + 1i32
+    @boundscheck lane > 0 || throw(BoundsError(cg, laneid()-delta))
     if lane > 32
         lane = laneid()
     end