Commit

Remove redundant/incorrect nblock check
THargreaves committed Jan 7, 2025
1 parent 34a9c59 commit b994469
Showing 1 changed file with 1 addition and 4 deletions.
5 changes: 1 addition & 4 deletions lib/cublas/wrappers.jl
@@ -1219,10 +1219,7 @@ end
     offset = Base.elsize(strided) * stride

     ptrs = CuArray{CuPtr{T}}(undef, batchsize)
-    nblocks = min(
-        cld(batchsize, 1024),
-        CUDA.attribute(device(strided), CUDA.DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK)
-    )
+    nblocks = cld(batchsize, 1024)
     @cuda threads = 1024 blocks = nblocks create_ptrs_kernel!(ptrs, base_address, offset)
     return ptrs
 end