Certain broadcast expressions that previously executed on the GPU (on Julia 1.9.3) and returned a CuArray are instead triggering scalar indexing warnings (on Julia 1.10.1) and returning an Array.
To reproduce
The Minimal Working Example (MWE) for this bug:
using CUDA
d_test = CUDA.ones(5)
getindex.(Ref(d_test), keys(d_test))
Expected behavior
Based on previous Julia versions, the MWE should produce a CuVector{Float32}.
Version info
Details on Julia:
Julia Version 1.10.1
Commit 7790d6f064* (2024-02-13 20:41 UTC)
Build Info:
Note: This is an unofficial build, please report bugs to the project
responsible for this build and not to the Julia project unless you can
reproduce the issue using official builds available at https://julialang.org/downloads
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: 24 × AMD Ryzen 9 3900X 12-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver2)
Threads: 1 default, 0 interactive, 1 GC (on 24 virtual cores)
This change in behavior broke some more complicated broadcast expressions (the MWE was reduced from one of these). For now, I am working around the issue by specifying a CuArray destination, like this:
d_result .= getindex.(Ref(d_test), keys(d_test))
(but that means figuring out the output type and dimensions first, which adds a step during development/prototyping)
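For reference, a minimal sketch of that workaround applied to the MWE. It assumes the output has the same element type and shape as d_test, which is why `similar` is enough to allocate the destination here; for a real expression the type and dimensions would have to be worked out by hand.

```julia
using CUDA

d_test = CUDA.ones(5)

# Allocate the destination on the device up front. `similar` copies the
# element type and shape of d_test (an assumption that holds for this MWE).
d_result = similar(d_test)

# With a CuArray on the left-hand side, the in-place broadcast targets the GPU.
d_result .= getindex.(Ref(d_test), keys(d_test))
```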
Thanks!
This was a deliberate change; see JuliaGPU/GPUArrays.jl#510 for the rationale.
It's too bad this trips up your code, as I had hoped to sneak this in without having to tag a breaking release...
Thanks very much, makes sense. I like the clarity of the capture approach - it's easier to see the arguments that actually participate in broadcasting in a nontrivial way.
I'm updating my code, but in many cases all the "GPU-residing" objects are now captures. The MWE is such a case: keys(d_test) is a LinearIndices over (Base.OneTo(5),), not a GPU array, so the naive fix wouldn't work:
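using CUDA

function test()
    d_test = CUDA.ones(5)
    # keys(d_test) lives on the host, so this broadcast gets the default (CPU)
    # style and d_test[idx] falls back to scalar indexing.
    broadcast(keys(d_test)) do idx
        d_test[idx]
    end
end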
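One possible way around this, sketched under the assumption that copying the indices to the device is acceptable: make a CuArray of indices the broadcast argument, so that at least one GPU array participates, and keep d_test as a capture. The names `gather_on_device` and `d_idx` are hypothetical, and this gives up the "lightweight range" property discussed next.

```julia
using CUDA

# `gather_on_device` and `d_idx` are hypothetical names for this sketch.
function gather_on_device()
    d_test = CUDA.ones(5)
    # Copy the host-side indices to the device so that a CuArray takes part
    # in the broadcast and selects the GPU broadcast style.
    d_idx = CuArray(collect(keys(d_test)))
    broadcast(d_idx) do idx
        d_test[idx]  # d_test is captured by the closure
    end
end
```

The body is wrapped in a function so that d_test is a true closure capture rather than a reference to a global.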
This leads to a question I've been wanting to ask anyways:
Certain lightweight objects like OneTo(1000000) seem equally happy broadcasting on the host or the GPU (which is, I think, why cu(OneTo(1000000)) doesn't "move" anything to the device). Is there a way to opt into GPU execution? For broadcast! we can write
d_result .= foo.(OneTo(1000000))
For broadcast, is there anything easier than manually constructing a Broadcasted{CuArrayStyle} object?
> For broadcast, is there anything easier than manually constructing a Broadcasted{CuArrayStyle} object?
I don't know of anything like that, but I agree it would be useful to override the BroadcastStyle in a more ergonomic way. Maybe something to open an issue about upstream?
Additional context
On Julia 1.9.3,
Base.broadcasted(getindex, Ref(d_test), keys(d_test))
yields a
Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Nothing, typeof(getindex), Tuple{Base.RefValue{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, LinearIndices{1, Tuple{Base.OneTo{Int64}}}}}
On Julia 1.10.1, the same expression yields a
Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(getindex), Tuple{Base.RefValue{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, LinearIndices{1, Tuple{Base.OneTo{Int64}}}}}