-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GPUArraysCore? #409
Comments
A more lightweight interface package would be awesome! |
What makes GPUArrays heavyweight? I'd rather somebody with some package load-time optimization experience has a look rather than we introduce yet another package just for the sake of reducing load time. |
Currently, GPUArrays has a load time of about 1.3 s, which makes it a "heavy" package: julia> @time_imports import GPUArrays
17.1 ms ┌ Preferences
273.6 ms ┌ JLLWrappers
278.5 ms ┌ LLVMExtra_jll
3.4 ms ┌ CEnum
134.3 ms ┌ LLVM
0.6 ms ┌ Adapt
1292.1 ms GPUArrays Of that, about 400 ms is LLVM and 900 ms is GPUArrays itself (which is a lot for a single package, we have few that are that heavy): julia> @time_imports import LLVM, GPUArrays
17.5 ms ┌ Preferences
277.5 ms ┌ JLLWrappers
282.5 ms ┌ LLVMExtra_jll
3.4 ms ┌ CEnum
417.2 ms LLVM
0.5 ms ┌ Adapt
883.3 ms GPUArrays I may be pessimistic, but I don't think any kind of load time optimization can bring that down to a level that an interface package would have. LLVM alone may not be reducible in load time and GPUArrays provides a lot of array function specializations. |
Maybe an inversion of dependencies could also help? Looking at @ChrisRackauckas' list above: Now that we have a lightweight ArrayInterfaceCore.jl, maybe those specializations could be defined directly in GPUArrays.jl (which would then depend on ArrayInterfaceCore.jl)?
I've proposed adding some sliced-array interfaces to ArrayInterfaceCore.jl (JuliaLang/julia#32310 (comment)), I hope to have a draft with usage examples ready in a few days. Maybe if that interface would cover some array-of-arrays functionality in general, the code in RecursiveArrayTools.jl#L27-L29 could also move into GPUArrays.jl (which would seem a natural place). This is just ExponentialUtilities has a load time of about 120 ms on top of GPUArrays (about 1200 ms). It would make more sense for GPUArrays to depend on ExponentialUtilities than the other way round (current state). Maybe ExponentialUtilities itself could be lightened/optimized a bit? The This comes down to |
I realize that, I meant that it is not clear to me what functionality in GPUArrays inflates load time. Similarly, LLVM.jl shouldn't take 400ms to load, so I'd rather we look into that first before throwing dependencies out, because at some point (i.e. when you do want to use GPUs) you have to incur the load time anyway. |
But for us, that's the slim-ist minority of users. We have those bits of code in there for the 1% of people who ever want to try a GPU for it to work. So in almost no cases would that load time actually be incurred, it's only incurred because the code for checking GPU support of functions also requires that you load the entirely of LLVM.jl and the beginnings of the GPU stack. |
Sure, but the fact remains that LLVM.jl is intended to be a low-level/lightweight dependency that many packages can rely on (e.g. LoopVectorization should probably use it instead of manually mangling intrinsics and getting it wrong as it recently did), so it'd be more valuable to optimize its load time first. |
Splitting wouldn't take more than an hour or so and would be reversible in the future. That sounds like at least a year away. Instead of waiting much longer for that kind of intense fix, could a quick split be done to fix it now, and when LLVM.jl is lightweight return it? |
Yeah sure. I'm not opposed to a (temporary) split, I'd just like that the use case for GPU users is improved as well, and I don't have experience with load time latency optimization or the tools involved. So it'd be nice if somebody with such experience could have a look instead of papering over the immediate issue. |
Could there be a version that just defines the simplest stuff? GPUArrays.jl is too big to be a dependency everywhere, which makes Requires.jl usage a thing. We only need a few things in many places to get what's necessary:
https://github.com/JuliaArrays/ArrayInterface.jl/blob/master/lib/ArrayInterfaceGPUArrays/src/ArrayInterfaceGPUArrays.jl
https://github.com/SciML/RecursiveArrayTools.jl/blob/master/src/RecursiveArrayTools.jl#L27-L29
https://github.com/SciML/LinearSolve.jl/blob/8e31fd75dd10f478fca74fd46477f7b342590ab9/src/default.jl#L12
https://github.com/SciML/ExponentialUtilities.jl/blob/47f47b6f4ec95c3e88f94f2f41dd5f119f788f64/src/krylov_phiv.jl#L130
https://github.com/SciML/DiffEqNoiseProcess.jl/blob/03005350b6284fd8aaa8ca5374d990ea46bb3c80/src/wiener.jl#L19
https://github.com/SciML/DiffEqSensitivity.jl/blob/50fff85abca5452a90b155f8eb9833498d5b7aa4/src/concrete_solve.jl#L45
It would only need:
GPUArrays.AbstractGPUArray
GPUArrays.@allowscalar
And the entirety of SciML would then cut out the LLVM dependency and 1 second off every package. Some packages only have a 1 second using time, so it's almost entirely GPUArrays that's the issue now, and just for those two things.
The text was updated successfully, but these errors were encountered: