Support CUDA on Julia 1.9+ via a package extension. #32
Conversation
Codecov Report

Coverage Diff (main vs #32):
  Coverage:  89.60% → 88.18%  (-1.42%)
  Files:     3 → 4  (+1)
  Lines:     375 → 381  (+6)
  Hits:      336 → 336
  Misses:    39 → 45  (+6)
The tests fail on Julia 1.9 and later because the test environment specifies CUDA.jl 3, while the package extension has compat for CUDA.jl 4 and 5. There are some different ways forward, but it depends on the support ambitions for Julia and CUDA versions.
I think we should just drop pre-1.9 Julia support and also drop old CUDA.jl.
(force-pushed from a83257d to ab407a3)
Done in the second commit. I need to do some tests on my real use case, and it needs README updates and a breaking version bump, either to 0.4 or 1.0. I would recommend taking the opportunity to jump to 1.0, regardless of package maturity, but that's your call.
test/test_cuda_extension.jl (outdated)

    # This file is not included from `runtests.jl` nor run in CI.
    #
    # Run it with `julia test/test_cuda_extension.jl`. This requires that
    # Julia is installed with juliaup and will involve downloading a
    # lot of big artifacts. The output will contain lots of error messages
    # from caught errors; what matters is that all testsets pass.
This file looks pretty useful, thanks!
Thanks, 1.0 is a good idea!
From my point of view this is good to go. The CI failure on nightly is a CUDA.jl issue; it doesn't load in isolation either. Oh, there's a …
In theory that should be automatic (line 6 in dfb67d6).
Of course, feel free to make any other additions to the docs, but it is not required from my side.
Ok, then I'm happy with it as it is.
I tried to play with your branch locally. I set up an environment with

    ]dev ONNXRunTime
    ]add CUDA cuDNN
    import CUDA; CUDA.set_runtime_version!(v"11.8")
    exit()

However, when I do …, it gets to 99.1% and then just starts again at 0% in an infinite loop. Any ideas?
With CUDA runtime version …
It's nothing I've run into but I'm trying it out in a clean depot. If it's reproducible I'm sure the Pkg developers would love a bug report. Update: Failed to reproduce the problem.
This is expected, since you don't get v11 .so files with CUDA runtime v12, and ONNXRunTime tries to load v11.
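As a side note, an easy way to confirm which runtime CUDA.jl has actually selected is CUDA.jl's `runtime_version`; the output below assumes the preference set earlier in this thread:

```julia
import CUDA

# After CUDA.set_runtime_version!(v"11.8") and a Julia restart, this should
# report an 11.8.x version rather than a 12.x one.
@show CUDA.runtime_version()
```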
Could it be a network or package server issue? It kind of sounds like the artifact transmission gets interrupted and something decides to start over. It might be worth trying to switch to another package server, or disable it, to see if something changes. Or just try again and hope it was a transient problem.
Ok, let's make this stricter. The latest commit gives an error outside the officially supported range of CUDA runtime versions. It also centralizes the version data and adds consistency checks within the package.
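Not the actual code from that commit, but a minimal sketch of the kind of runtime-version guard being described, with hypothetical bounds:

```julia
import CUDA

# Hypothetical officially supported range; the package centralizes the real
# version data, and the actual bounds may differ.
const SUPPORTED_CUDA_RUNTIMES = (v"11", v"13")

function check_cuda_runtime()
    rt = CUDA.runtime_version()
    lo, hi = SUPPORTED_CUDA_RUNTIMES
    lo <= rt < hi || error(
        "CUDA runtime $rt is outside the officially supported range [$lo, $hi).")
    return rt
end
```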
What's the rush with 1.0? What would be reasonable, achievable goals for a first not-supposed-to-need-breaking-changes version?
Will restricting CUDA support to v11+ (via restrictions on CUDA.jl) rule out support for ONNXRuntime on Jetson Nano devices (which only support CUDA 10.2)? It might be that Nvidia supports a newer CUDA version on the Jetson Orin Nano (at 5x the price), but I think dropping support for a quite popular runtime platform from a quite popular runtime package sounds like a bad idea.
The main point of going to 1.0 is to get all three digits of semantic versioning and to be able to distinguish between new features and bugfixes. Does the current version work with the Jetson Nano?
The issue disappeared, but now I get … with …
If I try to import it directly, I get: …

CUDA itself seems to work with …
My best guess is that your artifacts are messed up. I've seen similar things at work occasionally, and then it has turned out that there is some empty artifact directory, possibly the result of some Pkg operation being interrupted at a bad time. Just removing the empty directory has solved the problem, since Pkg in that case automatically downloads the missing artifact. A more brutal approach is to just delete all of …. A more sophisticated approach is to try to find the exact artifact where …
I get …
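For reference, a small sketch of the "look for empty artifact directories" idea above, assuming the default `~/.julia` depot layout; it only lists candidates and leaves any deletion to you:

```julia
# List empty directories under the first depot's artifacts folder.
artifact_root = joinpath(first(DEPOT_PATH), "artifacts")

if isdir(artifact_root)
    for dir in readdir(artifact_root; join=true)
        if isdir(dir) && isempty(readdir(dir))
            println("Empty artifact directory: ", dir)
        end
    end
end
```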
@GunnarFarneback Thanks a lot, my artifacts are now fixed and this works for me locally.
The current version of ONNXRunTime.jl does not (I believe) run on the Jetson Nano, but ONNXRuntime does and the ONNXRuntime JLL should - that was a primary goal of JuliaPackaging/Yggdrasil#4386 (#12) and #19. All that has been left (for ages, I agree!) is to implement, I believe, a similar approach to the CUDARuntime JLL - using e.g. Preferences.jl to select among the execution providers (and hence the artifacts) at Pkg.add-time. Please feel free to comment e.g. in #19 or #12.

I agree that compatibility with recent CUDA.jl versions should be an important goal for this package (I believe that is what this PR achieves…?). But it would be nice if there was still room to get the Jetson Nano etc. support completed, preferably without having to go for a 2.x version of this package. So yes, I guess my main concern with this is bumping the version to 1.0.

P.S. Inspired by this PR, I raced to the office to get the old Jetson Nano fired up and ready for testing - hoping to find some time soon to pitch in…
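Purely illustrative of the Preferences.jl idea mentioned above (nothing like this exists in ONNXRunTime.jl today, and the preference name `execution_provider` is made up), a provider selection could look roughly like this:

```julia
using Preferences
import ONNXRunTime

# Record the preferred execution provider in LocalPreferences.toml. A JLL or
# the package itself could consult this when deciding which artifacts to
# install or which libraries to load.
set_preferences!(ONNXRunTime, "execution_provider" => "cuda"; force=true)

# Read it back, defaulting to "cpu" when nothing has been set.
provider = load_preference(ONNXRunTime, "execution_provider", "cpu")
@info "Selected execution provider" provider
```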
The main point of this PR is to make the package play well with Julia 1.9+ precompilation and PrecompileTools, which is accomplished by replacing Requires with a package extension. Making it compatible with recent CUDA.jl is more of a bonus, since it was easier to do that than to stay with the old version. There's no harm done in going to major version 2, or more, if it turns out that further breaking changes are needed.
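To illustrate the precompilation angle with a downstream sketch (not code from this PR; `MyModel` and its empty workload are made up):

```julia
module MyModel

using PrecompileTools
using ONNXRunTime

@setup_workload begin
    # One-time setup for the workload could go here.
    @compile_workload begin
        # Code executed here is compiled into MyModel's precompile cache.
        # With ONNXRunTime's CUDA support in a package extension, this can
        # include GPU-related code paths; a Requires.jl hook only runs at
        # runtime and its code is not visible to precompilation.
    end
end

end
```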
Cool! How about this? We merge this PR with …
The no-later-than-December plan sounds great.
Have you considered using https://github.com/cjdoris/PackageExtensionCompat.jl to keep 1.6 compatibility?
CUDA.jl v4 would be compatible with 1.6; CUDA.jl v5 needs Julia 1.8. Personally I would keep it simple and just drop 1.6 compat, but if anyone wants to add it using PackageExtensionCompat I am also fine with that.
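For reference, the wiring for that is small; roughly the following in the main module, per the PackageExtensionCompat README (a sketch, not part of this PR):

```julia
module ONNXRunTime

using PackageExtensionCompat

function __init__()
    # No-op on Julia 1.9+, where the native package extension mechanism is
    # used; on older versions this loads the ext/ code via Requires.jl.
    @require_extensions
end

# ... rest of the package ...

end
```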
(force-pushed from e0ae641 to 0a99758)
I did some git commit rewriting to change the version bump to 0.4.
It's not entirely that simple, since it doesn't help with the …
Co-authored-by: Jan Weidner <[email protected]>
LGTM, thanks a lot, this is a great PR! If there are no further objections, I will merge today.
Remember to register as well. I'd love to get back to a registered version instead of depending on this branch.
This PR adds a package extension for CUDA on Julia 1.9+, with support for CUDA.jl versions 4 and 5.
It drops support for Julia < 1.9 and CUDA.jl < 4. (Previously, CUDA support was implemented through the Requires package.)

The file `test/test_cuda_extension.jl`, not included from `runtests.jl`, tries to verify that the CUDA extension works for a variety of Julia/CUDA.jl/CUDA runtime versions. Possibly it should use a more demanding ONNX file, in case that affects the number of needed libraries.

Fixes #27.
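For readers unfamiliar with the mechanism, here is a rough sketch of how a package extension like this is wired up. The extension name (`CUDAExt`), the cuDNN weak dependency, and the placeholder UUIDs are illustrative; the actual Project.toml and extension module in this PR may differ.

```julia
# Project.toml (sketch; real UUIDs omitted):
#
#   [weakdeps]
#   CUDA  = "<UUID of CUDA.jl>"
#   cuDNN = "<UUID of cuDNN.jl>"
#
#   [extensions]
#   CUDAExt = ["CUDA", "cuDNN"]
#
#   [compat]
#   CUDA = "4, 5"
#
# ext/CUDAExt.jl (sketch):

module CUDAExt

using ONNXRunTime
import CUDA
import cuDNN

# Julia 1.9+ loads this module automatically once ONNXRunTime, CUDA and cuDNN
# are all present in the active environment. CUDA-specific functionality
# (e.g. enabling the CUDA execution provider) lives here instead of behind a
# Requires.jl hook, so it can be precompiled like ordinary package code.

end
```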