Support for Jetson Nano and TX2 NX (CUDA 10.2) #2147

Open
stemann opened this issue Nov 3, 2023 · 8 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)


stemann commented Nov 3, 2023

WIP:

Miscellaneous notes
Non-bug report regarding running tests on an unreasonable configuration:

This report is not completely fair: it concerns v4.0.1, the last CUDA.jl version that currently supports CUDA 10.2, running on aarch64-linux-gnu (a Jetson Nano) with Julia v1.10.0-beta3.

Describe the bug

Testing CUDA.jl v4.0.1 on aarch64-linux-gnu fails with the following stack trace:

ERROR: LoadError: AssertionError: llvmtype(decl) == llvmtype(entry)
Stacktrace:
  [1] emit_function!(mod::LLVM.Module, job::GPUCompiler.CompilerJob, f::Type, method::GPUCompiler.Runtime.RuntimeMethodInstance; ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:91
  [2] build_runtime(job::GPUCompiler.CompilerJob; ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:113
  [3] build_runtime
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:98 [inlined]
  [4] (::GPUCompiler.var"#95#98"{LLVM.ThreadSafeContext, GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}}})()
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:167
  [5] lock(f::GPUCompiler.var"#95#98"{LLVM.ThreadSafeContext, GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}}}, l::ReentrantLock)
    @ Base ./lock.jl:229
  [6] macro expansion
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:127 [inlined]
  [7] load_runtime(job::GPUCompiler.CompilerJob; ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:83
  [8] load_runtime
    @ CUDA ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:77 [inlined]
  [9] (::CUDA.var"#123#125"{Vector{VersionNumber}, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}})(ctx::LLVM.ThreadSafeContext)
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:21
 [10] LLVM.ThreadSafeContext(f::CUDA.var"#123#125"{Vector{VersionNumber}, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}})
    @ LLVM ~/.julia/packages/LLVM/HykgZ/src/executionengine/ts_module.jl:14
 [11] JuliaContext
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:74 [inlined]
 [12] precompile_runtime
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:15 [inlined]
 [13] precompile_runtime()
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:13
 [14] top-level scope
    @ ~/.julia/packages/CUDA/ZdCxS/test/setup.jl:32

GPUCompiler is at version 0.17.3.

GPUCompiler tests
Testing GPUCompiler fails on the same assertion.

Testing GPUCompiler (at version 0.25.0) fails in a different way (see first comment).

To reproduce

The Minimal Working Example (MWE) for this bug:

using Pkg
Pkg.add(; name="CUDA", version=v"4.0.1")
Pkg.test("CUDA")
Manifest.toml

# This file is machine-generated - editing it directly is not advised

julia_version = "1.10.0-beta3"
manifest_format = "2.0"
project_hash = "3509c5bf235fb7c0326b865a545a502f318a7ac8"

[[deps.AbstractFFTs]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "d92ad398961a3ed262d8bf04a1a2b8340f915fef"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.5.0"

    [deps.AbstractFFTs.extensions]
    AbstractFFTsChainRulesCoreExt = "ChainRulesCore"
    AbstractFFTsTestExt = "Test"

    [deps.AbstractFFTs.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[deps.Adapt]]
deps = ["LinearAlgebra", "Requires"]
git-tree-sha1 = "02f731463748db57cc2ebfbd9fbc9ce8280d3433"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "3.7.1"

    [deps.Adapt.extensions]
    AdaptStaticArraysExt = "StaticArrays"

    [deps.Adapt.weakdeps]
    StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"

[[deps.ArgTools]]
uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
version = "1.1.1"

[[deps.Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"

[[deps.BFloat16s]]
deps = ["LinearAlgebra", "Printf", "Random", "Test"]
git-tree-sha1 = "dbf84058d0a8cbbadee18d25cf606934b22d7c66"
uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b"
version = "0.4.2"

[[deps.Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[deps.CEnum]]
git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90"
uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
version = "0.4.2"

[[deps.CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Preferences", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions"]
git-tree-sha1 = "edff14c60784c8f7191a62a23b15a421185bc8a8"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "4.0.1"

[[deps.CUDA_Driver_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"]
git-tree-sha1 = "75d7896d1ec079ef10d3aee8f3668c11354c03a1"
uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc"
version = "0.2.0+0"

[[deps.CUDA_Runtime_Discovery]]
deps = ["Libdl"]
git-tree-sha1 = "d6b227a1cfa63ae89cb969157c6789e36b7c9624"
uuid = "1af6417a-86b4-443c-805f-a4643ffb695f"
version = "0.1.2"

[[deps.CUDA_Runtime_jll]]
deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "ed00f777d2454c45f5f49634ed0a589da07ee0b0"
uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
version = "0.2.4+1"

[[deps.CompilerSupportLibraries_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
version = "1.0.5+1"

[[deps.Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[deps.DocStringExtensions]]
deps = ["LibGit2"]
git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d"
uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
version = "0.9.3"

[[deps.Downloads]]
deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"]
uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
version = "1.6.0"

[[deps.ExprTools]]
git-tree-sha1 = "27415f162e6028e81c72b82ef756bf321213b6ec"
uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
version = "0.1.10"

[[deps.FileWatching]]
uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"

[[deps.GPUArrays]]
deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"]
git-tree-sha1 = "2e57b4a4f9cc15e85a24d603256fe08e527f48d1"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "8.8.1"

[[deps.GPUArraysCore]]
deps = ["Adapt"]
git-tree-sha1 = "2d6ca471a6c7b536127afccfa7564b5b39227fe0"
uuid = "46192b85-c4d5-4398-a991-12ede77f4527"
version = "0.1.5"

[[deps.GPUCompiler]]
deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "19d693666a304e8c371798f4900f7435558c7cde"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.17.3"

[[deps.InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"

[[deps.IrrationalConstants]]
git-tree-sha1 = "630b497eafcc20001bba38a4651b327dcfc491d2"
uuid = "92d709cd-6900-40b7-9082-c6be49f344b6"
version = "0.2.2"

[[deps.JLLWrappers]]
deps = ["Artifacts", "Preferences"]
git-tree-sha1 = "7e5d6779a1e09a36db2a7b6cff50942a0a7d0fca"
uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
version = "1.5.0"

[[deps.LLVM]]
deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"]
git-tree-sha1 = "f044a2796a9e18e0531b9b3072b0019a61f264bc"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "4.17.1"

[[deps.LLVMExtra_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "070e4b5b65827f82c16ae0916376cb47377aa1b5"
uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab"
version = "0.0.18+0"

[[deps.LazyArtifacts]]
deps = ["Artifacts", "Pkg"]
uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"

[[deps.LibCURL]]
deps = ["LibCURL_jll", "MozillaCACerts_jll"]
uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
version = "0.6.4"

[[deps.LibCURL_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
version = "8.0.1+1"

[[deps.LibGit2]]
deps = ["Base64", "LibGit2_jll", "NetworkOptions", "Printf", "SHA"]
uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"

[[deps.LibGit2_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll"]
uuid = "e37daf67-58a4-590a-8e99-b0245dd2ffc5"
version = "1.6.4+0"

[[deps.LibSSH2_jll]]
deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
version = "1.11.0+1"

[[deps.Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[deps.LinearAlgebra]]
deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[[deps.LogExpFunctions]]
deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"]
git-tree-sha1 = "7d6dd4e9212aebaeed356de34ccf262a3cd415aa"
uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
version = "0.3.26"

    [deps.LogExpFunctions.extensions]
    LogExpFunctionsChainRulesCoreExt = "ChainRulesCore"
    LogExpFunctionsChangesOfVariablesExt = "ChangesOfVariables"
    LogExpFunctionsInverseFunctionsExt = "InverseFunctions"

    [deps.LogExpFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0"
    InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112"

[[deps.Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"

[[deps.Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

[[deps.MbedTLS_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
version = "2.28.2+1"

[[deps.MozillaCACerts_jll]]
uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
version = "2023.1.10"

[[deps.NetworkOptions]]
uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
version = "1.2.0"

[[deps.OpenBLAS_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"]
uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
version = "0.3.23+2"

[[deps.OpenLibm_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "05823500-19ac-5b8b-9628-191a04bc5112"
version = "0.8.1+2"

[[deps.OpenSpecFun_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1"
uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
version = "0.5.5+0"

[[deps.Pkg]]
deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"]
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
version = "1.10.0"

[[deps.Preferences]]
deps = ["TOML"]
git-tree-sha1 = "00805cd429dcb4870060ff49ef443486c262e38e"
uuid = "21216c6a-2e73-6563-6e65-726566657250"
version = "1.4.1"

[[deps.Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"

[[deps.REPL]]
deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"

[[deps.Random]]
deps = ["SHA"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[[deps.Random123]]
deps = ["Random", "RandomNumbers"]
git-tree-sha1 = "552f30e847641591ba3f39fd1bed559b9deb0ef3"
uuid = "74087812-796a-5b5d-8853-05524746bad3"
version = "1.6.1"

[[deps.RandomNumbers]]
deps = ["Random", "Requires"]
git-tree-sha1 = "043da614cc7e95c703498a491e2c21f58a2b8111"
uuid = "e6cf234a-135c-5ec9-84dd-332b85af5143"
version = "1.5.3"

[[deps.Reexport]]
git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
uuid = "189a3867-3050-52da-a836-e630ba90ab69"
version = "1.2.2"

[[deps.Requires]]
deps = ["UUIDs"]
git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7"
uuid = "ae029012-a4dd-5104-9daa-d747884805df"
version = "1.3.0"

[[deps.SHA]]
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.7.0"

[[deps.Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[[deps.Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"

[[deps.SparseArrays]]
deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"]
uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
version = "1.10.0"

[[deps.SpecialFunctions]]
deps = ["IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"]
git-tree-sha1 = "e2cfc4012a19088254b3950b85c3c1d8882d864d"
uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
version = "2.3.1"

    [deps.SpecialFunctions.extensions]
    SpecialFunctionsChainRulesCoreExt = "ChainRulesCore"

    [deps.SpecialFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"

[[deps.Statistics]]
deps = ["LinearAlgebra", "SparseArrays"]
uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
version = "1.10.0"

[[deps.SuiteSparse_jll]]
deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"]
uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c"
version = "7.2.0+1"

[[deps.TOML]]
deps = ["Dates"]
uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
version = "1.0.3"

[[deps.Tar]]
deps = ["ArgTools", "SHA"]
uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
version = "1.10.0"

[[deps.Test]]
deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[deps.TimerOutputs]]
deps = ["ExprTools", "Printf"]
git-tree-sha1 = "f548a9e9c490030e545f72074a41edfd0e5bcdd7"
uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
version = "0.5.23"

[[deps.UUIDs]]
deps = ["Random", "SHA"]
uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[[deps.Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

[[deps.Zlib_jll]]
deps = ["Libdl"]
uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
version = "1.2.13+1"

[[deps.libblastrampoline_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
version = "5.8.0+1"

[[deps.nghttp2_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
version = "1.52.0+1"

[[deps.p7zip_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
version = "17.4.0+2"

Expected behavior

Tests pass.

Version info

Details on Julia:

julia --threads=auto --eval 'using InteractiveUtils; versioninfo()'
Julia Version 1.10.0-beta3
Commit 404750f8586 (2023-10-03 12:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (aarch64-linux-gnu)
  CPU: 4 × ARMv8 Processor rev 1 (v8l)
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cortex-a57)
  Threads: 5 on 4 virtual cores

Details on CUDA:

CUDA runtime 10.2, artifact installation
CUDA driver 10.2
Unknown NVIDIA driver

Libraries: 
- CUBLAS: 10.2.2
- CURAND: 10.1.2
- CUFFT: 10.1.2
- CUSOLVER: 10.3.0
- CUSPARSE: 10.3.1
- CUPTI: 12.0.0
- NVML: missing

Toolchain:
- Julia: 1.10.0-beta3
- LLVM: 15.0.7
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5
- Device capability support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75

1 device:
  0: NVIDIA Tegra X1 (sm_53, 42.234 MiB / 1.933 GiB available)


stemann added the bug (Something isn't working) label on Nov 3, 2023

stemann (Author) commented Nov 3, 2023

Testing GPUCompiler (at version 0.25.0) fails in a different way:

[deleted]

maleadt (Member) commented Nov 3, 2023

The GPUCompiler failure is a test timeout, and doesn't point to an actual issue.

> GPUCompiler is at version 0.17.3.

That's your problem. You're using a beta release of Julia, so you should expect to have to use the latest versions of GPUCompiler.jl and friends for compatibility.
It's otherwise unrelated to CUDA.
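
To see which versions the resolver actually picked, the manifest can be inspected with the TOML standard library. This is only a minimal sketch: `resolved_version` is a hypothetical helper (not part of Pkg), and the excerpt is trimmed from the Manifest.toml in the report.

```julia
using TOML

# Trimmed excerpt of the Manifest.toml from the report (manifest format 2.0).
const MANIFEST_EXCERPT = """
julia_version = "1.10.0-beta3"
manifest_format = "2.0"

[[deps.CUDA]]
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "4.0.1"

[[deps.GPUCompiler]]
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.17.3"

[[deps.LLVM]]
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "4.17.1"
"""

# Hypothetical helper: return the resolved version of package `name`,
# or `nothing` if the package is absent or carries no version (e.g. a stdlib).
function resolved_version(manifest::AbstractDict, name::AbstractString)
    entries = get(get(manifest, "deps", Dict()), name, nothing)
    entries === nothing && return nothing
    entry = first(entries)            # each package is an array of tables
    haskey(entry, "version") || return nothing
    return VersionNumber(entry["version"])
end

manifest = TOML.parse(MANIFEST_EXCERPT)
for name in ("CUDA", "GPUCompiler", "LLVM")
    println(rpad(name, 12), resolved_version(manifest, name))
end
```

For a real project, `TOML.parsefile("Manifest.toml")` can replace the inline excerpt.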

stemann (Author) commented Nov 3, 2023

Right - I was mostly filing this issue to record the current status, in case it is relevant to bringing CUDA 10.2 support to CUDA.jl v5.

stemann (Author) commented Nov 3, 2023

On a side note - running with the local toolkit fails to find compute-sanitizer (because it's not there):

ihp@jetson-nano:~$ cat /tmp/tmp.ibNRS3tjPV/Project.toml 
[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extras]
CUDA_Runtime_jll = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
ihp@jetson-nano:~$ cat /tmp/tmp.ibNRS3tjPV/LocalPreferences.toml 
[CUDA_Runtime_jll]
version = "local"

ihp@jetson-nano:~$ JULIA_DEBUG=CUDA_Runtime_Discovery julia --threads=auto --project=/tmp/tmp.ibNRS3tjPV
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0-beta3 (2023-10-03)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using CUDA
┌ Debug: Looking for binary ptxas in no specific location
│   all_locations = String[]
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Did not find ptxas
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:170
┌ Debug: Looking for library cudart, no specific version, in no specific location
│   all_names =
│    1-element Vector{String}:
│     "libcudart.so"
│   all_locations = String[]
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:128
┌ Debug: Found libcudart.so at /usr/local/cuda-10.2/targets/aarch64-linux/lib
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:137
┌ Debug: Looking for CUDA toolkit via CUDA runtime library
│   path = "/usr/local/cuda-10.2/targets/aarch64-linux/lib/libcudart.so"
│   dir = "/usr/local/cuda-10.2/targets/aarch64-linux"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:311
┌ Debug: Looking for CUDA toolkit via default installation directories
│   dirs =
│    1-element Vector{String}:
│     "/usr/local/cuda-10.2"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:338
┌ Debug: Found CUDA toolkit at /usr/local/cuda-10.2/targets/aarch64-linux, /usr/local/cuda-10.2
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:344
┌ Debug: Looking for binary ptxas in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/ptxas at /usr/local/cuda-10.2/bin/ptxas
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary nvdisasm in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/nvdisasm at /usr/local/cuda-10.2/bin/nvdisasm
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary nvlink in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/nvlink at /usr/local/cuda-10.2/bin/nvlink
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary compute-sanitizer in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2 or /usr/local/cuda-10.2/extras/compute-sanitizer
│   all_locations =
│    6-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
│     "/usr/local/cuda-10.2/extras/compute-sanitizer"
│     "/usr/local/cuda-10.2/extras/compute-sanitizer/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Did not find compute-sanitizer
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:170
┌ Debug: Could not discover CUDA toolkit
│   exception =
│    Could not find binary 'compute-sanitizer' in your local CUDA installation.
...
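
For reference, the LocalPreferences.toml shown above can also be generated programmatically. A minimal sketch using only the TOML standard library; the temporary directory is an assumption that stands in for the actual project directory:

```julia
using TOML

# Preference that selects the locally installed CUDA toolkit,
# mirroring the LocalPreferences.toml from the session above.
prefs = Dict("CUDA_Runtime_jll" => Dict("version" => "local"))

# Assumption: a throwaway directory stands in for the real project directory.
project_dir = mktempdir()
prefs_path = joinpath(project_dir, "LocalPreferences.toml")

open(prefs_path, "w") do io
    TOML.print(io, prefs)
end

# Read the file back to confirm what was written.
print(read(prefs_path, String))
```

As in the Project.toml above, the package the preference applies to also has to be listed under `[deps]` or `[extras]` for the preference to be picked up.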

maleadt changed the title from "Tests on CUDA 10.2 on aarch64-linux-gnu fails on assertion in GPUCompiler emit_function!" to "Bring back CUDA 10.2 support for Jetson Nano" on Nov 3, 2023

maleadt (Member) commented Nov 3, 2023

IIRC compute-sanitizer was only added in 11.0, it used to be memory-sanitizer. I guess we can make it optional again.

stemann (Author) commented Nov 3, 2023

> IIRC compute-sanitizer was only added in 11.0, it used to be memory-sanitizer. I guess we can make it optional again.

Right - there is no cuda-sanitizer package to be found via APT, and no such file is mentioned in the build log for aarch64-linux-gnu: https://buildkite.com/julialang/yggdrasil/builds/6373#018b9530-dfb8-4278-85e1-fe37fd85c0ec

stemann changed the title from "Bring back CUDA 10.2 support for Jetson Nano" to "Bring back CUDA 10.2 support for Jetson Nano and TX2 NX" on Nov 3, 2023

maleadt (Member) commented Nov 3, 2023

Sorry, the old tool was called cuda-memcheck. But I would just leave it out and make compute-sanitizer optional in CUDA_Discovery_jll (as well as the uses in CUDA.jl).
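
A rough sketch of what "optional" could look like. This is not the actual discovery code: `find_tool` and `discover_tools` are hypothetical names, the tool lists are taken from the debug log above, and the fake toolkit directory exists only for the demonstration.

```julia
# Hypothetical helper: look for `name` in each directory and in its `bin`
# subdirectory, returning the first match or `nothing`.
function find_tool(name::AbstractString, dirs::Vector{String})
    for dir in dirs, candidate in (joinpath(dir, name), joinpath(dir, "bin", name))
        isfile(candidate) && return candidate
    end
    return nothing
end

# Required tools raise an error when missing; optional tools map to `nothing`.
function discover_tools(dirs::Vector{String};
                        required = ["ptxas", "nvdisasm", "nvlink"],
                        optional = ["compute-sanitizer"])
    tools = Dict{String,Union{String,Nothing}}()
    for name in required
        path = find_tool(name, dirs)
        path === nothing &&
            error("Could not find binary '$name' in your local CUDA installation.")
        tools[name] = path
    end
    for name in optional
        tools[name] = find_tool(name, dirs)
    end
    return tools
end

# Demonstration with a fake toolkit layout that lacks compute-sanitizer,
# similar to the CUDA 10.2 installation in the log.
toolkit = mktempdir()
mkpath(joinpath(toolkit, "bin"))
for name in ("ptxas", "nvdisasm", "nvlink")
    touch(joinpath(toolkit, "bin", name))
end

tools = discover_tools([toolkit])
println("compute-sanitizer found: ", tools["compute-sanitizer"] !== nothing)
```

Callers would then need to check for `nothing` before using the sanitizer path, which corresponds to "the uses in CUDA.jl" mentioned above.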

maleadt added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels and removed the bug (Something isn't working) label on Jan 9, 2024

maleadt (Member) commented Jan 9, 2024

CUDA.jl now uses [email protected], which includes support for CUDA 10.2 again. I'll leave the updates to CUDA.jl itself to somebody with such hardware, though.

maleadt changed the title from "Bring back CUDA 10.2 support for Jetson Nano and TX2 NX" to "Support for Jetson Nano and TX2 NX (CUDA 10.2)" on Jan 9, 2024