Int128 support #974

Open
maleadt opened this issue Jun 14, 2021 · 4 comments
Labels
cuda kernels Stuff about writing CUDA kernels. enhancement New feature or request

Comments

@maleadt
Member

maleadt commented Jun 14, 2021

CUDA GPUs do not natively support Int128 operations. LLVM can lower code that works with Int128 (https://reviews.llvm.org/rGb9fc48da832654a2b486adaa790ceaa6dba94455), but it requires compiler intrinsics for many operations:

julia> using CUDA

julia> x = widen.(CuArray(rand(Int64, 10)))
10-element CuArray{Int128, 1}:
  ...

julia> div.(x, x)
ERROR: LLVM error: Undefined external symbol "__divti3"

With https://reviews.llvm.org/D34708, it should be possible to resolve those intrinsics in the current module, so we can just add them to our runtime library.
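For reference, `__divti3` is the compiler-rt/libgcc routine for signed 128-bit integer division (truncating toward zero, like Julia's `div`). As a rough sketch of what such a runtime function would have to compute, here is a plain-Julia model built only from shifts, comparisons, and subtraction. The names `udiv128_model` and `divti3_model` are hypothetical; this is not code from CUDA.jl, GPUCompiler, or compiler-rt, and it has not been compiled for the GPU.

```julia
# Hypothetical reference model of `__divti3`; illustration only.

# Unsigned 128-bit division by shift-and-subtract (restoring division).
function udiv128_model(n::UInt128, d::UInt128)
    d == 0 && throw(DivideError())
    q = zero(UInt128)   # quotient, built one bit at a time
    r = zero(UInt128)   # running remainder
    for i in 127:-1:0
        # Bring down bit i of the dividend. `r` cannot overflow here: before
        # the shift it is at most the already-consumed prefix of `n`, which
        # has fewer than 128 bits.
        r = (r << 1) | ((n >> i) & one(UInt128))
        if r >= d
            r -= d
            q |= one(UInt128) << i
        end
    end
    return q
end

# Signed division: divide the magnitudes, then restore the sign.
function divti3_model(a::Int128, b::Int128)
    # abs(typemin(Int128)) wraps to typemin, whose bit pattern is 2^127 as UInt128.
    ua = abs(a) % UInt128
    ub = abs(b) % UInt128
    q = udiv128_model(ua, ub) % Int128
    return (a < 0) != (b < 0) ? -q : q
end
```

For example, `divti3_model(Int128(-7), Int128(2))` returns `-3`. Unlike Julia's `div`, this model wraps on `typemin(Int128) ÷ -1` instead of throwing; a real runtime implementation would need to decide how to handle that case.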

@maleadt maleadt added enhancement New feature or request cuda kernels Stuff about writing CUDA kernels. labels Jun 14, 2021
@maleadt
Member Author

maleadt commented Jun 14, 2021

Alternatively, we could build compiler-rt for NVPTX and ship that in CUDA.jl like we do with libdevice.

@maleadt
Member Author

maleadt commented Apr 27, 2024

Another MWE from #793:

julia> using CUDA

julia> A = zeros(3) |> CuArray
3-element CuArray{Float64, 1}:
 0.0
 0.0
 0.0

julia> A .= UInt128(5)

@nsajko
Contributor

nsajko commented Jan 25, 2025

Another one:

CUDA.fill(Int128(7),    1000)  # seems to work
CUDA.fill((Int128(7),), 1000)  # segfaults in LLVM with both Julia v1.11 and Julia v1.12

MRE:

./julia-6cd750ddf7/bin/julia -g2 -e 'using CUDA; CUDA.fill((Int128(7),), 1000)'

Segfault backtrace:

[16353] signal 11 (1): Segmentation fault
in expression starting at none:1
_ZN4llvm12SelectionDAG22FoldConstantArithmeticEjRKNS_5SDLocENS_3EVTENS_8ArrayRefINS_7SDValueEEE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm12SelectionDAG7getNodeEjRKNS_5SDLocENS_3EVTENS_7SDValueENS_11SDNodeFlagsE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZL16getCopyFromPartsRN4llvm12SelectionDAGERKNS_5SDLocEPKNS_7SDValueEjNS_3MVTENS_3EVTEPKNS_5ValueES5_St8optionalIjESD_INS_3ISD8NodeTypeEE.isra.0 at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm16SelectionDAGISel14LowerArgumentsERKNS_8FunctionE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm16SelectionDAGISel20SelectAllBasicBlocksERKNS_8FunctionE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm16SelectionDAGISel20runOnMachineFunctionERNS_15MachineFunctionE.part.0 at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE.part.0 at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
_ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/bin/../lib/julia/libLLVM.so.18.1jl (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/nsajko/.julia/packages/LLVM/b3kFs/lib/18/libLLVM.jl:11531 [inlined]
emit at /home/nsajko/.julia/packages/LLVM/b3kFs/src/targetmachine.jl:118
mcgen at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/mcgen.jl:75 [inlined]
mcgen at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/compilation.jl:127
jfptr_mcgen_16086 at /home/nsajko/.julia/compiled/v1.12/CUDA/oWw5k_MfWR6.so (unknown line)
macro expansion at /home/nsajko/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:253 [inlined]
macro expansion at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:403 [inlined]
macro expansion at /home/nsajko/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:253 [inlined]
#emit_asm#145 at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:400
emit_asm at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:393 [inlined]
#codegen#109 at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:120
codegen at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:82 [inlined]
#compile#108 at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:79
compile at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:74
#compile##0 at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/compilation.jl:250 [inlined]
#JuliaContext#107 at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:34
jfptr_YY.JuliaContextYY.107_14990 at /home/nsajko/.julia/compiled/v1.12/CUDA/oWw5k_MfWR6.so (unknown line)
JuliaContext at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/driver.jl:25
compile at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/compilation.jl:249
actual_compilation at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/execution.jl:237
jfptr_actual_compilation_14810 at /home/nsajko/.julia/compiled/v1.12/CUDA/oWw5k_MfWR6.so (unknown line)
cached_compilation at /home/nsajko/.julia/packages/GPUCompiler/Nxf8r/src/execution.jl:151
macro expansion at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/execution.jl:380 [inlined]
macro expansion at ./lock.jl:376 [inlined]
#cufunction#708 at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/execution.jl:375
cufunction at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/execution.jl:372
unknown function (ip: 0x790eba793a7c) at (unknown file)
macro expansion at /home/nsajko/.julia/packages/CUDA/1kIOw/src/compiler/execution.jl:112 [inlined]
#_#3 at /home/nsajko/.julia/packages/CUDA/1kIOw/src/CUDAKernels.jl:103
jl_apply at /cache/build/tester-demeter6-14/julialang/julia-master/src/julia.h:2245 [inlined]
do_apply at /cache/build/tester-demeter6-14/julialang/julia-master/src/builtins.c:839
Kernel at /home/nsajko/.julia/packages/CUDA/1kIOw/src/CUDAKernels.jl:89
fill! at /home/nsajko/.julia/packages/GPUArrays/Mot2g/src/host/construction.jl:22
fill at /home/nsajko/.julia/packages/CUDA/1kIOw/src/array.jl:777
unknown function (ip: 0x790eba78d9b9) at (unknown file)
jl_apply at /cache/build/tester-demeter6-14/julialang/julia-master/src/julia.h:2245 [inlined]
do_call at /cache/build/tester-demeter6-14/julialang/julia-master/src/interpreter.c:125
eval_value at /cache/build/tester-demeter6-14/julialang/julia-master/src/interpreter.c:243
eval_stmt_value at /cache/build/tester-demeter6-14/julialang/julia-master/src/interpreter.c:194 [inlined]
eval_body at /cache/build/tester-demeter6-14/julialang/julia-master/src/interpreter.c:684
jl_interpret_toplevel_thunk at /cache/build/tester-demeter6-14/julialang/julia-master/src/interpreter.c:889
jl_toplevel_eval_flex at /cache/build/tester-demeter6-14/julialang/julia-master/src/toplevel.c:1102
jl_toplevel_eval_flex at /cache/build/tester-demeter6-14/julialang/julia-master/src/toplevel.c:1042
jl_toplevel_eval_flex at /cache/build/tester-demeter6-14/julialang/julia-master/src/toplevel.c:1042
ijl_toplevel_eval at /cache/build/tester-demeter6-14/julialang/julia-master/src/toplevel.c:1114
ijl_toplevel_eval_in at /cache/build/tester-demeter6-14/julialang/julia-master/src/toplevel.c:1159
eval at ./boot.jl:486
exec_options at ./client.jl:295
_start at ./client.jl:557
jfptr__start_57584.1 at /home/nsajko/tmp/jl/jl/julia-6cd750ddf7/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/tester-demeter6-14/julialang/julia-master/src/julia.h:2245 [inlined]
true_main at /cache/build/tester-demeter6-14/julialang/julia-master/src/jlapi.c:924
jl_repl_entrypoint at /cache/build/tester-demeter6-14/julialang/julia-master/src/jlapi.c:1084
main at /cache/build/tester-demeter6-14/julialang/julia-master/cli/loader_exe.c:58
unknown function (ip: 0x790ec19f5e07) at /usr/lib/libc.so.6
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8) at /workspace/srcdir/glibc-2.17/csu/../sysdeps/x86_64/start.S
Allocations: 22607865 (Pool: 22607426; Big: 439); GC: 27
Segmentation fault (core dumped)

@maleadt
Member Author

maleadt commented Jan 27, 2025

This is probably a selection failure; would be good to run with LLVM assertions enabled.
