NVPTX i128 support broken on LLVM 11 / Julia 1.6 #793
Underlying LLVM assertion:
Module for which this occurs:
; ModuleID = 'text'
source_filename = "text"
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
%printf_args.0 = type { i64 }
@exception7 = private unnamed_addr addrspace(1) constant [12 x i8] c"BoundsError\00", align 1
@exception_flag = weak local_unnamed_addr addrspace(1) externally_initialized global i64 0
@0 = private unnamed_addr addrspace(1) constant [108 x i8] c"ERROR: a %s was thrown during kernel execution.\0A Run Julia on debug level 2 for device stack traces.\0A\00", align 1
@1 = private unnamed_addr addrspace(1) constant [110 x i8] c"WARNING: could not signal exception status to the host, execution will continue.\0A Please file a bug.\0A\00", align 1
; Function Attrs: nounwind readnone speculatable willreturn
declare i128 @llvm.ctlz.i128(i128 %0, i1 immarg %1) #0
; Function Attrs: nounwind readnone speculatable willreturn
declare i128 @llvm.cttz.i128(i128 %0, i1 immarg %1) #0
; Function Attrs: nounwind readnone
declare i32 @llvm.nvvm.read.ptx.sreg.tid.x() #1
; Function Attrs: nounwind readnone
declare i32 @llvm.nvvm.read.ptx.sreg.ctaid.x() #1
; Function Attrs: nounwind readnone
declare i32 @llvm.nvvm.read.ptx.sreg.ntid.x() #1
; Function Attrs: nounwind readnone
declare i32 @llvm.nvvm.read.ptx.sreg.nctaid.x() #1
define ptx_kernel void @_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_({ [1 x i64], i8 addrspace(1)* } %0, { [1 x i128], [1 x [1 x i64]] } %1, i64 signext %2) local_unnamed_addr {
entry:
%.fca.0.0.extract6 = extractvalue { [1 x i64], i8 addrspace(1)* } %0, 0, 0
%.fca.0.0.extract = extractvalue { [1 x i128], [1 x [1 x i64]] } %1, 0, 0
%.inv = icmp sgt i64 %2, 0, !dbg !42
%3 = select i1 %.inv, i64 %2, i64 0, !dbg !42
br i1 %.inv, label %L12.i.preheader, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, !dbg !50
L12.i.preheader: ; preds = %entry
%.fca.1.extract = extractvalue { [1 x i64], i8 addrspace(1)* } %0, 1
%4 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x(), !dbg !52, !range !75
%narrow = add nuw nsw i32 %4, 1, !dbg !76
%5 = zext i32 %narrow to i64, !dbg !76
%6 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x(), !dbg !79, !range !88
%7 = zext i32 %6 to i64, !dbg !89
%8 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x(), !dbg !94, !range !103
%9 = zext i32 %8 to i64, !dbg !104
%10 = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.x(), !dbg !106, !range !117
%11 = zext i32 %10 to i64, !dbg !118
%12 = icmp sgt i64 %.fca.0.0.extract6, 0, !dbg !120
%13 = select i1 %12, i64 %.fca.0.0.extract6, i64 0, !dbg !120
%.not20 = icmp eq i128 %.fca.0.0.extract, 0, !dbg !137
%14 = call i128 @llvm.ctlz.i128(i128 %.fca.0.0.extract, i1 true), !dbg !156, !range !159
%15 = trunc i128 %14 to i64, !dbg !160
%16 = trunc i128 %.fca.0.0.extract to i64, !dbg !163
%17 = add nsw i64 %15, -75, !dbg !165
%18 = shl i64 %16, %17, !dbg !167
%19 = icmp ugt i64 %17, 63, !dbg !167
%.op = and i64 %18, 4503599627370495, !dbg !170
%20 = select i1 %19, i64 0, i64 %.op, !dbg !170
%21 = sub nsw i64 74, %15, !dbg !172
%22 = zext i64 %21 to i128, !dbg !174
%23 = lshr i128 %.fca.0.0.extract, %22, !dbg !174
%24 = icmp ugt i64 %21, 127, !dbg !174
%25 = trunc i128 %23 to i64, !dbg !177
%.op22 = and i64 %25, 9007199254740991, !dbg !178
%.op22.op = add nuw nsw i64 %.op22, 1, !dbg !179
%.op22.op.op = lshr i64 %.op22.op, 1, !dbg !182
%26 = select i1 %24, i64 0, i64 %.op22.op.op, !dbg !182
%27 = call i128 @llvm.cttz.i128(i128 %.fca.0.0.extract, i1 true), !dbg !184, !range !159
%28 = trunc i128 %27 to i64, !dbg !187
%29 = icmp eq i64 %21, %28, !dbg !189
%30 = zext i1 %29 to i64, !dbg !190
%31 = xor i64 %30, -1, !dbg !194
%32 = and i64 %26, %31, !dbg !196
%33 = shl nuw nsw i64 %15, 52, !dbg !197
%34 = sub nuw nsw i64 5179139571476070400, %33, !dbg !197
br i1 %.not20, label %L12.i.us.preheader, label %L12.i.preheader.L12.i.preheader.split_crit_edge, !dbg !200
L12.i.us.preheader: ; preds = %L12.i.preheader
%35 = mul i64 %9, %7, !dbg !200
%36 = sub i64 %5, 1, !dbg !200
%37 = add i64 %35, %36, !dbg !200
%38 = shl nuw nsw i64 %37, 3, !dbg !200
%scevgep = getelementptr i8, i8 addrspace(1)* %.fca.1.extract, i64 %38, !dbg !200
%39 = mul i64 %11, %9, !dbg !200
%40 = shl i64 %39, 3, !dbg !200
%41 = add i64 %37, 1, !dbg !200
br label %L12.i.us, !dbg !200
L12.i.preheader.L12.i.preheader.split_crit_edge: ; preds = %L12.i.preheader
%42 = icmp ult i64 %15, 75, !dbg !201
br i1 %42, label %L12.i.us23.preheader, label %L12.i.preheader2, !dbg !200
L12.i.preheader2: ; preds = %L12.i.preheader.L12.i.preheader.split_crit_edge
%43 = or i64 %34, %20, !dbg !204
%44 = mul i64 %9, %7, !dbg !200
%45 = sub i64 %5, 1, !dbg !200
%46 = add i64 %44, %45, !dbg !200
%47 = shl nuw nsw i64 %46, 3, !dbg !200
%scevgep26 = getelementptr i8, i8 addrspace(1)* %.fca.1.extract, i64 %47, !dbg !200
%48 = mul i64 %11, %9, !dbg !200
%49 = shl i64 %48, 3, !dbg !200
%50 = add i64 %46, 1, !dbg !200
br label %L12.i, !dbg !200
L12.i.us23.preheader: ; preds = %L12.i.preheader.L12.i.preheader.split_crit_edge
%51 = add nuw i64 %34, %32, !dbg !204
%52 = mul i64 %9, %7, !dbg !200
%53 = sub i64 %5, 1, !dbg !200
%54 = add i64 %52, %53, !dbg !200
%55 = shl nuw nsw i64 %54, 3, !dbg !200
%scevgep17 = getelementptr i8, i8 addrspace(1)* %.fca.1.extract, i64 %55, !dbg !200
%56 = mul i64 %11, %9, !dbg !200
%57 = shl i64 %56, 3, !dbg !200
%58 = add i64 %54, 1, !dbg !200
br label %L12.i.us23, !dbg !200
L12.i.us: ; preds = %L12.i.us.preheader, %L182.i.us
%lsr.iv13 = phi i64 [ %41, %L12.i.us.preheader ], [ %lsr.iv.next14, %L182.i.us ]
%lsr.iv9 = phi i8 addrspace(1)* [ %scevgep, %L12.i.us.preheader ], [ %64, %L182.i.us ]
%lsr.iv = phi i64 [ %3, %L12.i.us.preheader ], [ %lsr.iv.next, %L182.i.us ]
%.not.us = icmp slt i64 %.fca.0.0.extract6, %lsr.iv13, !dbg !206
br i1 %.not.us, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L78.i.us, !dbg !200
L78.i.us: ; preds = %L12.i.us
%59 = icmp slt i64 %lsr.iv13, 1, !dbg !211
%60 = icmp sgt i64 %lsr.iv13, %13, !dbg !231
%61 = or i1 %59, %60, !dbg !213
br i1 %61, label %L89.i.us, label %L182.i.us, !dbg !213
L89.i.us: ; preds = %L78.i.us
call fastcc void @gpu_report_exception(i64 ptrtoint ([12 x i8]* addrspacecast ([12 x i8] addrspace(1)* @exception7 to [12 x i8]*) to i64)), !dbg !213
call fastcc void @gpu_signal_exception(), !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
br label %L182.i.us
L182.i.us: ; preds = %L89.i.us, %L78.i.us
%62 = bitcast i8 addrspace(1)* %lsr.iv9 to i1 addrspace(1)*
%63 = bitcast i8 addrspace(1)* %lsr.iv9 to double addrspace(1)*
store double 0.000000e+00, double addrspace(1)* %63, align 8, !dbg !232, !tbaa !242
%lsr.iv.next = add nsw i64 %lsr.iv, -1, !dbg !245
%scevgep11 = getelementptr i1, i1 addrspace(1)* %62, i64 %40, !dbg !245
%64 = bitcast i1 addrspace(1)* %scevgep11 to i8 addrspace(1)*, !dbg !245
%lsr.iv.next14 = add i64 %lsr.iv13, %39, !dbg !245
%.not21.not.us = icmp eq i64 %lsr.iv.next, 0, !dbg !245
br i1 %.not21.not.us, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L12.i.us, !dbg !155
L12.i.us23: ; preds = %L12.i.us23.preheader, %L182.i.us34
%lsr.iv22 = phi i64 [ %58, %L12.i.us23.preheader ], [ %lsr.iv.next23, %L182.i.us34 ]
%lsr.iv18 = phi i8 addrspace(1)* [ %scevgep17, %L12.i.us23.preheader ], [ %70, %L182.i.us34 ]
%lsr.iv15 = phi i64 [ %3, %L12.i.us23.preheader ], [ %lsr.iv.next16, %L182.i.us34 ]
%.not.us25 = icmp slt i64 %.fca.0.0.extract6, %lsr.iv22, !dbg !206
br i1 %.not.us25, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L78.i.us26, !dbg !200
L78.i.us26: ; preds = %L12.i.us23
%65 = icmp slt i64 %lsr.iv22, 1, !dbg !211
%66 = icmp sgt i64 %lsr.iv22, %13, !dbg !231
%67 = or i1 %65, %66, !dbg !213
br i1 %67, label %L89.i.us27, label %L182.i.us34, !dbg !213
L89.i.us27: ; preds = %L78.i.us26
call fastcc void @gpu_report_exception(i64 ptrtoint ([12 x i8]* addrspacecast ([12 x i8] addrspace(1)* @exception7 to [12 x i8]*) to i64)), !dbg !213
call fastcc void @gpu_signal_exception(), !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
br label %L182.i.us34
L182.i.us34: ; preds = %L89.i.us27, %L78.i.us26
%68 = bitcast i8 addrspace(1)* %lsr.iv18 to i1 addrspace(1)*
%69 = bitcast i8 addrspace(1)* %lsr.iv18 to i64 addrspace(1)*
store i64 %51, i64 addrspace(1)* %69, align 8, !dbg !232, !tbaa !242
%lsr.iv.next16 = add nsw i64 %lsr.iv15, -1, !dbg !245
%scevgep20 = getelementptr i1, i1 addrspace(1)* %68, i64 %57, !dbg !245
%70 = bitcast i1 addrspace(1)* %scevgep20 to i8 addrspace(1)*, !dbg !245
%lsr.iv.next23 = add i64 %lsr.iv22, %56, !dbg !245
%.not21.not.us36 = icmp eq i64 %lsr.iv.next16, 0, !dbg !245
br i1 %.not21.not.us36, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L12.i.us23, !dbg !155
L12.i: ; preds = %L12.i.preheader2, %L182.i
%lsr.iv31 = phi i64 [ %50, %L12.i.preheader2 ], [ %lsr.iv.next32, %L182.i ]
%lsr.iv27 = phi i8 addrspace(1)* [ %scevgep26, %L12.i.preheader2 ], [ %76, %L182.i ]
%lsr.iv24 = phi i64 [ %3, %L12.i.preheader2 ], [ %lsr.iv.next25, %L182.i ]
%.not = icmp slt i64 %.fca.0.0.extract6, %lsr.iv31, !dbg !206
br i1 %.not, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L78.i, !dbg !200
L78.i: ; preds = %L12.i
%71 = icmp slt i64 %lsr.iv31, 1, !dbg !211
%72 = icmp sgt i64 %lsr.iv31, %13, !dbg !231
%73 = or i1 %71, %72, !dbg !213
br i1 %73, label %L89.i, label %L182.i, !dbg !213
L89.i: ; preds = %L78.i
call fastcc void @gpu_report_exception(i64 ptrtoint ([12 x i8]* addrspacecast ([12 x i8] addrspace(1)* @exception7 to [12 x i8]*) to i64)), !dbg !213
call fastcc void @gpu_signal_exception(), !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
call void asm sideeffect "trap;", ""() #3, !dbg !213
br label %L182.i
L182.i: ; preds = %L89.i, %L78.i
%74 = bitcast i8 addrspace(1)* %lsr.iv27 to i1 addrspace(1)*
%75 = bitcast i8 addrspace(1)* %lsr.iv27 to i64 addrspace(1)*
store i64 %43, i64 addrspace(1)* %75, align 8, !dbg !232, !tbaa !242
%lsr.iv.next25 = add nsw i64 %lsr.iv24, -1, !dbg !245
%scevgep29 = getelementptr i1, i1 addrspace(1)* %74, i64 %49, !dbg !245
%76 = bitcast i1 addrspace(1)* %scevgep29 to i8 addrspace(1)*, !dbg !245
%lsr.iv.next32 = add i64 %lsr.iv31, %48, !dbg !245
%.not21.not = icmp eq i64 %lsr.iv.next25, 0, !dbg !245
br i1 %.not21.not, label %_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit, label %L12.i, !dbg !155
_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_.inner.exit: ; preds = %L182.i, %L12.i, %L182.i.us34, %L12.i.us23, %L182.i.us, %L12.i.us, %entry
ret void
}
define internal fastcc void @gpu_report_exception(i64 zeroext %0) unnamed_addr !dbg !248 {
top:
%1 = alloca %printf_args.0, align 8
%2 = addrspacecast %printf_args.0* %1 to %printf_args.0 addrspace(5)*
%3 = bitcast %printf_args.0* %1 to i8*, !dbg !249
call void @llvm.lifetime.start.p0i8(i64 8, i8* nonnull %3), !dbg !249
%4 = bitcast %printf_args.0 addrspace(5)* %2 to i64 addrspace(5)*
store i64 %0, i64 addrspace(5)* %4, align 8, !dbg !249
%5 = call i32 @vprintf(i8* getelementptr ([108 x i8], [108 x i8]* addrspacecast ([108 x i8] addrspace(1)* @0 to [108 x i8]*), i64 0, i64 0), i8* nonnull %3), !dbg !249
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %3), !dbg !249
ret void, !dbg !257
}
; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg %0, i8* nocapture %1) #2
declare i32 @vprintf(i8* %0, i8* %1) local_unnamed_addr
; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.lifetime.end.p0i8(i64 immarg %0, i8* nocapture %1) #2
define internal fastcc void @gpu_signal_exception() unnamed_addr !dbg !258 {
top:
%ptr.i = load i64, i64 addrspace(1)* @exception_flag, align 8, !dbg !259
%.not = icmp eq i64 %ptr.i, 0, !dbg !262
br i1 %.not, label %L10, label %L6, !dbg !262
L6: ; preds = %top
%0 = inttoptr i64 %ptr.i to i64*, !dbg !263
store i64 1, i64* %0, align 1, !dbg !263, !tbaa !268
call void @llvm.nvvm.membar.sys(), !dbg !272
br label %L13, !dbg !275
L10: ; preds = %top
%1 = call i32 @vprintf(i8* getelementptr ([110 x i8], [110 x i8]* addrspacecast ([110 x i8] addrspace(1)* @1 to [110 x i8]*), i64 0, i64 0), i8* null), !dbg !276
br label %L13, !dbg !276
L13: ; preds = %L10, %L6
ret void, !dbg !283
}
; Function Attrs: nounwind
declare void @llvm.nvvm.membar.sys() #3
; Function Attrs: nounwind
declare void @llvm.stackprotector(i8* %0, i8** %1) #3
attributes #0 = { nounwind readnone speculatable willreturn }
attributes #1 = { nounwind readnone }
attributes #2 = { argmemonly nounwind willreturn }
attributes #3 = { nounwind }
!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2, !5, !7, !8, !9, !11, !12, !13, !15, !16, !17, !18, !19, !20, !21, !22, !23, !24, !25, !26, !27, !28, !29, !30, !31, !32, !33, !34, !35, !36, !37, !38, !40}
!nvvm.annotations = !{!41}
!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 1, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!3 = !DIFile(filename: "/home/tim/Julia/pkg/GPUArrays/src/host/broadcast.jl", directory: ".")
!4 = !{}
!5 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !6, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!6 = !DIFile(filename: "abstractarray.jl", directory: ".")
!7 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !6, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!8 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !6, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!9 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!10 = !DIFile(filename: "/home/tim/Julia/pkg/GPUCompiler/src/runtime.jl", directory: ".")
!11 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!12 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!13 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !14, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!14 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/runtime.jl", directory: ".")
!15 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!16 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!17 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !14, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!18 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!19 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!20 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !14, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!21 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!22 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !14, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!23 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!24 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!25 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!26 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!27 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!28 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!29 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!30 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!31 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!32 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!33 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!34 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!35 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!36 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!37 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !10, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!38 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !39, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!39 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/intrinsics/memory_dynamic.jl", directory: ".")
!40 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !14, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !4, nameTableKind: None)
!41 = !{void ({ [1 x i64], i8 addrspace(1)* }, { [1 x i128], [1 x [1 x i64]] }, i64)* @_Z27julia_broadcast_kernel_193315CuKernelContext13CuDeviceArrayI7Float64Li1ELi1EE11BroadcastedIv5TupleI5OneToI5Int64EE9_identityS3_I7UInt128EES5_, !"kernel", i32 1}
!42 = !DILocation(line: 292, scope: !43, inlinedAt: !46)
!43 = distinct !DISubprogram(name: "unitrange_last;", linkageName: "unitrange_last", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!44 = !DIFile(filename: "range.jl", directory: ".")
!45 = !DISubroutineType(types: !4)
!46 = !DILocation(line: 287, scope: !47, inlinedAt: !48)
!47 = distinct !DISubprogram(name: "UnitRange;", linkageName: "UnitRange", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!48 = !DILocation(line: 5, scope: !49, inlinedAt: !50)
!49 = distinct !DISubprogram(name: "Colon;", linkageName: "Colon", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!50 = !DILocation(line: 57, scope: !51)
!51 = distinct !DISubprogram(name: "broadcast_kernel", linkageName: "julia_broadcast_kernel_1933", scope: null, file: !3, line: 56, type: !45, scopeLine: 56, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!52 = !DILocation(line: 0, scope: !53, inlinedAt: !55)
!53 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !54, file: !54, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!54 = !DIFile(filename: "/home/tim/Julia/pkg/LLVM/src/interop/base.jl", directory: ".")
!55 = !DILocation(line: 7, scope: !56, inlinedAt: !58)
!56 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!57 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/intrinsics/indexing.jl", directory: ".")
!58 = !DILocation(line: 7, scope: !59, inlinedAt: !60)
!59 = distinct !DISubprogram(name: "_index;", linkageName: "_index", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!60 = !DILocation(line: 47, scope: !61, inlinedAt: !62)
!61 = distinct !DISubprogram(name: "threadIdx_x;", linkageName: "threadIdx_x", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!62 = !DILocation(line: 91, scope: !63, inlinedAt: !64)
!63 = distinct !DISubprogram(name: "threadIdx;", linkageName: "threadIdx", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!64 = !DILocation(line: 40, scope: !65, inlinedAt: !67)
!65 = distinct !DISubprogram(name: "threadidx;", linkageName: "threadidx", scope: !66, file: !66, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!66 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/gpuarrays.jl", directory: ".")
!67 = !DILocation(line: 20, scope: !68, inlinedAt: !70)
!68 = distinct !DISubprogram(name: "global_index;", linkageName: "global_index", scope: !69, file: !69, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!69 = !DIFile(filename: "/home/tim/Julia/pkg/GPUArrays/src/device/indexing.jl", directory: ".")
!70 = !DILocation(line: 44, scope: !71, inlinedAt: !72)
!71 = distinct !DISubprogram(name: "linear_index;", linkageName: "linear_index", scope: !69, file: !69, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!72 = !DILocation(line: 66, scope: !73, inlinedAt: !74)
!73 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !69, file: !69, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!74 = !DILocation(line: 58, scope: !51)
!75 = !{i32 0, i32 1023}
!76 = !DILocation(line: 0, scope: !77, inlinedAt: !60)
!77 = distinct !DISubprogram(name: "+;", linkageName: "+", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!78 = !DIFile(filename: "int.jl", directory: ".")
!79 = !DILocation(line: 0, scope: !53, inlinedAt: !80)
!80 = !DILocation(line: 7, scope: !56, inlinedAt: !81)
!81 = !DILocation(line: 7, scope: !59, inlinedAt: !82)
!82 = !DILocation(line: 57, scope: !83, inlinedAt: !84)
!83 = distinct !DISubprogram(name: "blockIdx_x;", linkageName: "blockIdx_x", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!84 = !DILocation(line: 77, scope: !85, inlinedAt: !86)
!85 = distinct !DISubprogram(name: "blockIdx;", linkageName: "blockIdx", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!86 = !DILocation(line: 38, scope: !87, inlinedAt: !67)
!87 = distinct !DISubprogram(name: "blockidx;", linkageName: "blockidx", scope: !66, file: !66, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!88 = !{i32 0, i32 2147483646}
!89 = !DILocation(line: 0, scope: !90, inlinedAt: !92)
!90 = distinct !DISubprogram(name: "toInt64;", linkageName: "toInt64", scope: !91, file: !91, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!91 = !DIFile(filename: "boot.jl", directory: ".")
!92 = !DILocation(line: 752, scope: !93, inlinedAt: !82)
!93 = distinct !DISubprogram(name: "Int64;", linkageName: "Int64", scope: !91, file: !91, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!94 = !DILocation(line: 0, scope: !53, inlinedAt: !95)
!95 = !DILocation(line: 7, scope: !56, inlinedAt: !96)
!96 = !DILocation(line: 7, scope: !59, inlinedAt: !97)
!97 = !DILocation(line: 52, scope: !98, inlinedAt: !99)
!98 = distinct !DISubprogram(name: "blockDim_x;", linkageName: "blockDim_x", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!99 = !DILocation(line: 84, scope: !100, inlinedAt: !101)
!100 = distinct !DISubprogram(name: "blockDim;", linkageName: "blockDim", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!101 = !DILocation(line: 39, scope: !102, inlinedAt: !67)
!102 = distinct !DISubprogram(name: "blockdim;", linkageName: "blockdim", scope: !66, file: !66, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!103 = !{i32 1, i32 1024}
!104 = !DILocation(line: 0, scope: !90, inlinedAt: !105)
!105 = !DILocation(line: 752, scope: !93, inlinedAt: !97)
!106 = !DILocation(line: 0, scope: !53, inlinedAt: !107)
!107 = !DILocation(line: 7, scope: !56, inlinedAt: !108)
!108 = !DILocation(line: 7, scope: !59, inlinedAt: !109)
!109 = !DILocation(line: 62, scope: !110, inlinedAt: !111)
!110 = distinct !DISubprogram(name: "gridDim_x;", linkageName: "gridDim_x", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!111 = !DILocation(line: 70, scope: !112, inlinedAt: !113)
!112 = distinct !DISubprogram(name: "gridDim;", linkageName: "gridDim", scope: !57, file: !57, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!113 = !DILocation(line: 41, scope: !114, inlinedAt: !115)
!114 = distinct !DISubprogram(name: "griddim;", linkageName: "griddim", scope: !66, file: !66, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!115 = !DILocation(line: 29, scope: !116, inlinedAt: !70)
!116 = distinct !DISubprogram(name: "global_size;", linkageName: "global_size", scope: !69, file: !69, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!117 = !{i32 1, i32 2147483647}
!118 = !DILocation(line: 0, scope: !90, inlinedAt: !119)
!119 = !DILocation(line: 752, scope: !93, inlinedAt: !109)
!120 = !DILocation(line: 0, scope: !121, inlinedAt: !123)
!121 = distinct !DISubprogram(name: "max;", linkageName: "max", scope: !122, file: !122, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!122 = !DIFile(filename: "promotion.jl", directory: ".")
!123 = !DILocation(line: 326, scope: !124, inlinedAt: !125)
!124 = distinct !DISubprogram(name: "OneTo;", linkageName: "OneTo", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!125 = !DILocation(line: 335, scope: !124, inlinedAt: !126)
!126 = !DILocation(line: 337, scope: !127, inlinedAt: !128)
!127 = distinct !DISubprogram(name: "oneto;", linkageName: "oneto", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!128 = !DILocation(line: 213, scope: !129, inlinedAt: !131)
!129 = distinct !DISubprogram(name: "map;", linkageName: "map", scope: !130, file: !130, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!130 = !DIFile(filename: "tuple.jl", directory: ".")
!131 = !DILocation(line: 89, scope: !132, inlinedAt: !133)
!132 = distinct !DISubprogram(name: "axes;", linkageName: "axes", scope: !6, file: !6, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!133 = !DILocation(line: 279, scope: !134, inlinedAt: !136)
!134 = distinct !DISubprogram(name: "CartesianIndices;", linkageName: "CartesianIndices", scope: !135, file: !135, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!135 = !DIFile(filename: "multidimensional.jl", directory: ".")
!136 = !DILocation(line: 81, scope: !73, inlinedAt: !74)
!137 = !DILocation(line: 0, scope: !138, inlinedAt: !139)
!138 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !122, file: !122, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!139 = !DILocation(line: 360, scope: !138, inlinedAt: !140)
!140 = !DILocation(line: 446, scope: !141, inlinedAt: !142)
!141 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!142 = !DILocation(line: 116, scope: !143, inlinedAt: !145)
!143 = distinct !DISubprogram(name: "Float64;", linkageName: "Float64", scope: !144, file: !144, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!144 = !DIFile(filename: "float.jl", directory: ".")
!145 = !DILocation(line: 7, scope: !146, inlinedAt: !148)
!146 = distinct !DISubprogram(name: "convert;", linkageName: "convert", scope: !147, file: !147, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!147 = !DIFile(filename: "number.jl", directory: ".")
!148 = !DILocation(line: 103, scope: !149, inlinedAt: !151)
!149 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !150, file: !150, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!150 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/array.jl", directory: ".")
!151 = !DILocation(line: 1286, scope: !152, inlinedAt: !153)
!152 = distinct !DISubprogram(name: "_setindex!;", linkageName: "_setindex!", scope: !6, file: !6, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!153 = !DILocation(line: 1267, scope: !154, inlinedAt: !155)
!154 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !6, file: !6, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!155 = !DILocation(line: 59, scope: !51)
!156 = !DILocation(line: 0, scope: !157, inlinedAt: !158)
!157 = distinct !DISubprogram(name: "leading_zeros;", linkageName: "leading_zeros", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!158 = !DILocation(line: 117, scope: !143, inlinedAt: !145)
!159 = !{i128 0, i128 129}
!160 = !DILocation(line: 0, scope: !161, inlinedAt: !162)
!161 = distinct !DISubprogram(name: "rem;", linkageName: "rem", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!162 = !DILocation(line: 385, scope: !157, inlinedAt: !158)
!163 = !DILocation(line: 0, scope: !161, inlinedAt: !164)
!164 = !DILocation(line: 119, scope: !143, inlinedAt: !145)
!165 = !DILocation(line: 0, scope: !166, inlinedAt: !164)
!166 = distinct !DISubprogram(name: "-;", linkageName: "-", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!167 = !DILocation(line: 0, scope: !168, inlinedAt: !169)
!168 = distinct !DISubprogram(name: "<<;", linkageName: "<<", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!169 = !DILocation(line: 464, scope: !168, inlinedAt: !164)
!170 = !DILocation(line: 0, scope: !171, inlinedAt: !164)
!171 = distinct !DISubprogram(name: "&;", linkageName: "&", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!172 = !DILocation(line: 0, scope: !166, inlinedAt: !173)
!173 = !DILocation(line: 121, scope: !143, inlinedAt: !145)
!174 = !DILocation(line: 0, scope: !175, inlinedAt: !176)
!175 = distinct !DISubprogram(name: ">>;", linkageName: ">>", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!176 = !DILocation(line: 462, scope: !175, inlinedAt: !173)
!177 = !DILocation(line: 0, scope: !161, inlinedAt: !173)
!178 = !DILocation(line: 0, scope: !171, inlinedAt: !173)
!179 = !DILocation(line: 0, scope: !77, inlinedAt: !180)
!180 = !DILocation(line: 923, scope: !77, inlinedAt: !181)
!181 = !DILocation(line: 122, scope: !143, inlinedAt: !145)
!182 = !DILocation(line: 0, scope: !175, inlinedAt: !183)
!183 = !DILocation(line: 462, scope: !175, inlinedAt: !181)
!184 = !DILocation(line: 0, scope: !185, inlinedAt: !186)
!185 = distinct !DISubprogram(name: "trailing_zeros;", linkageName: "trailing_zeros", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!186 = !DILocation(line: 123, scope: !143, inlinedAt: !145)
!187 = !DILocation(line: 0, scope: !161, inlinedAt: !188)
!188 = !DILocation(line: 398, scope: !185, inlinedAt: !186)
!189 = !DILocation(line: 0, scope: !138, inlinedAt: !186)
!190 = !DILocation(line: 0, scope: !191, inlinedAt: !192)
!191 = distinct !DISubprogram(name: "toUInt64;", linkageName: "toUInt64", scope: !91, file: !91, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!192 = !DILocation(line: 757, scope: !193, inlinedAt: !186)
!193 = distinct !DISubprogram(name: "UInt64;", linkageName: "UInt64", scope: !91, file: !91, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!194 = !DILocation(line: 0, scope: !195, inlinedAt: !186)
!195 = distinct !DISubprogram(name: "~;", linkageName: "~", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!196 = !DILocation(line: 0, scope: !171, inlinedAt: !186)
!197 = !DILocation(line: 0, scope: !168, inlinedAt: !198)
!198 = !DILocation(line: 464, scope: !168, inlinedAt: !199)
!199 = !DILocation(line: 125, scope: !143, inlinedAt: !145)
!200 = !DILocation(line: 67, scope: !73, inlinedAt: !74)
!201 = !DILocation(line: 0, scope: !202, inlinedAt: !203)
!202 = distinct !DISubprogram(name: "<=;", linkageName: "<=", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!203 = !DILocation(line: 118, scope: !143, inlinedAt: !145)
!204 = !DILocation(line: 0, scope: !77, inlinedAt: !205)
!205 = !DILocation(line: 126, scope: !143, inlinedAt: !145)
!206 = !DILocation(line: 83, scope: !207, inlinedAt: !208)
!207 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !78, file: !78, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!208 = !DILocation(line: 305, scope: !209, inlinedAt: !200)
!209 = distinct !DISubprogram(name: ">;", linkageName: ">", scope: !210, file: !210, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!210 = !DIFile(filename: "operators.jl", directory: ".")
!211 = !DILocation(line: 83, scope: !207, inlinedAt: !212)
!212 = !DILocation(line: 305, scope: !209, inlinedAt: !213)
!213 = !DILocation(line: 702, scope: !214, inlinedAt: !215)
!214 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!215 = !DILocation(line: 648, scope: !216, inlinedAt: !218)
!216 = distinct !DISubprogram(name: "_broadcast_getindex_evalf;", linkageName: "_broadcast_getindex_evalf", scope: !217, file: !217, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!217 = !DIFile(filename: "broadcast.jl", directory: ".")
!218 = !DILocation(line: 621, scope: !219, inlinedAt: !220)
!219 = distinct !DISubprogram(name: "_broadcast_getindex;", linkageName: "_broadcast_getindex", scope: !217, file: !217, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!220 = !DILocation(line: 1098, scope: !221, inlinedAt: !222)
!221 = distinct !DISubprogram(name: "#19;", linkageName: "#19", scope: !217, file: !217, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!222 = !DILocation(line: 48, scope: !223, inlinedAt: !225)
!223 = distinct !DISubprogram(name: "ntuple;", linkageName: "ntuple", scope: !224, file: !224, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!224 = !DIFile(filename: "ntuple.jl", directory: ".")
!225 = !DILocation(line: 1098, scope: !226, inlinedAt: !227)
!226 = distinct !DISubprogram(name: "copy;", linkageName: "copy", scope: !217, file: !217, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!227 = !DILocation(line: 883, scope: !228, inlinedAt: !229)
!228 = distinct !DISubprogram(name: "materialize;", linkageName: "materialize", scope: !217, file: !217, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!229 = !DILocation(line: 353, scope: !230, inlinedAt: !136)
!230 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !135, file: !135, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!231 = !DILocation(line: 442, scope: !202, inlinedAt: !213)
!232 = !DILocation(line: 74, scope: !53, inlinedAt: !233)
!233 = !DILocation(line: 42, scope: !234, inlinedAt: !236)
!234 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !235, file: !235, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!235 = !DIFile(filename: "/home/tim/Julia/pkg/LLVM/src/interop/pointer.jl", directory: ".")
!236 = !DILocation(line: 42, scope: !237, inlinedAt: !238)
!237 = distinct !DISubprogram(name: "pointerset;", linkageName: "pointerset", scope: !235, file: !235, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!238 = !DILocation(line: 82, scope: !239, inlinedAt: !240)
!239 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !235, file: !235, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!240 = !DILocation(line: 88, scope: !241, inlinedAt: !148)
!241 = distinct !DISubprogram(name: "arrayset;", linkageName: "arrayset", scope: !150, file: !150, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!242 = !{!243, !243, i64 0, i64 0}
!243 = !{!"custom_tbaa_addrspace(1)", !244, i64 0}
!244 = !{!"custom_tbaa"}
!245 = !DILocation(line: 410, scope: !138, inlinedAt: !246)
!246 = !DILocation(line: 674, scope: !247, inlinedAt: !155)
!247 = distinct !DISubprogram(name: "iterate;", linkageName: "iterate", scope: !44, file: !44, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!248 = distinct !DISubprogram(name: "report_exception", linkageName: "julia_report_exception_3014", scope: null, file: !14, line: 51, type: !45, scopeLine: 51, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !13, retainedNodes: !4)
!249 = !DILocation(line: 74, scope: !250, inlinedAt: !251)
!250 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !54, file: !54, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !13, retainedNodes: !4)
!251 = !DILocation(line: 38, scope: !252, inlinedAt: !254)
!252 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !253, file: !253, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !13, retainedNodes: !4)
!253 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/intrinsics/output.jl", directory: ".")
!254 = !DILocation(line: 38, scope: !255, inlinedAt: !256)
!255 = distinct !DISubprogram(name: "_cuprintf;", linkageName: "_cuprintf", scope: !253, file: !253, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !13, retainedNodes: !4)
!256 = !DILocation(line: 52, scope: !248)
!257 = !DILocation(line: 56, scope: !248)
!258 = distinct !DISubprogram(name: "signal_exception", linkageName: "julia_signal_exception_3520", scope: null, file: !14, line: 37, type: !45, scopeLine: 37, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!259 = !DILocation(line: 27, scope: !260, inlinedAt: !261)
!260 = distinct !DISubprogram(name: "exception_flag;", linkageName: "exception_flag", scope: !14, file: !14, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!261 = !DILocation(line: 38, scope: !258)
!262 = !DILocation(line: 39, scope: !258)
!263 = !DILocation(line: 118, scope: !264, inlinedAt: !266)
!264 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !265, file: !265, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!265 = !DIFile(filename: "pointer.jl", directory: ".")
!266 = !DILocation(line: 118, scope: !264, inlinedAt: !267)
!267 = !DILocation(line: 40, scope: !258)
!268 = !{!269, !269, i64 0}
!269 = !{!"jtbaa_data", !270, i64 0}
!270 = !{!"jtbaa", !271, i64 0}
!271 = !{!"jtbaa"}
!272 = !DILocation(line: 115, scope: !273, inlinedAt: !275)
!273 = distinct !DISubprogram(name: "threadfence_system;", linkageName: "threadfence_system", scope: !274, file: !274, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!274 = !DIFile(filename: "/home/tim/Julia/pkg/CUDA/src/device/intrinsics/synchronization.jl", directory: ".")
!275 = !DILocation(line: 41, scope: !258)
!276 = !DILocation(line: 74, scope: !277, inlinedAt: !278)
!277 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !54, file: !54, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!278 = !DILocation(line: 38, scope: !279, inlinedAt: !280)
!279 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !253, file: !253, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!280 = !DILocation(line: 38, scope: !281, inlinedAt: !282)
!281 = distinct !DISubprogram(name: "_cuprintf;", linkageName: "_cuprintf", scope: !253, file: !253, type: !45, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !22, retainedNodes: !4)
!282 = !DILocation(line: 43, scope: !258)
!283 = !DILocation(line: 48, scope: !258)
Reduced to:
source_filename = "text"
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
define void @kernel( [1 x i128] ) {
  ret void
}
So that looks like a pretty serious LLVM bug that we're unlikely to be able to fix from within CUDA.jl.
Ah, that's unfortunate. It's not an important issue (for me at least), but I thought I should open an issue. I don't think many people are mixing CuArrays and
setindex!
with values of type Int128
(with Julia 1.6)
It looks like this has always been broken; at least, the assertion is triggered even on ancient versions of LLVM. Reported upstream as https://bugs.llvm.org/show_bug.cgi?id=49877.
Ah, interesting. Just to double check that what I said was right, I went back to Julia 1.5, and indeed the minimal working example works there:
julia> using CUDA
julia> A = zeros(3) |> CuArray
3-element CuArray{Float64,1}:
0.0
0.0
0.0
julia> A .= UInt128(5)
3-element CuArray{Float64,1}:
5.0
5.0
5.0
julia> A
3-element CuArray{Float64,1}:
5.0
5.0
5.0
Ah yes, this worked on 1.5 because we were then using:
source_filename = "text"
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
define void @kernel( [1 x i128]* byval([1 x i128]) ) {
  ret void
}
I had to disable that due to a performance regression; see JuliaGPU/GPUCompiler.jl#92. Maybe it's time to revisit that, especially with https://reviews.llvm.org/D98469.
Let's close this in favor of #974.
Describe the bug
I get a segfault when using setindex! with values of type Int128. This only started happening with Julia 1.6; it was fine with previous versions of CUDA.jl (even going back to CuArrays.jl).
To reproduce
The Minimal Working Example (MWE) for this bug:
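A minimal sketch of the reproducer, assuming it matches the Julia 1.5 session quoted in the comments. The array length of 3 and the `UInt128(5)` value are taken from that session, not from this report:

```julia
using CUDA

# Length 3 and the UInt128(5) value are assumptions taken from the
# Julia 1.5 session quoted in the comments.
A = zeros(3) |> CuArray   # a CuArray of Float64 on the device

# Broadcasting a 128-bit integer into the Float64 array compiles a
# broadcast kernel that receives the UInt128 as a kernel argument.
A .= UInt128(5)
```

On Julia 1.5 the same lines fill `A` with `5.0`, as shown in that session.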
produces this segfault:
Manifest.toml
Expected behavior
I expected it to set every element of A to 5.0.
Version info
Details on Julia:
Details on CUDA:
X-Ref: CliMA/Oceananigans.jl#1514