[BUG] [Metadata] Incorrect kernel argument types when NUMBA_DUMP_LLVM is turned on #135

jiel-nv · 2025-02-25T00:41:37Z

Describe the bug
When NUMBA_DUMP_LLVM is turned on, the debug metadata for kernel arguments still contains the return value type, which should not be in the DISubroutineType.

Steps/Code to reproduce bug

$ cat bug.py
from numba import cuda

@cuda.jit("void(int32, int32)", debug=True, opt=False)
def f(x, y):
    z = x + y
$ NUMBA_DUMP_LLVM=1 python bug.py | grep "\!8"
!8 = !{ !5, !6, !7 }
!9 = !DISubroutineType(types: !8)
!8 = !{ !5, !6, !7 }
!9 = !DISubroutineType(types: !8)

Notice that, even in the dump after the "post kernel fixup" stage, the return value type still shows up in the DISubroutineType. A previous PR fixed this, but only for the case where NUMBA_DUMP_LLVM is not turned on.

Expected behavior
Without NUMBA_DUMP_LLVM, simply using the inspect_llvm() method, we do get the expected metadata, as follows:

$ cat bug.py
from numba import cuda
from numba.core import types

sig = (types.int32, types.int32)
@cuda.jit("void(int32, int32)", debug=True, opt=False)
def f(x, y):
    z = x + y

llvmir = f.inspect_llvm(sig).splitlines()
for line in llvmir:
    print(line)
$ python bug.py | grep "\!8"
!8 = !{ !6, !7 }
!9 = !DISubroutineType(types: !8)

So something goes wrong when NUMBA_DUMP_LLVM is turned on.
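To check a dump without reading it by eye, the grep above can be approximated in a few lines of Python. This is only a sketch: the helper name `subroutine_type_operands` is hypothetical, it works purely on the IR text, and it assumes the simple `!N = !{ ... }` and `!DISubroutineType(types: !N)` shapes shown in this report. If the text contains several dumps of the same module, the last definition of each `!N` wins.

```python
import re


def subroutine_type_operands(ir_text):
    """Return the operand list behind each DISubroutineType 'types' field.

    Hypothetical helper for this report; it only does text matching.
    """
    # Collect plain metadata tuples such as: !8 = !{ !6, !7 }
    tuples = {}
    for m in re.finditer(r"^(!\d+) = !\{([^}]*)\}", ir_text, re.MULTILINE):
        operands = [op.strip() for op in m.group(2).split(",") if op.strip()]
        tuples[m.group(1)] = operands

    # Resolve the tuple referenced by each DISubroutineType node.
    results = []
    for m in re.finditer(r"!DISubroutineType\(types: (!\d+)\)", ir_text):
        results.append(tuples.get(m.group(1), []))
    return results


expected = "!8 = !{ !6, !7 }\n!9 = !DISubroutineType(types: !8)\n"
buggy = "!8 = !{ !5, !6, !7 }\n!9 = !DISubroutineType(types: !8)\n"

print(subroutine_type_operands(expected))
print(subroutine_type_operands(buggy))
```

For the two-argument kernel in this report, a list of two operands matches the expected output, and a list of three means the return value type is still present.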


jiel-nv · 2025-02-25T06:49:48Z

Another observation: taking the inspect_llvm() version of bug.py and turning on NUMBA_DUMP_LLVM, the result of inspect_llvm() is also wrong, i.e. NUMBA_DUMP_LLVM dominates in this case.

$ NUMBA_DUMP_LLVM=0 python bug.py | grep "\!8"
!8 = !{ !6, !7 }
!9 = !DISubroutineType(types: !8)
$ NUMBA_DUMP_LLVM=1 python bug.py | grep "\!8"
!8 = !{ !5, !6, !7 }
!9 = !DISubroutineType(types: !8)
!8 = !{ !5, !6, !7 }
!9 = !DISubroutineType(types: !8)
!8 = !{ !5, !6, !7 }
!9 = !DISubroutineType(types: !8)

…VM is on. After shifting the return value type out of the operands of the 'types' field of the 'DISubroutineType' metadata node, the compiler needs to clear the cached string representation of the corresponding MDValue object if NUMBA_DUMP_LLVM is on. At that moment, the cached string has already been filled with the tuple from before kernel_fixup(), which still includes the return value type, and it continues to be used when printing out the LLVM IR. This change fixes NVIDIA#135
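The stale-cache behaviour described in the commit message can be reproduced in isolation. The following is a toy model, not the actual numba or llvmlite code: `CachedNode`, `clear_string_cache`, and `demo` are hypothetical names, and the only assumption carried over from the description is that the node's string form is computed once and then reused.

```python
class CachedNode:
    """Hypothetical stand-in for a metadata node that caches its string form."""

    def __init__(self, operands):
        self.operands = tuple(operands)
        self._cached_str = None

    def __str__(self):
        # Build the string once, then keep returning the cached copy.
        if self._cached_str is None:
            self._cached_str = "!{ " + ", ".join(self.operands) + " }"
        return self._cached_str

    def clear_string_cache(self):
        # Force the next str() call to rebuild from the current operands.
        self._cached_str = None


def demo(dump_enabled):
    # Operands before the fixup: return value type first, then arguments.
    node = CachedNode(["!5", "!6", "!7"])
    if dump_enabled:
        str(node)  # an early dump fills the cache with the pre-fixup tuple

    # The fixup shifts the return value type out of the operands.
    node.operands = node.operands[1:]
    before_clear = str(node)

    node.clear_string_cache()
    after_clear = str(node)
    return before_clear, after_clear


print(demo(dump_enabled=False))
print(demo(dump_enabled=True))
```

In this model, printing is only wrong when something stringified the node before the operands were changed, which matches the observation that the problem appears only with NUMBA_DUMP_LLVM on.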

jiel-nv added the bug Something isn't working label Feb 25, 2025

jiel-nv linked a pull request Feb 25, 2025 that will close this issue

Emit correct kernel arguments debug metadata types when NUMBA_DUMP_LLVM is on #136

Open
