Discrepancy in AOT Module Size and Runtime Efficiency Based on Kernel Execution State #8564

Roushelfy · 2024-07-15T17:48:04Z

Issue Description

Summary

When generating an AOT module with Taichi, I observed that the size of the generated module.tcm file depends on whether the kernel function was executed before archiving. The difference also affects runtime efficiency when the module is loaded and launched from C++/C#.

Minimal Sample Code to Reproduce

import os

import taichi as ti

def compile_aot(run=False):
    ti.init(arch=ti.vulkan)
    # ti.init may fall back to another backend, so verify that Vulkan is actually in use.
    if ti.lang.impl.current_cfg().arch != ti.vulkan:
        raise RuntimeError("Vulkan is not available.")

    @ti.kernel
    def paint(pixels: ti.types.ndarray(dtype=ti.f32, ndim=2), n: ti.u32, t: ti.f32):
        for i, j in pixels:  # Parallelized over all pixels
            c = ti.Vector([-0.8, ti.cos(t) * 0.2])
            z = ti.Vector([i / n - 1, j / n - 0.5]) * 2
            iterations = 0
            while z.norm() < 20 and iterations < 50:
                z = ti.Vector([z[0]**2 - z[1]**2, z[1] * z[0] * 2]) + c
                iterations += 1
            pixels[i, j] = 1 - iterations * 0.02

    n = 1024
    t = 0
    pixels = ti.ndarray(shape=(n * 2, n), dtype=ti.f32)
    if run:
        # Execute the kernel in a GUI loop before archiving; archiving happens once the window is closed.
        gui = ti.GUI('Julia Set', (n * 2, n))

        while gui.running:
            t += 1
            paint(pixels, n, t * 0.03)
            pixel = pixels.to_numpy()
            gui.set_image(pixel)
            gui.show()

    # Make sure the output directory exists before archiving.
    os.makedirs("build", exist_ok=True)
    mod = ti.aot.Module(ti.vulkan)
    mod.add_kernel(paint, template_args={'pixels': pixels})
    mod.archive("build/module.tcm")
    print("Module archived to 'build/module.tcm'")

if __name__ == '__main__':
    # Switch to run=True to execute the kernel before archiving.
    compile_aot(run=False)

Observations

When run=False, the generated module.tcm file is 7 KB.

When run=True, the generated module.tcm file is 8 KB.

The runtime efficiency when the module is called from C++/C# differs between the two cases: the module runs faster if the kernel function was executed before archiving.
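One way to narrow down where the extra kilobyte comes from is to compare the contents of the two archives entry by entry. The sketch below assumes that a .tcm file is an ordinary zip archive, which should be verified against the Taichi version in use. The helper names list_entries and diff_archives are hypothetical and not part of Taichi, and the two file paths in the usage note are placeholders.

```python
import zipfile


def list_entries(archive_path):
    """Map each file name inside a zip archive to its uncompressed size in bytes."""
    # Assumption: the .tcm file can be opened as a plain zip archive.
    with zipfile.ZipFile(archive_path) as archive:
        return {info.filename: info.file_size for info in archive.infolist()}


def diff_archives(path_a, path_b):
    """Return (only_in_a, only_in_b, size_changes) for two zip archives."""
    a = list_entries(path_a)
    b = list_entries(path_b)
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    # For entries present in both archives, record (size_in_a, size_in_b) when they differ.
    size_changes = {
        name: (a[name], b[name])
        for name in sorted(set(a) & set(b))
        if a[name] != b[name]
    }
    return only_in_a, only_in_b, size_changes
```

For example, after archiving once with run=False to build/module_norun.tcm and once with run=True to build/module_run.tcm (both names are placeholders), calling diff_archives("build/module_norun.tcm", "build/module_run.tcm") would show whether the size difference comes from additional entries or from larger existing ones.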

Questions

What causes the difference in the size of the AOT module depending on whether the kernel function is executed before archiving?
Why does this difference impact the runtime efficiency when the module is called from C++?

System Information

Taichi version: 1.8.0
OS: Windows 11

Thank you for your help in understanding this issue.


Roushelfy added the question Question on using Taichi label Jul 15, 2024

taichi-gardener added this to Taichi Lang Jul 15, 2024

github-project-automation bot moved this to Untriaged in Taichi Lang Jul 15, 2024

Roushelfy closed this as completed Jul 19, 2024

github-project-automation bot moved this from Untriaged to Done in Taichi Lang Jul 19, 2024

Roushelfy reopened this Jul 19, 2024
