Currently, CK Tile GroupGEMM prepares its metadata on the host, which requires transferring the metadata back and forth between the host and the device.
composable_kernel/include/ck_tile/ops/gemm/kernel/grouped_gemm_kernel.hpp
Line 98 in 6b6fcd3
To avoid this overhead, we need a GroupGEMM kernel like the one in old CK: a persistent kernel that reads the GEMM shapes from device memory and calculates the offsets and block IDs on the fly.
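To make the request concrete, here is a minimal host-side C++ sketch of the tile-scheduling loop such a persistent kernel could run. It is not CK or CK Tile code: `GemmShape`, `TileCoord`, and `tiles_for_worker` are hypothetical names, a "worker" stands in for a persistent workgroup, and the round-robin tile assignment is an assumption. The point is only that the group index and tile coordinates can be derived from the shapes alone, with no host-prepared lookup table.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-GEMM descriptor. In a real kernel an array of these
// would be read from device memory.
struct GemmShape {
    int32_t M, N, K;
};

// Hypothetical result of the on-the-fly mapping for one output tile.
struct TileCoord {
    int32_t group;   // index of the GEMM this tile belongs to
    int32_t tile_m;  // tile row within that GEMM
    int32_t tile_n;  // tile column within that GEMM
};

constexpr int32_t ceil_div(int32_t a, int32_t b) { return (a + b - 1) / b; }

// Number of output tiles one GEMM contributes for a block_m x block_n tile.
inline int32_t tiles_in(const GemmShape& s, int32_t block_m, int32_t block_n) {
    return ceil_div(s.M, block_m) * ceil_div(s.N, block_n);
}

// Lists the tiles one persistent worker would process. The worker strides
// through the global tile index space and advances its group cursor as it
// goes, so no precomputed block-to-group table is needed.
std::vector<TileCoord> tiles_for_worker(const std::vector<GemmShape>& shapes,
                                        int32_t worker_id, int32_t num_workers,
                                        int32_t block_m, int32_t block_n) {
    assert(num_workers > 0 && worker_id >= 0 && worker_id < num_workers);
    std::vector<TileCoord> out;
    const int32_t num_groups = static_cast<int32_t>(shapes.size());
    int32_t group = 0;        // current GEMM
    int32_t group_start = 0;  // global index of the current GEMM's first tile

    for (int32_t id = worker_id;; id += num_workers) {
        // Skip every GEMM whose tiles end at or before the global index `id`
        // (this also skips empty GEMMs, which contribute zero tiles).
        while (group < num_groups &&
               id >= group_start + tiles_in(shapes[group], block_m, block_n)) {
            group_start += tiles_in(shapes[group], block_m, block_n);
            ++group;
        }
        if (group == num_groups) break;  // no tiles left for this worker

        const int32_t local = id - group_start;
        const int32_t tiles_n = ceil_div(shapes[group].N, block_n);
        out.push_back({group, local / tiles_n, local % tiles_n});
    }
    return out;
}
```

For example, with shapes `{256,128,64}`, `{100,300,32}`, `{0,64,64}`, `{128,128,128}`, a 128x128 tile, and 4 workers, there are 6 tiles in total, and worker 1 handles global tiles 1 and 5, which fall in group 0 and group 3. Pointer offsets into A, B, and C could be accumulated in the same loop from the per-group shapes; that part is omitted here.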