Transposed matmul block packing #944

adam-smnk · 2024-07-17T10:53:55Z

Enables the pack-matmul driver to handle transposed matmul variants.

PyTorch's nn.Linear layer transposes matrix B (the weights) by default. This change also accelerates these workloads by using the upstream block packing capability to handle multiple matmul variants. After packing, the matmuls are represented in the same generic format accepted by the current XSMM lowering.

A small performance hit compared to standard linalg.matmul remains because the linalg.transpose of a matmul_transpose_a|b input matrix is not lowered to XSMM.
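To illustrate why packing lets the transposed variants share one lowering path, here is a minimal pure-Python sketch. It is not the code from this PR: the helper names (`pack_a`, `pack_b`, `blocked_matmul`) and the blocked layouts (A as `[M/mb][K/kb][mb][kb]`, B as `[N/nb][K/kb][kb][nb]`) are assumptions chosen for illustration. It shows that a B operand stored transposed (`[N][K]`, as with nn.Linear weights) packs into the same blocked form as a plain `[K][N]` operand, so a single blocked kernel can consume both.

```python
def matmul(a, b):
    """Reference product: C[m][n] = sum_k A[m][k] * B[k][n]."""
    return [[sum(a[m][k] * b[k][n] for k in range(len(b)))
             for n in range(len(b[0]))]
            for m in range(len(a))]


def transpose(x):
    """Return the transpose of a 2D list."""
    return [list(row) for row in zip(*x)]


def pack_a(a, mb, kb):
    """Pack A[M][K] into blocks laid out as [M/mb][K/kb][mb][kb] (assumed layout)."""
    m_blocks, k_blocks = len(a) // mb, len(a[0]) // kb
    return [[[[a[i * mb + ii][k * kb + kk] for kk in range(kb)]
              for ii in range(mb)]
             for k in range(k_blocks)]
            for i in range(m_blocks)]


def pack_b(b, kb, nb, transposed=False):
    """Pack B into blocks laid out as [N/nb][K/kb][kb][nb] (assumed layout).

    With transposed=True, b is stored as [N][K] (the matmul_transpose_b case),
    and the transpose is folded into the index mapping used while packing.
    """
    if transposed:
        n_size, k_size = len(b), len(b[0])
        get = lambda k, n: b[n][k]
    else:
        k_size, n_size = len(b), len(b[0])
        get = lambda k, n: b[k][n]
    return [[[[get(k * kb + kk, j * nb + jj) for jj in range(nb)]
              for kk in range(kb)]
             for k in range(k_size // kb)]
            for j in range(n_size // nb)]


def blocked_matmul(pa, pb, mb, nb, kb):
    """Multiply packed operands block by block and return an unpacked C[M][N]."""
    m_blocks, k_blocks, n_blocks = len(pa), len(pa[0]), len(pb)
    c = [[0] * (n_blocks * nb) for _ in range(m_blocks * mb)]
    for i in range(m_blocks):
        for j in range(n_blocks):
            for k in range(k_blocks):
                # Inner mb x nb x kb block product, the part a micro-kernel would handle.
                for ii in range(mb):
                    for jj in range(nb):
                        acc = 0
                        for kk in range(kb):
                            acc += pa[i][k][ii][kk] * pb[j][k][kk][jj]
                        c[i * mb + ii][j * nb + jj] += acc
    return c
```

Note that this sketch folds the transpose into the packing index map. In the PR as described, a separate linalg.transpose remains, which is the source of the small overhead mentioned above.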


adam-smnk · 2024-07-17T11:08:52Z

In general, our infrastructure can't fully handle matmul_transpose. However, by prepacking matmuls, which we do anyway, the transpose variants can be converted to a form compatible with the rest of our pipeline.
While it might not be optimal, it does speed up the transposed matmul variants and should allow for easier testing with OV.

lib/TPP/Transforms/ToBlockLayoutAndBack.cpp

test/Passes/DefaultPipeline/linalg-matmul-variants.mlir

adam-smnk added the benchmark label (triggers benchmark jobs) Jul 17, 2024

adam-smnk requested a review from rengolin July 17, 2024 11:00

rengolin reviewed Jul 17, 2024

lib/TPP/Transforms/ToBlockLayoutAndBack.cpp (outdated)

test/Passes/DefaultPipeline/linalg-matmul-variants.mlir (two review threads)

Fix comment

3918d3b

rengolin approved these changes Jul 18, 2024


adam-smnk merged commit 4e7ad1a into plaidml:main Jul 18, 2024
7 checks passed
