Transposed matmul block packing #944

Merged
2 commits merged into plaidml:main on Jul 18, 2024

Conversation

adam-smnk
Collaborator

Enables the pack-matmul driver to handle transposed matmul variants.

PyTorch's nn.Linear layer by default transposes matrix B (the weights). This change also allows accelerating these workloads by using the upstream block packing capability to handle multiple matmul variants. After packing, matmuls are represented by the same generic format accepted by the current XSMM lowering.

A small performance hit compared to standard linalg.matmul is due to the linalg.transpose of a matmul_transpose_a|b input matrix not being lowered to XSMM.
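For context, a minimal PyTorch sketch (illustrative, not from this PR) of why nn.Linear produces the transposed-B variant: the weight is stored as (out_features, in_features), so the forward pass is effectively a matmul_transpose_b.

```python
import torch

# nn.Linear stores its weight as (out_features, in_features),
# so the forward pass computes x @ W^T, i.e. a transposed-B matmul.
linear = torch.nn.Linear(in_features=64, out_features=128, bias=False)
x = torch.randn(8, 64)

y = linear(x)                 # internally x @ W.T
y_ref = x @ linear.weight.T   # the explicit matmul_transpose_b form
assert torch.allclose(y, y_ref)
```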

@adam-smnk adam-smnk added the benchmark Triggers benchmark jobs label Jul 17, 2024
@adam-smnk adam-smnk requested a review from rengolin July 17, 2024 11:00
@adam-smnk
Collaborator Author

adam-smnk commented Jul 17, 2024

In general, our infrastructure can't fully handle matmul_transpose. However, by prepacking matmuls (which we do anyway), the transpose variants can be converted to a form compatible with the rest of our pipeline.
While it might not be optimal, it does speed up the transposed matmul variants and should allow for easier testing with OV.
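For illustration, here is a small NumPy sketch (hypothetical sizes and block shapes, not the actual pack-matmul pass) of how packing can absorb the transpose, so that one generic blocked contraction serves both the plain and the transposed-B variant:

```python
import numpy as np

M, N, K = 8, 12, 16    # hypothetical problem sizes
mb, nb, kb = 4, 4, 4   # hypothetical block (tile) sizes

def pack_b(B, transposed):
    # Pack B into an (N/nb, K/kb, kb, nb) blocked layout. For the
    # transposed-B variant (B given as (N, K)), the transpose is absorbed
    # into the packing itself, so downstream code sees a single layout.
    Bp = np.empty((N // nb, K // kb, kb, nb), dtype=B.dtype)
    for j in range(N // nb):
        for k in range(K // kb):
            if transposed:
                Bp[j, k] = B[j*nb:(j+1)*nb, k*kb:(k+1)*kb].T
            else:
                Bp[j, k] = B[k*kb:(k+1)*kb, j*nb:(j+1)*nb]
    return Bp

def blocked_matmul(A, Bp):
    # One generic blocked contraction, independent of B's original layout.
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(M // mb):
        for j in range(N // nb):
            for k in range(K // kb):
                C[i*mb:(i+1)*mb, j*nb:(j+1)*nb] += (
                    A[i*mb:(i+1)*mb, k*kb:(k+1)*kb] @ Bp[j, k]
                )
    return C

A = np.random.rand(M, K)
B = np.random.rand(K, N)
assert np.allclose(blocked_matmul(A, pack_b(B, False)), A @ B)
assert np.allclose(blocked_matmul(A, pack_b(B.T, True)), A @ B)
```

The point mirrored from the comment above: once packing has run, both variants feed the same blocked kernel; only the (cheap, one-time) packing step differs.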

@adam-smnk adam-smnk merged commit 4e7ad1a into plaidml:main Jul 18, 2024
7 checks passed