Linalg to WMMA lowering rework (#871)
Reworks the linalg lowering to WMMA ops to support computations on memory tiles larger than the hardware WMMA tiles. The new lowering effectively applies warp (or subgroup) tiling to 2D tiled linalg GEMM-like operations. The reduction dimension is tiled at the same time to ensure that the computation fits within the limits of hardware resources. Elementwise consumers are fused with the GEMM when possible. Warp tiling is done at linalg lowering time because the parameter decisions, such as tile sizes, are driven by the compute workload. Mapping first to individual WMMA operations and later splitting them into multiple ops operating on smaller sub-tiles would be more complex because it would need use-chain analysis and extra validation.
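The tiling scheme described above can be illustrated with a small host-side sketch. This is not the code from this commit: the 16x16x16 tile shape, the function names `wmma_tile` and `tiled_gemm`, and the `epilogue` parameter are all assumptions made for illustration, and plain Python loops stand in for the hardware WMMA op.

```python
# Illustrative sketch only: names and tile sizes are assumptions,
# not taken from the commit.
WMMA_M, WMMA_N, WMMA_K = 16, 16, 16  # assumed hardware WMMA tile shape


def wmma_tile(a, b, acc, m0, n0, k0):
    """Stand-in for one hardware WMMA op on a single sub-tile.

    Accumulates A[m0:m0+M, k0:k0+K] @ B[k0:k0+K, n0:n0+N] into acc.
    """
    for i in range(WMMA_M):
        for j in range(WMMA_N):
            s = acc[i][j]
            for k in range(WMMA_K):
                s += a[m0 + i][k0 + k] * b[k0 + k][n0 + j]
            acc[i][j] = s


def tiled_gemm(a, b, m, n, k, epilogue=lambda x: x):
    """Compute epilogue(A @ B) for an (m x k) by (k x n) memory tile."""
    # This sketch only handles sizes that are multiples of the WMMA tile.
    assert m % WMMA_M == 0 and n % WMMA_N == 0 and k % WMMA_K == 0
    c = [[0.0] * n for _ in range(m)]
    for m0 in range(0, m, WMMA_M):          # warp tiling over rows
        for n0 in range(0, n, WMMA_N):      # warp tiling over columns
            acc = [[0.0] * WMMA_N for _ in range(WMMA_M)]
            for k0 in range(0, k, WMMA_K):  # reduction-dimension tiling
                wmma_tile(a, b, acc, m0, n0, k0)
            # Fused elementwise consumer, applied to the accumulator tile
            # before it is written back to the output.
            for i in range(WMMA_M):
                for j in range(WMMA_N):
                    c[m0 + i][n0 + j] = epilogue(acc[i][j])
    return c
```

For example, `tiled_gemm(a, b, 32, 32, 32, lambda x: max(x, 0.0))` would split a 32x32x32 GEMM into four output sub-tiles, each accumulated over two reduction steps, with a ReLU-style consumer applied per sub-tile.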
Showing 9 changed files with 685 additions and 249 deletions.