TSL memory layouts #47
I don't fully understand everything I'm afraid, can we discuss offline?
Cool PR!
Wondering whether we should include negative tests for the parser here
🤔
assert lccb2[1].stride == 4
assert lccb2[1].bound == 4

lccb3 = tsl1.largest_common_contiguous_block(tsl3)
What happens if there's a dynamic tile size in there?
It still works, this case is included in the tests!
It just stops searching for a larger contiguous block as soon as it hits a dynamic shape
Few comments still 😄
if len(strides) != len(bounds):
    raise ParseError("Expected same number of strides and bounds")
Please test these invariants
I resolved all your comments! XDSL tries to parse arguments (enclosed in
When parsing attr, my own errors will be thrown. This code will then try to just parse
However, when testing with parsing_diagnostics, only the last line of the error is printed, and I cannot check for my own errors.
🎉
* add tsl layout
* add dialect implementation
* remove old files
* remove old files
* re-enable python tests
* add ir implementation
* add parser
* delete old tests
* add simple filecheck
* undo change
* redo change
* resolv own comments
* add offsets
* Add dynamic stride and bound support
* add readme
* Update README.md
* Update README.md
* change TSL notation
* fix python test
* stride is now step but stride is still stride
* add starting stride
* change constructor ordering
* add negative parsing checks
TSL Memory Layouts
Rationale
A TSL (tiled-strided-layout) memory layout is an MLIR attribute, designed to be used as the layout parameter for a memref type. A TSL layout tiles the data and defines a stride for every tile, allowing for flexible memory layouts especially suited for hardware accelerators. This layout adds tiling to the existing StridedLayoutAttr. While the AffineMapLayoutAttr allows for a tiled layout, the representation is not always clear, and more importantly does not allow for non-contiguity, which may be required to maximally exploit the full bandwidth of the memory.
Notation
We employ the following notation for TSL attributes (shown here for a 2D matrix with one level of tiling), where the bounds and strides are ordered from outermost to innermost:
[bound, bound] -> (stride, stride), [bound, bound] -> (stride, stride)
Consider the following memory layout. The image represents an 8x8 matrix, where every number is the memory address at which the corresponding element is stored.
<img src="https://github.com/KULeuven-MICAS/snax-mlir/assets/47864363/6d03debe-888e-4e5f-82c2-040434bc1f99" width="400">
In both dimensions, the data is split into 2 tiles of size 4. This information is represented by the tiling bounds:
[2, 4] -> (stride, stride), [2, 4] -> (stride, stride)
For the first dimension there is a stride within the tile of 4 and across tiles of 32:
[2, 4] -> (32, 4), [2, 4] -> (stride, stride)
For the second dimension there is a stride within the tile of 1 and across tiles of 16:
[2, 4] -> (32, 4), [2, 4] -> (16, 1)
Additionally, the full TSL layout attribute can also include a base memory offset:
#tsl.tsl<[2, 4] -> (32, 4), [2, 4] -> (16, 1), offset: 5>
When no offset is defined, it is assumed to be 0.
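To make the notation concrete, the address of an element can be computed by splitting each index into its per-tile positions and summing position times stride. The following is a minimal sketch in plain Python; the `LAYOUT` structure and the `address` helper are hypothetical names chosen for illustration and are not part of the TSL implementation in this PR.

```python
from itertools import product

# Hypothetical plain-Python model of the example layout (not the snax-mlir API).
# Each dimension is a list of (bound, stride) pairs, outermost tile first.
LAYOUT = [
    [(2, 32), (4, 4)],  # first dimension:  [2, 4] -> (32, 4)
    [(2, 16), (4, 1)],  # second dimension: [2, 4] -> (16, 1)
]


def address(index, layout=LAYOUT, offset=0):
    """Map a logical index (one integer per dimension) to a memory address."""
    addr = offset
    for i, levels in zip(index, layout):
        # Peel off the innermost tile level first, then move outwards.
        for bound, stride in reversed(levels):
            i, pos = divmod(i, bound)
            addr += pos * stride
        if i != 0:
            raise IndexError("index out of range for this layout")
    return addr


# Element (1, 0) sits one row down inside the first tile: stride 4.
assert address((1, 0)) == 4
# Element (0, 4) is the first element of the next tile to the right: stride 16.
assert address((0, 4)) == 16
# Element (4, 0) is the first element of the next tile down: stride 32.
assert address((4, 0)) == 32

# For this example, every address from 0 to 63 is used exactly once.
all_addresses = sorted(address(idx) for idx in product(range(8), repeat=2))
assert all_addresses == list(range(64))
```

Passing `offset=5` shifts every address by 5, matching the `offset: 5` field in the attribute above.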
Dynamic Sizes
The layout provided allows for some flexibility in defining dynamic shapes within a matrix:
#tsl.tsl<[?, 4] -> (32, 4), [?, 4] -> (?, 1)>
The key point is that only the outermost tile is allowed to have dynamic sizes; the sizes and strides of the inner tiles must remain fixed. In the example, the fixed tile size is 4x4, with in-tile strides of 4 and 1. Additionally, the outer stride of the first dimension is fixed at 32, so tiles along that dimension are spaced at intervals of 32. How the remaining strides are derived once the full matrix dimensions are known has not yet been decided. However, a likely approach is to determine the strides densely from left to right.
For example, if dealing with a 64x64 matrix, the layout would be adjusted accordingly:
#tsl.tsl<[16, 4] -> (32, 4), [16, 4] -> (?, 1)>
Here, the missing stride is calculated as 32x16=512, that is, the outer stride of the first dimension multiplied by its number of tiles. This keeps the resolved layout consistent with the fixed tile sizes and strides while accommodating the overall matrix dimensions.
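Since the README leaves the resolution rule open, the following is only one possible reading of "densely from left to right", sketched in plain Python. It assumes that a missing stride is placed directly after the largest extent (bound times stride) laid out so far; the `resolve_strides` helper and its data format are hypothetical and not part of this PR.

```python
def resolve_strides(dims):
    """Fill in missing strides (None) densely, scanning dimensions left to right.

    dims is a list of dimensions; each dimension is a list of (bound, stride)
    pairs, outermost tile first. All bounds must already be known integers.
    """
    # Assumption: the extent covered so far is the largest bound * stride
    # among the strides that are already known.
    extent = max(
        (
            bound * stride
            for levels in dims
            for bound, stride in levels
            if stride is not None
        ),
        default=1,
    )
    resolved = []
    for levels in dims:
        new_levels = []
        for bound, stride in levels:
            if stride is None:
                # Place this tile level right after everything laid out so far.
                stride = extent
                extent = bound * stride
            new_levels.append((bound, stride))
        resolved.append(new_levels)
    return resolved


# The 64x64 example: [16, 4] -> (32, 4), [16, 4] -> (?, 1)
layout = [[(16, 32), (4, 4)], [(16, None), (4, 1)]]
assert resolve_strides(layout) == [[(16, 32), (4, 4)], [(16, 512), (4, 1)]]
```

Under this assumption the result reproduces the value 512 from the example, but a different rule could be chosen once the behaviour is settled.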