TSL memory layouts #47

Merged
merged 23 commits into from
Jan 16, 2024

Contributor

@jorendumoulin jorendumoulin commented Dec 21, 2023

TSL Memory Layouts

Rationale

A TSL (tiled-strided-layout) memory layout is an MLIR attribute, designed to be used as the layout parameter for a memref type. A TSL layout tiles the data and defines a stride for every tile, allowing for flexible memory layouts especially suited for hardware accelerators. This layout adds tiling to the existing StridedLayoutAttr. While the AffineMapLayoutAttr allows for a tiled layout, the representation is not always clear, and more importantly does not allow for non-contiguity, which may be required to maximally exploit the full bandwidth of the memory.

Notation

We use the following notation for TSL attributes (shown for a 2D matrix with one level of tiling), where the bounds and strides are ordered from outermost to innermost:

[bound, bound] -> (stride, stride), [bound, bound] -> (stride, stride)

Consider the following memory layout:
The image shows an 8x8 matrix, where every number is the memory address at which that element is stored.

<img src="https://github.com/KULeuven-MICAS/snax-mlir/assets/47864363/6d03debe-888e-4e5f-82c2-040434bc1f99" width="400">

In both dimensions, the data is split into 2 tiles of size 4. This information is represented by the tiling bounds:

[2, 4] -> (stride, stride), [2, 4] -> (stride, stride)

For the first dimension there is a stride within the tile of 4 and across tiles of 32:

[2, 4] -> (32, 4), [2, 4] -> (stride, stride)

For the second dimension there is a stride within the tile of 1 and across tiles of 16:

[2, 4] -> (32, 4), [2, 4] -> (16, 1)

Additionally, the full TSL layout attribute can also include a base memory offset:

#tsl.tsl<[2, 4] -> (32, 4), [2, 4] -> (16, 1), offset: 5>

When no offset is defined, it is assumed to be 0.
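The notation can be read as an address computation. The following is a minimal sketch, assuming each index is split into tile coordinates by repeated division (innermost bound first) and each coordinate is multiplied by its stride. The function `tsl_address` and the tuple-based layout encoding are hypothetical names used for illustration only, not part of the dialect implementation.

```python
from itertools import product


def tsl_address(index, dims, offset=0):
    """Map a multi-dimensional index to a memory address.

    dims holds one (bounds, strides) pair per dimension, both ordered
    outermost -> innermost (hypothetical encoding of the TSL notation).
    """
    address = offset
    for i, (bounds, strides) in zip(index, dims):
        # Peel off the innermost tile coordinate first, then move outward.
        for bound, stride in zip(reversed(bounds), reversed(strides)):
            i, coord = divmod(i, bound)
            address += coord * stride
    return address


# [2, 4] -> (32, 4), [2, 4] -> (16, 1)
layout = [((2, 4), (32, 4)), ((2, 4), (16, 1))]
addresses = [tsl_address((r, c), layout) for r, c in product(range(8), repeat=2)]
```

Under these assumptions, the 64 elements of the 8x8 matrix map onto the 64 distinct addresses 0 to 63, and passing `offset=5` shifts every address by 5.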

Dynamic Sizes

The layout provided allows for some flexibility in defining dynamic shapes within a matrix:

#tsl.tsl<[?, 4] -> (32, 4), [?, 4] -> (?, 1)>

The key point is that only the outermost tile is allowed to have dynamic sizes; the sizes and strides of the inner tiles must remain fixed. In the example, the fixed tile sizes are set to 4x4, with strides of 4 and 1. Additionally, there is one extra fixed stride of 32, which spaces the tiles of the first dimension 32 elements apart. How the remaining dynamic strides are filled in once the full matrix dimensions are known has not yet been decided. A likely approach is to determine them densely from left to right.

For example, if dealing with a 64x64 matrix, the layout would be adjusted accordingly:

#tsl.tsl<[16, 4] -> (32, 4), [16, 4] -> (?, 1)>

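Here, the missing stride is calculated as 32 x 16 = 512: the 16 outer tiles of the first dimension, spaced 32 apart, span 512 elements, so the outer tiles of the second dimension can be placed at multiples of 512. This adjustment ensures that the dynamic shapes remain consistent with the fixed tile sizes and strides while accommodating the overall matrix dimensions.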
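The arithmetic above can be sketched in code. Since the resolution strategy is explicitly undecided, this is only one possible reading of "densely, from left to right": each dynamic stride is set to the largest extent (stride times bound) among the strides known so far. The function `resolve_dynamic`, the use of `None` for `?`, and the tuple encoding are all assumptions made for illustration.

```python
from math import prod


def resolve_dynamic(shape, dims):
    """Fill in dynamic (None) bounds and strides for a known shape.

    Assumes at most one dynamic bound per dimension and that every
    dimension size is a multiple of its fixed inner tile size.
    """
    resolved = []
    for size, (bounds, strides) in zip(shape, dims):
        inner = prod(b for b in bounds if b is not None)
        bounds = tuple(size // inner if b is None else b for b in bounds)
        resolved.append((bounds, list(strides)))

    # Visit the strides left to right; place each missing stride just
    # past the largest extent covered by the strides known at that point.
    for _, strides in resolved:
        for k, stride in enumerate(strides):
            if stride is None:
                strides[k] = max(
                    (s * b
                     for bs, ss in resolved
                     for b, s in zip(bs, ss)
                     if s is not None),
                    default=1,
                )
    return [(bounds, tuple(strides)) for bounds, strides in resolved]


# [?, 4] -> (32, 4), [?, 4] -> (?, 1) for a 64x64 matrix
dynamic = [((None, 4), (32, 4)), ((None, 4), (None, 1))]
resolved_layout = resolve_dynamic((64, 64), dynamic)
```

For the 64x64 case this sketch yields bounds of 16 for both outer tiles and 512 for the missing stride, matching the calculation above.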

@jorendumoulin jorendumoulin marked this pull request as ready for review January 4, 2024 09:28
@jorendumoulin jorendumoulin marked this pull request as draft January 8, 2024 08:23
@jorendumoulin jorendumoulin marked this pull request as ready for review January 9, 2024 11:06
Contributor

@JosseVanDelm JosseVanDelm left a comment

I don't fully understand everything I'm afraid, can we discuss offline?

Cool PR!

Contributor

Wondering whether we should include negative tests for the parser here

Contributor Author

🤔

assert lccb2[1].stride == 4
assert lccb2[1].bound == 4

lccb3 = tsl1.largest_common_contiguous_block(tsl3)
Contributor

What happens if there's a dynamic tile size in there?

Contributor Author

It still works, this case is included in the tests!

Contributor Author

It just stops searching for a larger contiguous block as soon as it hits a dynamic shape
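The stopping behaviour described in this reply can be illustrated with a small sketch. It is not the code under review: `common_contiguous_block`, the `(dim, bound, stride)` tuples, and the use of `None` for dynamic values are assumptions. The search walks both layouts in order of increasing stride, extends the block while both agree and the stride equals the size of the block so far, and stops at the first dynamic value.

```python
def common_contiguous_block(a, b):
    """Return the shared contiguous prefix of two layouts.

    a and b are lists of (dim, bound, stride) entries; None marks a
    dynamic bound or stride (hypothetical encoding).
    """
    def by_stride(layout):
        # Dynamic strides sort last, after all static strides.
        return sorted(layout, key=lambda e: (e[2] is None, e[2] or 0))

    block, expected = [], 1
    for entry_a, entry_b in zip(by_stride(a), by_stride(b)):
        _, bound, stride = entry_a
        # Stop on a mismatch, a dynamic value, or a gap in memory.
        if entry_a != entry_b or bound is None or stride != expected:
            break
        block.append(entry_a)
        expected *= bound
    return block


# [2, 4] -> (32, 4), [2, 4] -> (16, 1)
static_layout = [(0, 2, 32), (0, 4, 4), (1, 2, 16), (1, 4, 1)]
# [?, 4] -> (32, 4), [?, 4] -> (?, 1)
dynamic_layout = [(0, None, 32), (0, 4, 4), (1, None, None), (1, 4, 1)]

static_block = common_contiguous_block(static_layout, static_layout)
mixed_block = common_contiguous_block(static_layout, dynamic_layout)
dynamic_block = common_contiguous_block(dynamic_layout, dynamic_layout)
```

In this sketch, any comparison involving the dynamic layout ends at the fixed 4x4 inner tile, because the next entry in stride order has a dynamic bound.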

Contributor

@JosseVanDelm JosseVanDelm left a comment

Few comments still 😄

Comment on lines +51 to +52
if len(strides) != len(bounds):
    raise ParseError("Expected same number of strides and bounds")
Contributor

Please test these invariants

@jorendumoulin
Contributor Author

I resolved all your comments!
Only the negative parsing checks come with an issue:
I included some negative tests, but as you may see, they always check for the line Expected '>' instead of the actual error thrown.

xDSL parses arguments enclosed in <attr> as follows:

def in_angle_brackets(self):
    self.parse_punctuation("<")
    try:
        yield
    finally:
        self.parse_punctuation(">")

While attr is being parsed, my own error is thrown. The finally clause then still tries to parse >, but attr has not been fully parsed yet, so a second error is thrown and the output looks something like this:

blablabla
Error thrown: my own error
blablabla
...
Error thrown: Expected >

However, when testing with parsing_diagnostics, only the last line of the error is printed, and I cannot check for my own errors.

  %0 = "test.op"() : () -> memref<64x64xindex, #tsl.tsl<[a, b] -> (8, 1), [16, 4] -> (256, 64), offset: 5>, 2 : i32>
                                                         ^
                                                         Expected '>'
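The masking described above can be reproduced outside xDSL. The sketch below uses a toy parser (`ToyParser` and this `ParseError` are stand-ins, not the xDSL classes) with the same try/finally shape. It assumes the helper is wrapped with `contextlib.contextmanager`, which the quoted snippet does not show. The error raised in the `finally` clause replaces the original one, which survives only as `__context__`.

```python
from contextlib import contextmanager


class ParseError(Exception):
    """Stand-in for the parser's error type."""


class ToyParser:
    def __init__(self, text):
        self.text, self.pos = text, 0

    def parse_punctuation(self, char):
        if self.text[self.pos:self.pos + 1] != char:
            raise ParseError(f"Expected '{char}'")
        self.pos += 1

    @contextmanager
    def in_angle_brackets(self):
        self.parse_punctuation("<")
        try:
            yield
        finally:
            # Runs even when the body failed, so it raises a second error.
            self.parse_punctuation(">")


def parse(text):
    parser = ToyParser(text)
    with parser.in_angle_brackets():
        # The attribute parser fails before consuming any input.
        raise ParseError("my own error")


try:
    parse("<[a, b]>")
except ParseError as err:
    last_error = str(err)               # what a last-line check sees
    first_error = str(err.__context__)  # the error that was masked
```

A test that inspects only the final error message therefore sees `Expected '>'`, while the custom message is still reachable through the exception chain.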

@jorendumoulin jorendumoulin merged commit f5e4e72 into main Jan 16, 2024
4 checks passed
@JosseVanDelm
Contributor

🎉

@jorendumoulin jorendumoulin deleted the Joren/memory-layouts branch January 17, 2024 07:37
jorendumoulin added a commit that referenced this pull request Jan 22, 2024
* add tsl layout

* add dialect implementation

* remove old files

* remove old files

* re-enable python tests

* add ir implementation

* add parser

* delete old tests

* add simple filecheck

* undo change

* redo change

* resolv own comments

* add offsets

* Add dynamic stride and bound support

* add readme

* Update README.md

* Update README.md

* change TSL notation

* fix python test

* stride is now step but stride is still stride

* add starting stride

* change constructor ordering

* add negative parsing checks