Extending a dimension to a multiple of the outer loops can break dependencies #69

Open
RobinGeens opened this issue Dec 30, 2024 · 2 comments

Comments

@RobinGeens
Contributor

RobinGeens commented Dec 30, 2024

tile_attrs = original_node.extract_node_attr()
for loop in outer_temporal_loops:
    outer_dim, outer_size = loop.unpack()
    node_dim_size: int = tile_attrs.layer_dim_sizes[outer_dim]
    q, rem = divmod(node_dim_size, outer_size)  # returns x//y, x%y
    # Make sure that the outer_dim is divisible by the outer_size
    if rem != 0:
        # Pad the dimension to a multiple of outer_size
        node_dim_size = (q + 1) * outer_size
        q += 1
    tile_attrs.layer_dim_sizes[outer_dim] = q

In this code block, each entry of the tile's layer_dim_sizes is divided by the size of the corresponding outer loop. When node_dim_size % outer_size != 0, the dimension is first padded to the next multiple of outer_size.

When two nodes in the same layer stack have different inter-core tilings, the same outer_dim can be padded to a different size in each layer. As a result, the dependencies between nodes of the two layers can cross the boundary of the steady-state group.

Example:

  • Layers 0 and 1 both have {D: 34} as layer_dim_sizes.
  • After intra-core tiling of (D, 2), this becomes {D: 17}.
  • Layer 0 has inter-core tiling (D, 4) and layer 1 has (D, 3).
  • The sizes are padded to 20 and 18, the next multiples of 4 and 3 respectively.
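
The arithmetic of this example can be reproduced with a small sketch. The helper name pad_to_multiple is hypothetical and not part of Stream; it only mirrors the divmod-based padding from the snippet above.

```python
def pad_to_multiple(size: int, factor: int) -> int:
    """Round size up to the next multiple of factor (hypothetical helper)."""
    q, rem = divmod(size, factor)
    return (q + 1) * factor if rem != 0 else size


# {D: 34} after intra-core tiling (D, 2)
size_after_intra = 34 // 2

# Layer 0 uses inter-core tiling (D, 4), layer 1 uses (D, 3)
padded_layer_0 = pad_to_multiple(size_after_intra, 4)
padded_layer_1 = pad_to_multiple(size_after_intra, 3)

print(padded_layer_0, padded_layer_1)  # 20 18
```

Because the two layers end up with different padded sizes for the same dimension, their tile boundaries no longer line up.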

The dependencies between the tiles will look like this:
[image: dependencies between the tiles of layer 0 and layer 1]
The red arrow in the image causes scheduling problems.

To fix this, there are two options:

  1. Pad to a size that is a multiple of the intra-core tiling factor and of all inter-core tiling factors of the nodes in the stack.
  2. Make Stream able to deal with tile sizes that are not a multiple of the inter- and intra-core tiling factors.
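
A minimal sketch of one possible reading of option 1, not the actual Stream implementation: pad the size that remains after intra-core tiling to a multiple of the least common multiple of all inter-core factors in the stack, so every layer gets the same padded size. The function name common_padded_size is hypothetical, and applying the padding to the post-intra-core size is an assumption.

```python
import math


def common_padded_size(size: int, inter_core_factors: list[int]) -> int:
    """Pad size to a multiple of every inter-core factor in the stack.

    Hypothetical helper: uses the least common multiple of all factors so
    that each layer in the stack ends up with the same padded size.
    """
    common = math.lcm(*inter_core_factors)  # requires Python 3.9+
    return -(-size // common) * common  # ceiling division, then scale back


# Example from this issue: size 17 after intra-core tiling, factors 4 and 3
padded = common_padded_size(17, [4, 3])
print(padded)  # 24
```

The trade-off is extra padding: both layers are extended to 24 here instead of 20 and 18.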

Additional dependency issues
The dependency generation (using NodeTensor) always uses the original loop ranges and not the extended ones. When the dimensions of a layer are extended to multiples of the divisors and then tiled, it is possible that the last tiles have a loop range that falls completely outside of the original loop ranges. Such a tile will not be recorded in NodeTensor, and tiles of consecutive layers that rely on it will have an empty dependency instead.

To fix this, the dependency generation needs to know the updated sizes, but this is non-trivial.
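
To illustrate how a tile can fall completely outside the original range, here is a small sketch with hypothetical numbers: a dimension of original size 17 that is extended to 24 and split into 4 equal tiles. It does not use NodeTensor or any Stream API; the helper names are made up for this example.

```python
def tile_ranges(padded_size: int, nb_tiles: int) -> list[tuple[int, int]]:
    """Half-open [start, stop) loop range of each tile of the padded dimension."""
    tile_size = padded_size // nb_tiles
    return [(i * tile_size, (i + 1) * tile_size) for i in range(nb_tiles)]


def tiles_outside_original(
    original_size: int, padded_size: int, nb_tiles: int
) -> list[tuple[int, int]]:
    """Tiles whose whole range lies beyond the original dimension size."""
    return [r for r in tile_ranges(padded_size, nb_tiles) if r[0] >= original_size]


ranges = tile_ranges(24, 4)
lost = tiles_outside_original(17, 24, 4)
print(ranges)  # [(0, 6), (6, 12), (12, 18), (18, 24)]
print(lost)    # [(18, 24)]
```

The tile covering [18, 24) contains no element of the original range [0, 17), which is the kind of tile that would go unrecorded.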

@RobinGeens
Contributor Author

I have added a semi-fix for the additional dependency issues: instead of losing the dependencies of the last tile(s) because their loop ranges exceed the layer size, there will now always be a dependency on the very last element of the preceding layer.
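
A rough sketch of what such a fallback could look like, assuming half-open [start, stop) ranges along a single dimension. The function clamp_to_layer is hypothetical and does not reflect the actual change in Stream; trimming partially overlapping ranges is an extra assumption made for this example.

```python
def clamp_to_layer(start: int, stop: int, layer_size: int) -> tuple[int, int]:
    """Map a tile's loop range onto the valid range [0, layer_size).

    Hypothetical fallback: a range that lies completely beyond the layer
    is redirected to the very last element instead of being dropped.
    """
    if start >= layer_size:
        # Entirely out of bounds: depend on the last element only
        return (layer_size - 1, layer_size)
    # Partially out of bounds (assumption): trim the range to the layer size
    return (start, min(stop, layer_size))


print(clamp_to_layer(18, 24, 17))  # (16, 17)
print(clamp_to_layer(12, 18, 17))  # (12, 17)
```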

@RobinGeens
Contributor Author

I have implemented solution 1, with one problem left: different layers can use different dimension names even though they refer to the same tensor dimension.
