Implement lowerings for argmin and argmax #58

nhat-nguyen · 2023-11-16T04:25:45Z

Triton argmin and argmax both lower to tt.reduce ops whose semantics are
identical to those of the linalg.reduce op, so we can clone the tt.reduce body
into linalg.reduce directly. Unfortunately, we still need pattern matching to
determine which reduce op we are dealing with, so that we know how to
initialize the initial reduce values correctly.
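
To illustrate why the initial values depend on the kind of reduction, here is a minimal Python sketch of an arg-reduction over (value, index) pairs. It is not the lowering code from this PR: `INIT`, `combine`, and `arg_reduce` are hypothetical names, and the chosen initial pairs (negative infinity for argmax, positive infinity for argmin, index -1) are an assumption made for the example.

```python
import math
from functools import reduce

# Assumed initial (value, index) pairs; the real lowering may pick different ones.
INIT = {
    "argmax": (-math.inf, -1),
    "argmin": (math.inf, -1),
}

def combine(kind, acc, cur):
    """Keep the better (value, index) pair; on equal values keep the lower index."""
    (acc_val, acc_idx), (cur_val, cur_idx) = acc, cur
    better = cur_val > acc_val if kind == "argmax" else cur_val < acc_val
    tie = cur_val == acc_val and cur_idx < acc_idx
    return cur if better or tie else acc

def arg_reduce(values, kind):
    """Reduce all elements, starting from the kind-specific initial pair."""
    pairs = zip(values, range(len(values)))
    return reduce(lambda acc, cur: combine(kind, acc, cur), pairs, INIT[kind])[1]
```

The body of the reduction is the same shape for both ops, but the starting pair is not, which is why the kind of reduce op has to be known before the initial values can be set.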

We could do this generically, without pattern matching, by always using
the first elements along the reduction axis as the initial values and
performing the reduction on the remaining elements. However, this creates
sub-tensors whose sizes aren't always multiples of 2, which is sub-optimal
for certain hardware.
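
The generic alternative can be sketched the same way in Python. Again this is only an illustration under assumptions, with hypothetical names (`argmax_combine`, `argmax_seeded`): the first element seeds the reduction, so no kind-specific initial value is needed, but the remaining slice has one element fewer than the input.

```python
from functools import reduce

def argmax_combine(acc, cur):
    """Keep the pair with the larger value; earlier elements win ties."""
    return cur if cur[0] > acc[0] else acc

def argmax_seeded(values):
    """Seed with the first element and reduce over the remaining ones.

    Returns the argmax index and the length of the remaining slice.
    """
    pairs = list(zip(values, range(len(values))))
    seed, rest = pairs[0], pairs[1:]
    index = reduce(argmax_combine, rest, seed)[1]
    return index, len(rest)
```

For an input of 8 elements along the reduction axis, the slice that is actually reduced has 7 elements, which is the odd-sized sub-tensor the description refers to.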

nhat-nguyen added 2 commits November 16, 2023 11:08

Update

8f3734c

Update

d45357c

nhat-nguyen requested a review from manbearian November 16, 2023 07:59

manbearian approved these changes Nov 16, 2023

Merge branch 'main' into nhat/argminmax

3823b43

nhat-nguyen merged commit de797bb into main Nov 20, 2023
2 checks passed

nhat-nguyen deleted the nhat/argminmax branch November 20, 2023 17:05

yuanfz98 mentioned this pull request Nov 21, 2023

support more complex max/min f/i pattern matching #61

Closed
