forked from iree-org/iree
Commit
[LLVMGPU] Enable WarpReduction on ROCM + Let matvec use Warp Reduce. (iree-org#15034)

This patch does two things:
1. Enables the warp reduction config on ROCm.
2. Mirrors the SPIR-V logic for letting matvec go down the subgroup/warp reduce pipeline.
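As a rough illustration of what a warp reduce pipeline does for a matvec, the sketch below simulates it in plain Python. It is not IREE's codegen and not GPU code: `WARP_SIZE`, `shuffle_xor`, `warp_reduce_sum`, and `matvec_warp_reduce` are hypothetical names chosen for this example, and a 32-lane subgroup is an assumption. Each row is handled by one simulated warp whose lanes accumulate strided partial products, then combine them with a butterfly of XOR shuffles.

```python
# Simulated warp-level reduction in plain Python (illustrative only).
WARP_SIZE = 32  # assumed subgroup width for this sketch


def shuffle_xor(lanes, offset):
    """Each lane i reads the value currently held by lane i ^ offset."""
    return [lanes[i ^ offset] for i in range(len(lanes))]


def warp_reduce_sum(lanes):
    """Butterfly all-reduce: after log2(n) steps every lane holds the total."""
    n = len(lanes)
    assert n > 0 and n & (n - 1) == 0, "lane count must be a power of two"
    offset = n // 2
    while offset >= 1:
        partner = shuffle_xor(lanes, offset)
        lanes = [a + b for a, b in zip(lanes, partner)]
        offset //= 2
    return lanes


def matvec_warp_reduce(matrix, vector, warp_size=WARP_SIZE):
    """One simulated warp per row: lanes hold strided partial dot products."""
    result = []
    for row in matrix:
        partial = [0.0] * warp_size
        for k, (a, x) in enumerate(zip(row, vector)):
            partial[k % warp_size] += a * x
        # Every lane holds the full sum after the butterfly; read lane 0.
        result.append(warp_reduce_sum(partial)[0])
    return result
```

With 32 lanes the butterfly takes five shuffle steps (offsets 16, 8, 4, 2, 1), which is why a shuffle-based reduction avoids a round trip through shared memory for the final combine.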
1 parent 750784d, commit 1ba5e37
Showing 9 changed files with 302 additions and 23 deletions.
35 changes: 35 additions & 0 deletions
compiler/src/iree/compiler/Codegen/LLVMGPU/test/reduction_pipeline_rocm.mlir
@@ -0,0 +1,35 @@
// RUN: iree-opt --split-input-file --pass-pipeline="builtin.module(hal.executable(hal.executable.variant(builtin.module(func.func(iree-linalg-ext-decompose-softmax)), iree-llvmgpu-lower-executable-target)))" %s | FileCheck %s

#pipeline_layout = #hal.pipeline.layout<push_constants = 0, sets = [
  #hal.descriptor_set.layout<0, bindings = [
    #hal.descriptor_set.binding<0, storage_buffer>,
    #hal.descriptor_set.binding<1, storage_buffer>
  ]>
]>
hal.executable @softmax {
  hal.executable.variant @rocm, target = <"rocm", "rocm-hsaco-fb", {target_arch = "gfx1100"}> {
    hal.executable.export @softmax layout(#pipeline_layout) {
    ^bb0(%arg0: !hal.device, %arg1: index, %arg2 : index):
      %x, %y, %z = flow.dispatch.workgroup_count_from_dag_root %arg1, %arg2
      hal.return %x, %y, %z : index, index, index
    }
    builtin.module {
      func.func @softmax() {
        %c0 = arith.constant 0 : index
        %cst = arith.constant -3.40282347E+38 : f32
        %cst_0 = arith.constant 0.000000e+00 : f32
        %cst_1 = arith.constant 1.000000e+00 : f32
        %0 = hal.interface.binding.subspan set(0) binding(0) type(storage_buffer) alignment(64) offset(%c0) : !flow.dispatch.tensor<readonly:tensor<12x128x40960xf32>>
        %1 = hal.interface.binding.subspan set(0) binding(1) type(storage_buffer) alignment(64) offset(%c0) : !flow.dispatch.tensor<writeonly:tensor<12x128x40960xf32>>
        %2 = flow.dispatch.tensor.load %0, offsets = [0, 0, 0], sizes = [12, 128, 40960], strides = [1, 1, 1] : !flow.dispatch.tensor<readonly:tensor<12x128x40960xf32>> -> tensor<12x128x40960xf32>
        %3 = tensor.empty() : tensor<12x128x40960xf32>
        %4 = iree_linalg_ext.softmax dimension(2) ins(%2 : tensor<12x128x40960xf32>) outs(%3 : tensor<12x128x40960xf32>) -> tensor<12x128x40960xf32>
        flow.dispatch.tensor.store %4, %1, offsets = [0, 0, 0], sizes = [12, 128, 40960], strides = [1, 1, 1] : tensor<12x128x40960xf32> -> !flow.dispatch.tensor<writeonly:tensor<12x128x40960xf32>>
        return
      }
    }
  }
}

// CHECK-LABEL: func.func @softmax
// CHECK-COUNT-20: gpu.shuffle xor{{.*}}{{[[:space:]].*}}{{.*}}
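
For readers unfamiliar with why a softmax test exercises a reduction pipeline, here is a minimal reference sketch in plain Python of the conventional softmax decomposition along the innermost dimension. It is an assumption that the `iree-linalg-ext-decompose-softmax` pass produces this general shape (a max reduction, then an exponentiate-and-sum reduction, then a normalization); the function name `softmax_rows` and the `F32_LOWEST` constant name are hypothetical. The value of `F32_LOWEST` matches the `-3.40282347E+38` constant in the test, which is the most negative finite f32 and a natural identity for a max reduction. The sketch does not attempt to derive the expected count of 20 shuffles.

```python
import math

# Most negative finite f32; same value as %cst in the test above.
F32_LOWEST = -3.40282347e38


def softmax_rows(rows):
    """Reference softmax over each row, written as two reductions plus a map."""
    out = []
    for row in rows:
        # Reduction 1: running max, seeded with the max identity.
        row_max = F32_LOWEST
        for v in row:
            row_max = max(row_max, v)
        # Elementwise: subtract the max before exponentiating for stability.
        exps = [math.exp(v - row_max) for v in row]
        # Reduction 2: running sum, seeded with 0.0.
        total = 0.0
        for e in exps:
            total += e
        # Elementwise: normalize by the sum.
        out.append([e / total for e in exps])
    return out
```

On a GPU, each of the two row-wise reductions is the kind of step a warp reduce pipeline would lower to shuffles, which is what the `gpu.shuffle xor` check looks for.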