[AutoBump] Merge with fixes of 6e8c7bed (Oct 04) (67) #432

mgehre-amd · 2024-12-17T22:58:37Z

The commit 6e8c7be was later reverted upstream (llvm#3767) due to issues; I backported the revert here to avoid debugging some failing tests.

… generic ops (llvm#3762)

This is motivated by the fact that shapes are stored as tensors in ONNX, and IREE tries to perform tensor arithmetic on the device. This causes unnecessary dispatches, and makes it harder for the compiler to reason about shapes.

Here is a small snippet of torch-IR that is typically seen coming from ONNX models:

```mlir
module {
  func.func @main_graph(%arg0: !torch.vtensor<[?,?,768],f32>, %arg1: !torch.vtensor<[?,?,768],f32>) -> !torch.vtensor<[],si64> {
    %int0 = torch.constant.int 0
    %0 = torch.vtensor.literal(dense<0> : tensor<1xsi64>) : !torch.vtensor<[1],si64>
    %1 = torch.aten._shape_as_tensor %arg1 : !torch.vtensor<[?,?,768],f32> -> !torch.vtensor<[3],si64>
    %2 = torch.aten.index_select %1, %int0, %0 : !torch.vtensor<[3],si64>, !torch.int, !torch.vtensor<[1],si64> -> !torch.vtensor<[1],si64>
    %3 = torch.aten.squeeze.dim %2, %int0 : !torch.vtensor<[1],si64>, !torch.int -> !torch.vtensor<[],si64>
    %4 = torch.aten.item %3 : !torch.vtensor<[],si64> -> !torch.int
    %5 = torch.aten.eq.int %4, %int0 : !torch.int, !torch.int -> !torch.bool
    %6 = torch.aten.Int.bool %5 : !torch.bool -> !torch.int
    %7 = torch.aten.size.int %arg0, %int0 : !torch.vtensor<[?,?,768],f32>, !torch.int -> !torch.int
    %8 = torch.prim.NumToTensor.Scalar %6 : !torch.int -> !torch.vtensor<[],i1>
    %9 = torch.prim.NumToTensor.Scalar %7 : !torch.int -> !torch.vtensor<[],si64>
    %10 = torch.prim.NumToTensor.Scalar %4 : !torch.int -> !torch.vtensor<[],si64>
    %11 = torch.aten.where.self %8, %9, %10 : !torch.vtensor<[],i1>, !torch.vtensor<[],si64>, !torch.vtensor<[],si64> -> !torch.vtensor<[],si64>
    return %11 : !torch.vtensor<[],si64>
  }
}
```

Without the change in this PR, the result would be:

```mlir
#map = affine_map<() -> ()>
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> {
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32>
    %0 = arith.index_cast %dim : index to i64
    %1 = tensor.empty() : tensor<1xi64>
    %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64>
    %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64>
    %extracted = tensor.extract %2[] : tensor<i64>
    %3 = arith.cmpi eq, %extracted, %c0_i64 : i64
    %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32>
    %4 = arith.index_cast %dim_0 : index to i64
    %5 = tensor.empty() : tensor<i1>
    %6 = linalg.fill ins(%3 : i1) outs(%5 : tensor<i1>) -> tensor<i1>
    %7 = tensor.empty() : tensor<i64>
    %8 = linalg.fill ins(%4 : i64) outs(%7 : tensor<i64>) -> tensor<i64>
    %9 = linalg.fill ins(%extracted : i64) outs(%7 : tensor<i64>) -> tensor<i64>
    %10 = linalg.generic {indexing_maps = [#map, #map, #map, #map], iterator_types = []} ins(%6, %8, %9 : tensor<i1>, tensor<i64>, tensor<i64>) outs(%7 : tensor<i64>) {
    ^bb0(%in: i1, %in_1: i64, %in_2: i64, %out: i64):
      %11 = arith.select %in, %in_1, %in_2 : i64
      linalg.yield %11 : i64
    } -> tensor<i64>
    return %10 : tensor<i64>
  }
}
```

With the change in this PR, we would instead get:

```mlir
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> {
    %c0_i64 = arith.constant 0 : i64
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32>
    %0 = arith.index_cast %dim : index to i64
    %1 = tensor.empty() : tensor<1xi64>
    %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64>
    %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64>
    %extracted = tensor.extract %2[] : tensor<i64>
    %3 = arith.cmpi eq, %extracted, %c0_i64 : i64
    %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32>
    %4 = arith.index_cast %dim_0 : index to i64
    %5 = arith.select %3, %4, %extracted : i64
    %6 = tensor.empty() : tensor<i64>
    %7 = linalg.fill ins(%5 : i64) outs(%6 : tensor<i64>) -> tensor<i64>
    return %7 : tensor<i64>
  }
}
```

Some related issues for context:

1. <iree-org/iree#18677>
2. <iree-org/iree#18631>
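The difference between the two lowerings of the `where` in the commit message above can be modeled in a few lines of plain Python. This is only an illustrative sketch, not torch-mlir code: the function names are hypothetical, one-element lists stand in for rank-0 tensors, and the `ops` list simply records the op names that appear in the "without" and "with" IR for the select step.

```python
def where_via_rank0_tensors(arg0_shape, arg1_shape):
    """Model of the lowering without the change (hypothetical helper).

    Each scalar is wrapped in a rank-0 "tensor" (a one-element list here)
    and the select runs as an elementwise op over those tensors.
    """
    ops = []
    extracted = arg1_shape[0]   # dim 0 of %arg1, as a scalar
    cond = extracted == 0       # corresponds to arith.cmpi eq
    dim0 = arg0_shape[0]        # dim 0 of %arg0, as a scalar
    # Materialize all three operands as rank-0 tensors.
    cond_t, dim0_t, extracted_t = [cond], [dim0], [extracted]
    ops += ["linalg.fill", "linalg.fill", "linalg.fill"]
    # Elementwise select over the rank-0 tensors.
    result_t = [dim0_t[0] if cond_t[0] else extracted_t[0]]
    ops.append("linalg.generic")
    return result_t[0], ops


def where_via_scalars(arg0_shape, arg1_shape):
    """Model of the lowering with the change (hypothetical helper).

    The select is done directly on scalars and only the final result
    is wrapped in a rank-0 tensor.
    """
    ops = []
    extracted = arg1_shape[0]
    cond = extracted == 0
    dim0 = arg0_shape[0]
    selected = dim0 if cond else extracted
    ops.append("arith.select")
    result_t = [selected]       # single wrap of the final scalar
    ops.append("linalg.fill")
    return result_t[0], ops
```

Both functions return the same value for any pair of shapes; only the recorded op list differs, which mirrors the three `linalg.fill` ops and the `linalg.generic` being replaced by one `arith.select` and one `linalg.fill` in the IR above.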

…e linalg generic ops (llvm#3762)" (llvm#3767) Reverted due to downstream model changes. Will reland with fixes post integration. This reverts commit 6e8c7be.

zjgarvey and others added 2 commits October 4, 2024 11:27

[AutoBump] Merge with fixes of 6e8c7be (Oct 04)

efc5746

Base automatically changed from bump_to_f08bfc4f to feature/backport_ea1_ops December 18, 2024 20:42

mgehre-amd changed the title ~~[AutoBump] Merge with fixes of 6e8c7bed (Oct 04) (67)~~ [AutoBump] Merge with fixes of 6e8c7bed (Oct 04, needs LLVM bump) (67) Dec 18, 2024

mgehre-amd mentioned this pull request Dec 18, 2024

[AutoBump] Merge with fixes of f8eceb45 (Sep 17, needs torch bump) (6) Xilinx/llvm-project#420

Merged

Revert "[TorchToLinalg] perform rank0 elementwise computations outsid…

e1267ce


mgehre-amd force-pushed the bump_to_6e8c7bed branch from 15a443b to e1267ce Compare December 18, 2024 21:38

mgehre-amd requested a review from jorickert December 18, 2024 21:38

mgehre-amd changed the title ~~[AutoBump] Merge with fixes of 6e8c7bed (Oct 04, needs LLVM bump) (67)~~ [AutoBump] Merge with fixes of 6e8c7bed (Oct 04) (67) Dec 18, 2024

mgehre-amd enabled auto-merge December 18, 2024 22:05

jorickert approved these changes Dec 19, 2024


mgehre-amd merged commit c4631dc into feature/backport_ea1_ops Dec 19, 2024
4 checks passed

mgehre-amd deleted the bump_to_6e8c7bed branch December 19, 2024 07:35
