[RISCV] Set a barrier between mask producer and user of V0 #114012
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Pengcheng Wang (wangpc-pp)

Changes

Here we add a scheduling mutation in pre-ra scheduling, which will add an artificial dependency edge between a mask producer and its previous nearest instruction that uses the V0 register.

This prevents making live intervals of mask registers longer and as a consequence we can reduce some spills/moves.

From the test changes, we can see some improvements and also some regressions (more vtype toggles).

Partially fixes #113489.

Patch is 435.60 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114012.diff

33 Files Affected:
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index fd049d1a57860e..b95ad9dd428cc9 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -58,6 +58,7 @@ add_llvm_target(RISCVCodeGen
RISCVTargetMachine.cpp
RISCVTargetObjectFile.cpp
RISCVTargetTransformInfo.cpp
+ RISCVVectorMaskDAGMutation.cpp
RISCVVectorPeephole.cpp
RISCVVLOptimizer.cpp
RISCVZacasABIFix.cpp
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 089dc6c529193d..b88bd18e7c8585 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -360,6 +360,12 @@ class RISCVPassConfig : public TargetPassConfig {
DAG->addMutation(createStoreClusterDAGMutation(
DAG->TII, DAG->TRI, /*ReorderWhileClustering=*/true));
}
+
+ const RISCVSubtarget &ST = C->MF->getSubtarget<RISCVSubtarget>();
+ if (ST.hasVInstructions()) {
+ DAG = DAG ? DAG : createGenericSchedLive(C);
+ DAG->addMutation(createRISCVVectorMaskDAGMutation(DAG->TRI));
+ }
return DAG;
}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.h b/llvm/lib/Target/RISCV/RISCVTargetMachine.h
index ce7b7907e1f3af..1a37891f847ae6 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.h
@@ -61,6 +61,10 @@ class RISCVTargetMachine : public LLVMTargetMachine {
SMRange &SourceRange) const override;
void registerPassBuilderCallbacks(PassBuilder &PB) override;
};
+
+std::unique_ptr<ScheduleDAGMutation>
+createRISCVVectorMaskDAGMutation(const TargetRegisterInfo *TRI);
+
} // namespace llvm
#endif
diff --git a/llvm/lib/Target/RISCV/RISCVVectorMaskDAGMutation.cpp b/llvm/lib/Target/RISCV/RISCVVectorMaskDAGMutation.cpp
new file mode 100644
index 00000000000000..5bdfdd696dd627
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVVectorMaskDAGMutation.cpp
@@ -0,0 +1,102 @@
+//===- RISCVVectorMaskDAGMutation.cpp - RISCV Vector Mask DAGMutation -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// A schedule mutation that adds a dependency between mask-producing
+// instructions and masked instructions, so that we will extend the live
+// interval of the mask register.
+//
+//===----------------------------------------------------------------------===//
+
+#include "MCTargetDesc/RISCVMCTargetDesc.h"
+#include "RISCVTargetMachine.h"
+#include "llvm/CodeGen/MachineInstr.h"
+#include "llvm/CodeGen/ScheduleDAGInstrs.h"
+#include "llvm/CodeGen/ScheduleDAGMutation.h"
+
+#define DEBUG_TYPE "machine-scheduler"
+
+namespace llvm {
+
+static inline bool isVectorMaskProducer(const MachineInstr *MI) {
+ switch (RISCV::getRVVMCOpcode(MI->getOpcode())) {
+ // Vector Mask Instructions
+ case RISCV::VMAND_MM:
+ case RISCV::VMNAND_MM:
+ case RISCV::VMANDN_MM:
+ case RISCV::VMXOR_MM:
+ case RISCV::VMOR_MM:
+ case RISCV::VMNOR_MM:
+ case RISCV::VMORN_MM:
+ case RISCV::VMXNOR_MM:
+ case RISCV::VMSBF_M:
+ case RISCV::VMSIF_M:
+ case RISCV::VMSOF_M:
+ case RISCV::VIOTA_M:
+ // Vector Integer Compare Instructions
+ case RISCV::VMSEQ_VV:
+ case RISCV::VMSEQ_VX:
+ case RISCV::VMSEQ_VI:
+ case RISCV::VMSNE_VV:
+ case RISCV::VMSNE_VX:
+ case RISCV::VMSNE_VI:
+ case RISCV::VMSLT_VV:
+ case RISCV::VMSLT_VX:
+ case RISCV::VMSLTU_VV:
+ case RISCV::VMSLTU_VX:
+ case RISCV::VMSLE_VV:
+ case RISCV::VMSLE_VX:
+ case RISCV::VMSLE_VI:
+ case RISCV::VMSLEU_VV:
+ case RISCV::VMSLEU_VX:
+ case RISCV::VMSLEU_VI:
+ case RISCV::VMSGTU_VX:
+ case RISCV::VMSGTU_VI:
+ case RISCV::VMSGT_VX:
+ case RISCV::VMSGT_VI:
+ // Vector Floating-Point Compare Instructions
+ case RISCV::VMFEQ_VV:
+ case RISCV::VMFEQ_VF:
+ case RISCV::VMFNE_VV:
+ case RISCV::VMFNE_VF:
+ case RISCV::VMFLT_VV:
+ case RISCV::VMFLT_VF:
+ case RISCV::VMFLE_VV:
+ case RISCV::VMFLE_VF:
+ case RISCV::VMFGT_VF:
+ case RISCV::VMFGE_VF:
+ return true;
+ }
+ return false;
+}
+
+class RISCVVectorMaskDAGMutation : public ScheduleDAGMutation {
+private:
+ const TargetRegisterInfo *TRI;
+
+public:
+ RISCVVectorMaskDAGMutation(const TargetRegisterInfo *TRI) : TRI(TRI) {}
+
+ void apply(ScheduleDAGInstrs *DAG) override {
+ SUnit *NearestUseV0SU = nullptr;
+ for (SUnit &SU : DAG->SUnits) {
+ const MachineInstr *MI = SU.getInstr();
+ if (MI->findRegisterUseOperand(RISCV::V0, TRI))
+ NearestUseV0SU = &SU;
+
+ if (NearestUseV0SU && NearestUseV0SU != &SU && isVectorMaskProducer(MI))
+ DAG->addEdge(&SU, SDep(NearestUseV0SU, SDep::Artificial));
+ }
+ }
+};
+
+std::unique_ptr<ScheduleDAGMutation>
+createRISCVVectorMaskDAGMutation(const TargetRegisterInfo *TRI) {
+ return std::make_unique<RISCVVectorMaskDAGMutation>(TRI);
+}
+
+} // namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/rvv/constant-folding-crash.ll b/llvm/test/CodeGen/RISCV/rvv/constant-folding-crash.ll
index 7839b602706db1..113154c0f9855b 100644
--- a/llvm/test/CodeGen/RISCV/rvv/constant-folding-crash.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/constant-folding-crash.ll
@@ -19,19 +19,18 @@ define void @constant_folding_crash(ptr %v54, <4 x ptr> %lanes.a, <4 x ptr> %lan
; RV32-LABEL: constant_folding_crash:
; RV32: # %bb.0: # %entry
; RV32-NEXT: lw a0, 8(a0)
+; RV32-NEXT: vmv1r.v v10, v0
; RV32-NEXT: andi a0, a0, 1
; RV32-NEXT: seqz a0, a0
; RV32-NEXT: vsetivli zero, 4, e8, mf4, ta, ma
-; RV32-NEXT: vmv.v.x v10, a0
-; RV32-NEXT: vmsne.vi v10, v10, 0
-; RV32-NEXT: vmv1r.v v11, v0
-; RV32-NEXT: vmv1r.v v0, v10
+; RV32-NEXT: vmv.v.x v11, a0
+; RV32-NEXT: vmsne.vi v0, v11, 0
; RV32-NEXT: vsetvli zero, zero, e32, m1, ta, ma
; RV32-NEXT: vmerge.vvm v8, v9, v8, v0
; RV32-NEXT: vmv.x.s a0, v8
; RV32-NEXT: vsetvli zero, zero, e8, mf4, ta, ma
; RV32-NEXT: vmv.v.i v8, 0
-; RV32-NEXT: vmv1r.v v0, v11
+; RV32-NEXT: vmv1r.v v0, v10
; RV32-NEXT: vmerge.vim v8, v8, 1, v0
; RV32-NEXT: vrgather.vi v9, v8, 0
; RV32-NEXT: vmsne.vi v0, v9, 0
@@ -43,19 +42,18 @@ define void @constant_folding_crash(ptr %v54, <4 x ptr> %lanes.a, <4 x ptr> %lan
; RV64-LABEL: constant_folding_crash:
; RV64: # %bb.0: # %entry
; RV64-NEXT: ld a0, 8(a0)
+; RV64-NEXT: vmv1r.v v12, v0
; RV64-NEXT: andi a0, a0, 1
; RV64-NEXT: seqz a0, a0
; RV64-NEXT: vsetivli zero, 4, e8, mf4, ta, ma
-; RV64-NEXT: vmv.v.x v12, a0
-; RV64-NEXT: vmsne.vi v12, v12, 0
-; RV64-NEXT: vmv1r.v v13, v0
-; RV64-NEXT: vmv1r.v v0, v12
+; RV64-NEXT: vmv.v.x v13, a0
+; RV64-NEXT: vmsne.vi v0, v13, 0
; RV64-NEXT: vsetvli zero, zero, e64, m2, ta, ma
; RV64-NEXT: vmerge.vvm v8, v10, v8, v0
; RV64-NEXT: vmv.x.s a0, v8
; RV64-NEXT: vsetvli zero, zero, e8, mf4, ta, ma
; RV64-NEXT: vmv.v.i v8, 0
-; RV64-NEXT: vmv1r.v v0, v13
+; RV64-NEXT: vmv1r.v v0, v12
; RV64-NEXT: vmerge.vim v8, v8, 1, v0
; RV64-NEXT: vrgather.vi v9, v8, 0
; RV64-NEXT: vmsne.vi v0, v9, 0
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll
index 51eb63f5f92212..216300b23f4524 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll
@@ -52,11 +52,10 @@ define <2 x half> @vfmax_vv_v2f16_unmasked(<2 x half> %va, <2 x half> %vb, i32 z
; ZVFH: # %bb.0:
; ZVFH-NEXT: vsetvli zero, a0, e16, mf4, ta, ma
; ZVFH-NEXT: vmfeq.vv v0, v8, v8
-; ZVFH-NEXT: vmfeq.vv v10, v9, v9
-; ZVFH-NEXT: vmerge.vvm v11, v8, v9, v0
-; ZVFH-NEXT: vmv1r.v v0, v10
+; ZVFH-NEXT: vmerge.vvm v10, v8, v9, v0
+; ZVFH-NEXT: vmfeq.vv v0, v9, v9
; ZVFH-NEXT: vmerge.vvm v8, v9, v8, v0
-; ZVFH-NEXT: vfmax.vv v8, v8, v11
+; ZVFH-NEXT: vfmax.vv v8, v8, v10
; ZVFH-NEXT: ret
;
; ZVFHMIN-LABEL: vfmax_vv_v2f16_unmasked:
@@ -66,12 +65,11 @@ define <2 x half> @vfmax_vv_v2f16_unmasked(<2 x half> %va, <2 x half> %vb, i32 z
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, mf2, ta, ma
; ZVFHMIN-NEXT: vmfeq.vv v0, v10, v10
; ZVFHMIN-NEXT: vsetivli zero, 2, e16, mf4, ta, ma
-; ZVFHMIN-NEXT: vfwcvt.f.f.v v11, v9
+; ZVFHMIN-NEXT: vfwcvt.f.f.v v8, v9
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, mf2, ta, ma
-; ZVFHMIN-NEXT: vmfeq.vv v8, v11, v11
-; ZVFHMIN-NEXT: vmerge.vvm v9, v10, v11, v0
-; ZVFHMIN-NEXT: vmv1r.v v0, v8
-; ZVFHMIN-NEXT: vmerge.vvm v8, v11, v10, v0
+; ZVFHMIN-NEXT: vmerge.vvm v9, v10, v8, v0
+; ZVFHMIN-NEXT: vmfeq.vv v0, v8, v8
+; ZVFHMIN-NEXT: vmerge.vvm v8, v8, v10, v0
; ZVFHMIN-NEXT: vfmax.vv v9, v8, v9
; ZVFHMIN-NEXT: vsetivli zero, 2, e16, mf4, ta, ma
; ZVFHMIN-NEXT: vfncvt.f.f.w v8, v9
@@ -124,11 +122,10 @@ define <4 x half> @vfmax_vv_v4f16_unmasked(<4 x half> %va, <4 x half> %vb, i32 z
; ZVFH: # %bb.0:
; ZVFH-NEXT: vsetvli zero, a0, e16, mf2, ta, ma
; ZVFH-NEXT: vmfeq.vv v0, v8, v8
-; ZVFH-NEXT: vmfeq.vv v10, v9, v9
-; ZVFH-NEXT: vmerge.vvm v11, v8, v9, v0
-; ZVFH-NEXT: vmv1r.v v0, v10
+; ZVFH-NEXT: vmerge.vvm v10, v8, v9, v0
+; ZVFH-NEXT: vmfeq.vv v0, v9, v9
; ZVFH-NEXT: vmerge.vvm v8, v9, v8, v0
-; ZVFH-NEXT: vfmax.vv v8, v8, v11
+; ZVFH-NEXT: vfmax.vv v8, v8, v10
; ZVFH-NEXT: ret
;
; ZVFHMIN-LABEL: vfmax_vv_v4f16_unmasked:
@@ -138,12 +135,11 @@ define <4 x half> @vfmax_vv_v4f16_unmasked(<4 x half> %va, <4 x half> %vb, i32 z
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, m1, ta, ma
; ZVFHMIN-NEXT: vmfeq.vv v0, v10, v10
; ZVFHMIN-NEXT: vsetivli zero, 4, e16, mf2, ta, ma
-; ZVFHMIN-NEXT: vfwcvt.f.f.v v11, v9
+; ZVFHMIN-NEXT: vfwcvt.f.f.v v8, v9
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, m1, ta, ma
-; ZVFHMIN-NEXT: vmfeq.vv v8, v11, v11
-; ZVFHMIN-NEXT: vmerge.vvm v9, v10, v11, v0
-; ZVFHMIN-NEXT: vmv.v.v v0, v8
-; ZVFHMIN-NEXT: vmerge.vvm v8, v11, v10, v0
+; ZVFHMIN-NEXT: vmerge.vvm v9, v10, v8, v0
+; ZVFHMIN-NEXT: vmfeq.vv v0, v8, v8
+; ZVFHMIN-NEXT: vmerge.vvm v8, v8, v10, v0
; ZVFHMIN-NEXT: vfmax.vv v9, v8, v9
; ZVFHMIN-NEXT: vsetivli zero, 4, e16, mf2, ta, ma
; ZVFHMIN-NEXT: vfncvt.f.f.w v8, v9
@@ -198,11 +194,10 @@ define <8 x half> @vfmax_vv_v8f16_unmasked(<8 x half> %va, <8 x half> %vb, i32 z
; ZVFH: # %bb.0:
; ZVFH-NEXT: vsetvli zero, a0, e16, m1, ta, ma
; ZVFH-NEXT: vmfeq.vv v0, v8, v8
-; ZVFH-NEXT: vmfeq.vv v10, v9, v9
-; ZVFH-NEXT: vmerge.vvm v11, v8, v9, v0
-; ZVFH-NEXT: vmv.v.v v0, v10
+; ZVFH-NEXT: vmerge.vvm v10, v8, v9, v0
+; ZVFH-NEXT: vmfeq.vv v0, v9, v9
; ZVFH-NEXT: vmerge.vvm v8, v9, v8, v0
-; ZVFH-NEXT: vfmax.vv v8, v8, v11
+; ZVFH-NEXT: vfmax.vv v8, v8, v10
; ZVFH-NEXT: ret
;
; ZVFHMIN-LABEL: vfmax_vv_v8f16_unmasked:
@@ -214,11 +209,10 @@ define <8 x half> @vfmax_vv_v8f16_unmasked(<8 x half> %va, <8 x half> %vb, i32 z
; ZVFHMIN-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; ZVFHMIN-NEXT: vfwcvt.f.f.v v12, v9
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, m2, ta, ma
-; ZVFHMIN-NEXT: vmfeq.vv v8, v12, v12
-; ZVFHMIN-NEXT: vmerge.vvm v14, v10, v12, v0
-; ZVFHMIN-NEXT: vmv1r.v v0, v8
-; ZVFHMIN-NEXT: vmerge.vvm v8, v12, v10, v0
-; ZVFHMIN-NEXT: vfmax.vv v10, v8, v14
+; ZVFHMIN-NEXT: vmerge.vvm v8, v10, v12, v0
+; ZVFHMIN-NEXT: vmfeq.vv v0, v12, v12
+; ZVFHMIN-NEXT: vmerge.vvm v10, v12, v10, v0
+; ZVFHMIN-NEXT: vfmax.vv v10, v10, v8
; ZVFHMIN-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; ZVFHMIN-NEXT: vfncvt.f.f.w v8, v10
; ZVFHMIN-NEXT: ret
@@ -274,11 +268,10 @@ define <16 x half> @vfmax_vv_v16f16_unmasked(<16 x half> %va, <16 x half> %vb, i
; ZVFH: # %bb.0:
; ZVFH-NEXT: vsetvli zero, a0, e16, m2, ta, ma
; ZVFH-NEXT: vmfeq.vv v0, v8, v8
-; ZVFH-NEXT: vmfeq.vv v12, v10, v10
-; ZVFH-NEXT: vmerge.vvm v14, v8, v10, v0
-; ZVFH-NEXT: vmv1r.v v0, v12
+; ZVFH-NEXT: vmerge.vvm v12, v8, v10, v0
+; ZVFH-NEXT: vmfeq.vv v0, v10, v10
; ZVFH-NEXT: vmerge.vvm v8, v10, v8, v0
-; ZVFH-NEXT: vfmax.vv v8, v8, v14
+; ZVFH-NEXT: vfmax.vv v8, v8, v12
; ZVFH-NEXT: ret
;
; ZVFHMIN-LABEL: vfmax_vv_v16f16_unmasked:
@@ -290,11 +283,10 @@ define <16 x half> @vfmax_vv_v16f16_unmasked(<16 x half> %va, <16 x half> %vb, i
; ZVFHMIN-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; ZVFHMIN-NEXT: vfwcvt.f.f.v v16, v10
; ZVFHMIN-NEXT: vsetvli zero, a0, e32, m4, ta, ma
-; ZVFHMIN-NEXT: vmfeq.vv v8, v16, v16
-; ZVFHMIN-NEXT: vmerge.vvm v20, v12, v16, v0
-; ZVFHMIN-NEXT: vmv1r.v v0, v8
-; ZVFHMIN-NEXT: vmerge.vvm v8, v16, v12, v0
-; ZVFHMIN-NEXT: vfmax.vv v12, v8, v20
+; ZVFHMIN-NEXT: vmerge.vvm v8, v12, v16, v0
+; ZVFHMIN-NEXT: vmfeq.vv v0, v16, v16
+; ZVFHMIN-NEXT: vmerge.vvm v12, v16, v12, v0
+; ZVFHMIN-NEXT: vfmax.vv v12, v12, v8
; ZVFHMIN-NEXT: vsetivli zero, 16, e16, m2, ta, ma
; ZVFHMIN-NEXT: vfncvt.f.f.w v8, v12
; ZVFHMIN-NEXT: ret
@@ -326,11 +318,10 @@ define <2 x float> @vfmax_vv_v2f32_unmasked(<2 x float> %va, <2 x float> %vb, i3
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e32, mf2, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v10, v9, v9
-; CHECK-NEXT: vmerge.vvm v11, v8, v9, v0
-; CHECK-NEXT: vmv1r.v v0, v10
+; CHECK-NEXT: vmerge.vvm v10, v8, v9, v0
+; CHECK-NEXT: vmfeq.vv v0, v9, v9
; CHECK-NEXT: vmerge.vvm v8, v9, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v11
+; CHECK-NEXT: vfmax.vv v8, v8, v10
; CHECK-NEXT: ret
%v = call <2 x float> @llvm.vp.maximum.v2f32(<2 x float> %va, <2 x float> %vb, <2 x i1> splat (i1 true), i32 %evl)
ret <2 x float> %v
@@ -360,11 +351,10 @@ define <4 x float> @vfmax_vv_v4f32_unmasked(<4 x float> %va, <4 x float> %vb, i3
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e32, m1, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v10, v9, v9
-; CHECK-NEXT: vmerge.vvm v11, v8, v9, v0
-; CHECK-NEXT: vmv.v.v v0, v10
+; CHECK-NEXT: vmerge.vvm v10, v8, v9, v0
+; CHECK-NEXT: vmfeq.vv v0, v9, v9
; CHECK-NEXT: vmerge.vvm v8, v9, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v11
+; CHECK-NEXT: vfmax.vv v8, v8, v10
; CHECK-NEXT: ret
%v = call <4 x float> @llvm.vp.maximum.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> splat (i1 true), i32 %evl)
ret <4 x float> %v
@@ -396,11 +386,10 @@ define <8 x float> @vfmax_vv_v8f32_unmasked(<8 x float> %va, <8 x float> %vb, i3
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e32, m2, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v12, v10, v10
-; CHECK-NEXT: vmerge.vvm v14, v8, v10, v0
-; CHECK-NEXT: vmv1r.v v0, v12
+; CHECK-NEXT: vmerge.vvm v12, v8, v10, v0
+; CHECK-NEXT: vmfeq.vv v0, v10, v10
; CHECK-NEXT: vmerge.vvm v8, v10, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v14
+; CHECK-NEXT: vfmax.vv v8, v8, v12
; CHECK-NEXT: ret
%v = call <8 x float> @llvm.vp.maximum.v8f32(<8 x float> %va, <8 x float> %vb, <8 x i1> splat (i1 true), i32 %evl)
ret <8 x float> %v
@@ -432,11 +421,10 @@ define <16 x float> @vfmax_vv_v16f32_unmasked(<16 x float> %va, <16 x float> %vb
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e32, m4, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v16, v12, v12
-; CHECK-NEXT: vmerge.vvm v20, v8, v12, v0
-; CHECK-NEXT: vmv1r.v v0, v16
+; CHECK-NEXT: vmerge.vvm v16, v8, v12, v0
+; CHECK-NEXT: vmfeq.vv v0, v12, v12
; CHECK-NEXT: vmerge.vvm v8, v12, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v20
+; CHECK-NEXT: vfmax.vv v8, v8, v16
; CHECK-NEXT: ret
%v = call <16 x float> @llvm.vp.maximum.v16f32(<16 x float> %va, <16 x float> %vb, <16 x i1> splat (i1 true), i32 %evl)
ret <16 x float> %v
@@ -466,11 +454,10 @@ define <2 x double> @vfmax_vv_v2f64_unmasked(<2 x double> %va, <2 x double> %vb,
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e64, m1, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v10, v9, v9
-; CHECK-NEXT: vmerge.vvm v11, v8, v9, v0
-; CHECK-NEXT: vmv.v.v v0, v10
+; CHECK-NEXT: vmerge.vvm v10, v8, v9, v0
+; CHECK-NEXT: vmfeq.vv v0, v9, v9
; CHECK-NEXT: vmerge.vvm v8, v9, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v11
+; CHECK-NEXT: vfmax.vv v8, v8, v10
; CHECK-NEXT: ret
%v = call <2 x double> @llvm.vp.maximum.v2f64(<2 x double> %va, <2 x double> %vb, <2 x i1> splat (i1 true), i32 %evl)
ret <2 x double> %v
@@ -502,11 +489,10 @@ define <4 x double> @vfmax_vv_v4f64_unmasked(<4 x double> %va, <4 x double> %vb,
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e64, m2, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v12, v10, v10
-; CHECK-NEXT: vmerge.vvm v14, v8, v10, v0
-; CHECK-NEXT: vmv1r.v v0, v12
+; CHECK-NEXT: vmerge.vvm v12, v8, v10, v0
+; CHECK-NEXT: vmfeq.vv v0, v10, v10
; CHECK-NEXT: vmerge.vvm v8, v10, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v14
+; CHECK-NEXT: vfmax.vv v8, v8, v12
; CHECK-NEXT: ret
%v = call <4 x double> @llvm.vp.maximum.v4f64(<4 x double> %va, <4 x double> %vb, <4 x i1> splat (i1 true), i32 %evl)
ret <4 x double> %v
@@ -538,11 +524,10 @@ define <8 x double> @vfmax_vv_v8f64_unmasked(<8 x double> %va, <8 x double> %vb,
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e64, m4, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v16, v12, v12
-; CHECK-NEXT: vmerge.vvm v20, v8, v12, v0
-; CHECK-NEXT: vmv1r.v v0, v16
+; CHECK-NEXT: vmerge.vvm v16, v8, v12, v0
+; CHECK-NEXT: vmfeq.vv v0, v12, v12
; CHECK-NEXT: vmerge.vvm v8, v12, v8, v0
-; CHECK-NEXT: vfmax.vv v8, v8, v20
+; CHECK-NEXT: vfmax.vv v8, v8, v16
; CHECK-NEXT: ret
%v = call <8 x double> @llvm.vp.maximum.v8f64(<8 x double> %va, <8 x double> %vb, <8 x i1> splat (i1 true), i32 %evl)
ret <8 x double> %v
@@ -587,9 +572,8 @@ define <16 x double> @vfmax_vv_v16f64_unmasked(<16 x double> %va, <16 x double>
; CHECK: # %bb.0:
; CHECK-NEXT: vsetvli zero, a0, e64, m8, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v7, v16, v16
; CHECK-NEXT: vmerge.vvm v24, v8, v16, v0
-; CHECK-NEXT: vmv1r.v v0, v7
+; CHECK-NEXT: vmfeq.vv v0, v16, v16
; CHECK-NEXT: vmerge.vvm v8, v16, v8, v0
; CHECK-NEXT: vfmax.vv v8, v8, v24
; CHECK-NEXT: ret
@@ -710,21 +694,25 @@ define <32 x double> @vfmax_vv_v32f64_unmasked(<32 x double> %va, <32 x double>
; CHECK-NEXT: addi sp, sp, -16
; CHECK-NEXT: .cfi_def_cfa_offset 16
; CHECK-NEXT: csrr a1, vlenb
-; CHECK-NEXT: li a3, 24
+; CHECK-NEXT: li a3, 40
; CHECK-NEXT: mul a1, a1, a3
; CHECK-NEXT: sub sp, sp, a1
-; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x18, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 24 * vlenb
+; CHECK-NEXT: .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x28, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 40 * vlenb
; CHECK-NEXT: addi a1, a0, 128
; CHECK-NEXT: vsetivli zero, 16, e64, m8, ta, ma
; CHECK-NEXT: vle64.v v24, (a1)
; CHECK-NEXT: csrr a1, vlenb
-; CHECK-NEXT: slli a1, a1, 4
+; CHECK-NEXT: li a3, 24
+; CHECK-NEXT: mul a1, a1, a3
; CHECK-NEXT: add a1, sp, a1
; CHECK-NEXT: addi a1, a1, 16
; CHECK-NEXT: vs8r.v v24, (a1) # Unknown-size Folded Spill
; CHECK-NEXT: vle64.v v24, (a0)
; CHECK-NEXT: li a1, 16
-; CHECK-NEXT: addi a0, sp, 16
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: slli a0, a0, 5
+; CHECK-NEXT: add a0, sp, a0
+; CHECK-NEXT: addi a0, a0, 16
; CHECK-NEXT: vs8r.v v16, (a0) # Unknown-size Folded Spill
; CHECK-NEXT: mv a0, a2
; CHECK-NEXT: bltu a2, a1, .LBB25_2
@@ -733,52 +721,66 @@ define <32 x double> @vfmax_vv_v32f64_unmasked(<32 x double> %va, <32 x double>
; CHECK-NEXT: .LBB25_2:
; CHECK-NEXT: vsetvli zero, a0, e64, m8, ta, ma
; CHECK-NEXT: vmfeq.vv v0, v8, v8
-; CHECK-NEXT: vmfeq.vv v7, v24, v24
; CHECK-NEXT: vmv8r.v v16, v24
; CHECK-NEXT: vmerge.vvm v24, v8, v24, v0
-; CHECK-NEXT: csrr a0, vlenb
-; CHECK-NEXT: slli a0, a0, 3
-; CHECK-NEXT: add a0, sp, a0
-; CHECK-NEXT: addi a0, a0, 16
-; CHECK...
[truncated]
Force-pushed from b888e06 to 327d658.
; ZVFHMIN-NEXT: vmerge.vvm v8, v10, v9, v0
; ZVFHMIN-NEXT: vfmax.vv v9, v8, v11
; ZVFHMIN-NEXT: vmfeq.vv v0, v10, v10
; ZVFHMIN-NEXT: vsetvli zero, zero, e16, mf4, ta, ma
An example that causes more vtype toggles.
I want to make sure I understand what is going on in this patch.
Prior to this patch, we should have data dependency edges between an instruction that consumes the mask and the instruction that produces it. This patch on the other hand adds artificial edges between v0 mask producer instruction and the previous consumer of a v0 mask.
We might have a program like this:
livein: v0
v0_a = produce v0
v0_b = consume_and_produce v0
c = consume v0
Without this patch, we have two data dependency edges: (v0_b, v0_a) and (c, v0_b). With this patch, I think we are adding an artificial edge from (v0_a, v0_b).
Can you help me understand how this leads to making live intervals of mask registers shorter?
I think the idea is that the artificial edge constrains the scheduler from reordering the mask producer with the earlier mask use, thus preserving a non-overlapping live range if one already exists.

My question is why we need this. Shouldn't register pressure on the mask register class (which only has one register) achieve this result? It clearly doesn't, but why? Is there something else we could tweak here?

Note that this code doesn't appear to consider the case where the original schedule already has V0 live ranges which overlap. That's probably fixable with some one-use checks.
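Concretely, the ZVFH diff for vfmax_vv_v2f16_unmasked earlier in this patch illustrates the effect. Before, the second vmfeq is scheduled while v0 is still live, so its result lands in v10 and later needs a vmv1r.v copy into v0; after, the artificial edge keeps the producer below the earlier v0 use, so it can write v0 directly:

Before:
  vmfeq.vv v0, v8, v8
  vmfeq.vv v10, v9, v9
  vmerge.vvm v11, v8, v9, v0
  vmv1r.v v0, v10
  vmerge.vvm v8, v9, v8, v0
  vfmax.vv v8, v8, v11

After:
  vmfeq.vv v0, v8, v8
  vmerge.vvm v10, v8, v9, v0
  vmfeq.vv v0, v9, v9
  vmerge.vvm v8, v9, v8, v0
  vfmax.vv v8, v8, v10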
@michaelmaitland I think @preames has answered your question perfectly; the aim of this patch is to reduce the live range overlap (yeah, I should use this term) of mask registers.
I'll try to answer these questions with my rough understanding. There are some problems with the current implementation in LLVM.

Yes, as far as I can see, I have to admit that this patch seems to be a compromise.
I'm fine with this patch as long as we know what we're working around. :)
LGTM - Seems like a reasonable workaround for a real issue and a few days have gone by with no other suggestions made.
Force-pushed from f21427c to 75b04ec.
I will land this in a few days if there are no more comments.
LGTM
Here we add a scheduling mutation in pre-ra scheduling, which will add an artificial dependency edge between a mask producer and its previous nearest instruction that uses the V0 register. This prevents the overlap of live intervals of mask registers and as a consequence we can reduce some spills/moves. From the test changes, we can see some improvements and also some regressions (more vtype toggles). Partially fixes llvm#113489.
Force-pushed from 56712da to 055f429.
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/38/builds/1023. Here is the relevant piece of the build log for reference:
Here we add a scheduling mutation in pre-ra scheduling, which will add an artificial dependency edge between a mask producer and its previous nearest instruction that uses the V0 register.

This prevents the overlap of live intervals of mask registers and as a consequence we can reduce some spills/moves.

From the test changes, we can see some improvements and also some regressions (more vtype toggles).

Partially fixes #113489.
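For a self-contained reproducer of the pattern this mutation targets, the unmasked vp.maximum tests above boil down to IR like the following (the define is copied from the test file; the declare is added here for completeness). The splat-true mask is what makes the call "unmasked", and lowering materializes two vmfeq masks feeding vmerge, which is exactly where the V0 live range overlap arises:

declare <2 x float> @llvm.vp.maximum.v2f32(<2 x float>, <2 x float>, <2 x i1>, i32)

define <2 x float> @vfmax_vv_v2f32_unmasked(<2 x float> %va, <2 x float> %vb, i32 zeroext %evl) {
  %v = call <2 x float> @llvm.vp.maximum.v2f32(<2 x float> %va, <2 x float> %vb, <2 x i1> splat (i1 true), i32 %evl)
  ret <2 x float> %v
}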