[RISCV] Software guard direct calls in large code model #109377

jaidTw · 2024-09-20T06:01:01Z

Recently #70308 added support for large code model, with Zicfilp enabled, sementically direct calls are lowered to an indirect branch with a constant pool target. By default it does not use the x7 register and this is suboptimal because it introduces landing pad check, which is unnecessary since the constant pool is read-only and unlikely to be tampered.

This patch changes direct calls and tail calls to use x7 as the scratch register (a.k.a. software guarded branch in the CFI spec)

llvmbot · 2024-09-20T06:01:37Z

@llvm/pr-subscribers-backend-risc-v

Author: Jesse Huang (jaidTw)

Changes

Patch is 49.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109377.diff

6 Files Affected:

(modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp (+2-1)
(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+24-3)
(modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+5-2)
(modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+19-3)
(modified) llvm/test/CodeGen/RISCV/calls.ll (+395-115)
(modified) llvm/test/CodeGen/RISCV/tail-calls.ll (+270-2)

diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index 75323632dd5333..12ee6705fc4366 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -125,11 +125,12 @@ void RISCVMCCodeEmitter::expandFunctionCall(const MCInst &MI,
   MCRegister Ra;
   if (MI.getOpcode() == RISCV::PseudoTAIL) {
     Func = MI.getOperand(0);
-    Ra = RISCV::X6;
     // For Zicfilp, PseudoTAIL should be expanded to a software guarded branch.
     // It means to use t2(x7) as rs1 of JALR to expand PseudoTAIL.
     if (STI.hasFeature(RISCV::FeatureStdExtZicfilp))
       Ra = RISCV::X7;
+    else
+      Ra = RISCV::X6;
   } else if (MI.getOpcode() == RISCV::PseudoCALLReg) {
     Func = MI.getOperand(1);
     Ra = MI.getOperand(0).getReg();
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index c4458b14f36ece..b6d61aad639960 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -19696,11 +19696,14 @@ SDValue RISCVTargetLowering::LowerCall(CallLoweringInfo &CLI,
   // If the callee is a GlobalAddress/ExternalSymbol node, turn it into a
   // TargetGlobalAddress/TargetExternalSymbol node so that legalize won't
   // split it and then direct call can be matched by PseudoCALL.
+  bool CalleeIsLargeExternalSymbol = false;
   if (getTargetMachine().getCodeModel() == CodeModel::Large) {
     if (auto *S = dyn_cast<GlobalAddressSDNode>(Callee))
       Callee = getLargeGlobalAddress(S, DL, PtrVT, DAG);
-    else if (auto *S = dyn_cast<ExternalSymbolSDNode>(Callee))
+    else if (auto *S = dyn_cast<ExternalSymbolSDNode>(Callee)) {
       Callee = getLargeExternalSymbol(S, DL, PtrVT, DAG);
+      CalleeIsLargeExternalSymbol = true;
+    }
   } else if (GlobalAddressSDNode *S = dyn_cast<GlobalAddressSDNode>(Callee)) {
     const GlobalValue *GV = S->getGlobal();
     Callee = DAG.getTargetGlobalAddress(GV, DL, PtrVT, 0, RISCVII::MO_CALL);
@@ -19736,16 +19739,32 @@ SDValue RISCVTargetLowering::LowerCall(CallLoweringInfo &CLI,
   // Emit the call.
   SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
 
+  // Use software guarded branch for large code model non-indirect calls
+  // Tail call to external symbol will have a null CLI.CB and we need another
+  // way to determine the callsite type
+  bool NeedSWGuarded = false;
+  if (getTargetMachine().getCodeModel() == CodeModel::Large &&
+      Subtarget.hasStdExtZicfilp() &&
+      ((CLI.CB && !CLI.CB->isIndirectCall()) || CalleeIsLargeExternalSymbol))
+    NeedSWGuarded = true;
+
   if (IsTailCall) {
     MF.getFrameInfo().setHasTailCall();
-    SDValue Ret = DAG.getNode(RISCVISD::TAIL, DL, NodeTys, Ops);
+    SDValue Ret;
+    if (NeedSWGuarded)
+      Ret = DAG.getNode(RISCVISD::SW_GUARDED_TAIL, DL, NodeTys, Ops);
+    else
+      Ret = DAG.getNode(RISCVISD::TAIL, DL, NodeTys, Ops);
     if (CLI.CFIType)
       Ret.getNode()->setCFIType(CLI.CFIType->getZExtValue());
     DAG.addNoMergeSiteInfo(Ret.getNode(), CLI.NoMerge);
     return Ret;
   }
 
-  Chain = DAG.getNode(RISCVISD::CALL, DL, NodeTys, Ops);
+  if (NeedSWGuarded)
+    Chain = DAG.getNode(RISCVISD::SW_GUARDED_CALL, DL, NodeTys, Ops);
+  else
+    Chain = DAG.getNode(RISCVISD::CALL, DL, NodeTys, Ops);
   if (CLI.CFIType)
     Chain.getNode()->setCFIType(CLI.CFIType->getZExtValue());
   DAG.addNoMergeSiteInfo(Chain.getNode(), CLI.NoMerge);
@@ -20193,6 +20212,8 @@ const char *RISCVTargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(CZERO_EQZ)
   NODE_NAME_CASE(CZERO_NEZ)
   NODE_NAME_CASE(SW_GUARDED_BRIND)
+  NODE_NAME_CASE(SW_GUARDED_CALL)
+  NODE_NAME_CASE(SW_GUARDED_TAIL)
   NODE_NAME_CASE(TUPLE_INSERT)
   NODE_NAME_CASE(TUPLE_EXTRACT)
   NODE_NAME_CASE(SF_VC_XV_SE)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index ceb9d499002846..05581552ab6041 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -411,9 +411,12 @@ enum NodeType : unsigned {
   CZERO_EQZ, // vt.maskc for XVentanaCondOps.
   CZERO_NEZ, // vt.maskcn for XVentanaCondOps.
 
-  /// Software guarded BRIND node. Operand 0 is the chain operand and
-  /// operand 1 is the target address.
+  // Software guarded BRIND node. Operand 0 is the chain operand and
+  // operand 1 is the target address.
   SW_GUARDED_BRIND,
+  // Software guarded calls for large code model
+  SW_GUARDED_CALL,
+  SW_GUARDED_TAIL,
 
   SF_VC_XV_SE,
   SF_VC_IV_SE,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index fe5623e2920e22..c4e192e5b35790 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -57,6 +57,9 @@ def callseq_end   : SDNode<"ISD::CALLSEQ_END", SDT_CallSeqEnd,
 def riscv_call      : SDNode<"RISCVISD::CALL", SDT_RISCVCall,
                              [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
                               SDNPVariadic]>;
+def riscv_sw_guarded_call : SDNode<"RISCVISD::SW_GUARDED_CALL", SDT_RISCVCall,
+                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+                                    SDNPVariadic]>;
 def riscv_ret_glue  : SDNode<"RISCVISD::RET_GLUE", SDTNone,
                              [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
 def riscv_sret_glue : SDNode<"RISCVISD::SRET_GLUE", SDTNone,
@@ -69,6 +72,9 @@ def riscv_brcc      : SDNode<"RISCVISD::BR_CC", SDT_RISCVBrCC,
 def riscv_tail      : SDNode<"RISCVISD::TAIL", SDT_RISCVCall,
                              [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
                               SDNPVariadic]>;
+def riscv_sw_guarded_tail : SDNode<"RISCVISD::SW_GUARDED_TAIL", SDT_RISCVCall,
+                                   [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+                                    SDNPVariadic]>;
 def riscv_sw_guarded_brind : SDNode<"RISCVISD::SW_GUARDED_BRIND",
                                     SDTBrind, [SDNPHasChain]>;
 def riscv_sllw      : SDNode<"RISCVISD::SLLW", SDT_RISCVIntBinOpW>;
@@ -1555,10 +1561,15 @@ let Predicates = [NoStdExtZicfilp] in
 def PseudoCALLIndirect : Pseudo<(outs), (ins GPRJALR:$rs1),
                                 [(riscv_call GPRJALR:$rs1)]>,
                          PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
-let Predicates = [HasStdExtZicfilp] in
+let Predicates = [HasStdExtZicfilp] in {
 def PseudoCALLIndirectNonX7 : Pseudo<(outs), (ins GPRJALRNonX7:$rs1),
-                                     [(riscv_call GPRJALRNonX7:$rs1)]>,
+                                    [(riscv_call GPRJALRNonX7:$rs1)]>,
                               PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
+// For large code model, non-indirect calls could be software-guarded
+def PseudoCALLIndirectX7 : Pseudo<(outs), (ins GPRX7:$rs1),
+                                  [(riscv_sw_guarded_call GPRX7:$rs1)]>,
+                           PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
+}
 }
 
 let isBarrier = 1, isReturn = 1, isTerminator = 1 in
@@ -1579,10 +1590,15 @@ let Predicates = [NoStdExtZicfilp] in
 def PseudoTAILIndirect : Pseudo<(outs), (ins GPRTC:$rs1),
                                 [(riscv_tail GPRTC:$rs1)]>,
                          PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
-let Predicates = [HasStdExtZicfilp] in
+let Predicates = [HasStdExtZicfilp] in {
 def PseudoTAILIndirectNonX7 : Pseudo<(outs), (ins GPRTCNonX7:$rs1),
                                      [(riscv_tail GPRTCNonX7:$rs1)]>,
                               PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
+// For large code model, non-indirect calls could be software-guarded
+def PseudoTAILIndirectX7 : Pseudo<(outs), (ins GPRX7:$rs1),
+                                  [(riscv_sw_guarded_tail GPRX7:$rs1)]>,
+                           PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
+}
 }
 
 def : Pat<(riscv_tail (iPTR tglobaladdr:$dst)),
diff --git a/llvm/test/CodeGen/RISCV/calls.ll b/llvm/test/CodeGen/RISCV/calls.ll
index 598a026fb95526..48dfe453664a90 100644
--- a/llvm/test/CodeGen/RISCV/calls.ll
+++ b/llvm/test/CodeGen/RISCV/calls.ll
@@ -11,18 +11,29 @@
 ; RUN:   | FileCheck -check-prefix=RV64I-MEDIUM %s
 ; RUN: llc -code-model=large -mtriple=riscv64 -verify-machineinstrs < %s \
 ; RUN:   | FileCheck -check-prefix=RV64I-LARGE %s
+; RUN: llc -code-model=large -mtriple=riscv64 -mattr=experimental-zicfilp -verify-machineinstrs < %s \
+; RUN:   | FileCheck -check-prefix=RV64I-LARGE-ZICFILP %s
 
 declare i32 @external_function(i32)
 
 define i32 @test_call_external(i32 %a) nounwind {
-; CHECK-LABEL: test_call_external:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT:    call external_function
-; CHECK-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    ret
+; RV32I-LABEL: test_call_external:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    call external_function
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: test_call_external:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi sp, sp, -16
+; RV32I-PIC-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT:    call external_function
+; RV32I-PIC-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT:    addi sp, sp, 16
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: test_call_external:
 ; RV64I:       # %bb.0:
@@ -62,6 +73,19 @@ define i32 @test_call_external(i32 %a) nounwind {
 ; RV64I-LARGE-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-LARGE-NEXT:    addi sp, sp, 16
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_external:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT:  .Lpcrel_hi0:
+; RV64I-LARGE-ZICFILP-NEXT:    auipc a1, %pcrel_hi(.LCPI0_0)
+; RV64I-LARGE-ZICFILP-NEXT:    ld t2, %pcrel_lo(.Lpcrel_hi0)(a1)
+; RV64I-LARGE-ZICFILP-NEXT:    jalr t2
+; RV64I-LARGE-ZICFILP-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT:    ret
   %1 = call i32 @external_function(i32 %a)
   ret i32 %1
 }
@@ -69,14 +93,23 @@ define i32 @test_call_external(i32 %a) nounwind {
 declare dso_local i32 @dso_local_function(i32)
 
 define i32 @test_call_dso_local(i32 %a) nounwind {
-; CHECK-LABEL: test_call_dso_local:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT:    call dso_local_function
-; CHECK-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    ret
+; RV32I-LABEL: test_call_dso_local:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    call dso_local_function
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: test_call_dso_local:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi sp, sp, -16
+; RV32I-PIC-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT:    call dso_local_function
+; RV32I-PIC-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT:    addi sp, sp, 16
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: test_call_dso_local:
 ; RV64I:       # %bb.0:
@@ -116,15 +149,33 @@ define i32 @test_call_dso_local(i32 %a) nounwind {
 ; RV64I-LARGE-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-LARGE-NEXT:    addi sp, sp, 16
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_dso_local:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT:  .Lpcrel_hi1:
+; RV64I-LARGE-ZICFILP-NEXT:    auipc a1, %pcrel_hi(.LCPI1_0)
+; RV64I-LARGE-ZICFILP-NEXT:    ld t2, %pcrel_lo(.Lpcrel_hi1)(a1)
+; RV64I-LARGE-ZICFILP-NEXT:    jalr t2
+; RV64I-LARGE-ZICFILP-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT:    ret
   %1 = call i32 @dso_local_function(i32 %a)
   ret i32 %1
 }
 
 define i32 @defined_function(i32 %a) nounwind {
-; CHECK-LABEL: defined_function:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi a0, a0, 1
-; CHECK-NEXT:    ret
+; RV32I-LABEL: defined_function:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi a0, a0, 1
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: defined_function:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi a0, a0, 1
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: defined_function:
 ; RV64I:       # %bb.0:
@@ -145,19 +196,34 @@ define i32 @defined_function(i32 %a) nounwind {
 ; RV64I-LARGE:       # %bb.0:
 ; RV64I-LARGE-NEXT:    addiw a0, a0, 1
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: defined_function:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addiw a0, a0, 1
+; RV64I-LARGE-ZICFILP-NEXT:    ret
   %1 = add i32 %a, 1
   ret i32 %1
 }
 
 define i32 @test_call_defined(i32 %a) nounwind {
-; CHECK-LABEL: test_call_defined:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT:    call defined_function
-; CHECK-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    ret
+; RV32I-LABEL: test_call_defined:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    call defined_function
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: test_call_defined:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi sp, sp, -16
+; RV32I-PIC-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT:    call defined_function
+; RV32I-PIC-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT:    addi sp, sp, 16
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: test_call_defined:
 ; RV64I:       # %bb.0:
@@ -197,21 +263,45 @@ define i32 @test_call_defined(i32 %a) nounwind {
 ; RV64I-LARGE-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-LARGE-NEXT:    addi sp, sp, 16
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_defined:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT:  .Lpcrel_hi2:
+; RV64I-LARGE-ZICFILP-NEXT:    auipc a1, %pcrel_hi(.LCPI3_0)
+; RV64I-LARGE-ZICFILP-NEXT:    ld t2, %pcrel_lo(.Lpcrel_hi2)(a1)
+; RV64I-LARGE-ZICFILP-NEXT:    jalr t2
+; RV64I-LARGE-ZICFILP-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT:    ret
   %1 = call i32 @defined_function(i32 %a)
   ret i32 %1
 }
 
 define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
-; CHECK-LABEL: test_call_indirect:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT:    mv a2, a0
-; CHECK-NEXT:    mv a0, a1
-; CHECK-NEXT:    jalr a2
-; CHECK-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    ret
+; RV32I-LABEL: test_call_indirect:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    mv a2, a0
+; RV32I-NEXT:    mv a0, a1
+; RV32I-NEXT:    jalr a2
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: test_call_indirect:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi sp, sp, -16
+; RV32I-PIC-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT:    mv a2, a0
+; RV32I-PIC-NEXT:    mv a0, a1
+; RV32I-PIC-NEXT:    jalr a2
+; RV32I-PIC-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT:    addi sp, sp, 16
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: test_call_indirect:
 ; RV64I:       # %bb.0:
@@ -256,6 +346,18 @@ define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
 ; RV64I-LARGE-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-LARGE-NEXT:    addi sp, sp, 16
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_indirect:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT:    mv a2, a0
+; RV64I-LARGE-ZICFILP-NEXT:    mv a0, a1
+; RV64I-LARGE-ZICFILP-NEXT:    jalr a2
+; RV64I-LARGE-ZICFILP-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT:    ret
   %1 = call i32 %a(i32 %b)
   ret i32 %1
 }
@@ -263,22 +365,39 @@ define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
 ; Make sure we don't use t0 as the source for jalr as that is a hint to pop the
 ; return address stack on some microarchitectures.
 define i32 @test_call_indirect_no_t0(ptr %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f, i32 %g, i32 %h) nounwind {
-; CHECK-LABEL: test_call_indirect_no_t0:
-; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT:    mv t1, a0
-; CHECK-NEXT:    mv a0, a1
-; CHECK-NEXT:    mv a1, a2
-; CHECK-NEXT:    mv a2, a3
-; CHECK-NEXT:    mv a3, a4
-; CHECK-NEXT:    mv a4, a5
-; CHECK-NEXT:    mv a5, a6
-; CHECK-NEXT:    mv a6, a7
-; CHECK-NEXT:    jalr t1
-; CHECK-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    ret
+; RV32I-LABEL: test_call_indirect_no_t0:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -16
+; RV32I-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT:    mv t1, a0
+; RV32I-NEXT:    mv a0, a1
+; RV32I-NEXT:    mv a1, a2
+; RV32I-NEXT:    mv a2, a3
+; RV32I-NEXT:    mv a3, a4
+; RV32I-NEXT:    mv a4, a5
+; RV32I-NEXT:    mv a5, a6
+; RV32I-NEXT:    mv a6, a7
+; RV32I-NEXT:    jalr t1
+; RV32I-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT:    addi sp, sp, 16
+; RV32I-NEXT:    ret
+;
+; RV32I-PIC-LABEL: test_call_indirect_no_t0:
+; RV32I-PIC:       # %bb.0:
+; RV32I-PIC-NEXT:    addi sp, sp, -16
+; RV32I-PIC-NEXT:    sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT:    mv t1, a0
+; RV32I-PIC-NEXT:    mv a0, a1
+; RV32I-PIC-NEXT:    mv a1, a2
+; RV32I-PIC-NEXT:    mv a2, a3
+; RV32I-PIC-NEXT:    mv a3, a4
+; RV32I-PIC-NEXT:    mv a4, a5
+; RV32I-PIC-NEXT:    mv a5, a6
+; RV32I-PIC-NEXT:    mv a6, a7
+; RV32I-PIC-NEXT:    jalr t1
+; RV32I-PIC-NEXT:    lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT:    addi sp, sp, 16
+; RV32I-PIC-NEXT:    ret
 ;
 ; RV64I-LABEL: test_call_indirect_no_t0:
 ; RV64I:       # %bb.0:
@@ -347,6 +466,24 @@ define i32 @test_call_indirect_no_t0(ptr %a, i32 %b, i32 %c, i32 %d, i32 %e, i32
 ; RV64I-LARGE-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
 ; RV64I-LARGE-NEXT:    addi sp, sp, 16
 ; RV64I-LARGE-NEXT:    ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_indirect_no_t0:
+; RV64I-LARGE-ZICFILP:       # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT:    lpad 0
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT:    sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT:    mv t1, a0
+; RV64I-LARGE-ZICFILP-NEXT:    mv a0, a1
+; RV64I-LARGE-ZICFILP-NEXT:    mv a1, a2
+; RV64I-LARGE-ZICFILP-NEXT:    mv a2, a3
+; RV64I-LARGE-ZICFILP-NEXT:    mv a3, a4
+; RV64I-LARGE-ZICFILP-NEXT:    mv a4, a5
+; RV64I-LARGE-ZICFILP-NEXT:    mv a5, a6
+; RV64I-LARGE-ZICFILP-NEXT:    mv a6, a7
+; RV64I-LARGE-ZICFILP-NEXT:    jalr t1
+; RV64I-LARGE-ZICFILP-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT:    addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-...
[truncated]

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

llvm/test/CodeGen/RISCV/calls.ll

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp

topperc

LGTM

Support for large code model are added recently, and sementically direct calls are lowered to an indirect branch with a constant pool target. By default it does not use the x7 register and this is suboptimal with Zicfilp because it introduces landing pad check, which is unnecessary since the constant pool is read-only and unlikely to be tampered. Change direct calls and tail calls to use x7 as the scratch register (a.k.a. software guarded branch in the CFI spec)

llvmbot added the backend:RISC-V label Sep 20, 2024

jaidTw requested review from topperc and yetingk September 20, 2024 06:07

jaidTw force-pushed the jaidtw/riscv-large-sw-guarded branch 2 times, most recently from a5aaed1 to 280efae Compare September 20, 2024 06:15

topperc reviewed Sep 20, 2024

View reviewed changes

llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved

llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved

jrtc27 reviewed Sep 20, 2024

View reviewed changes

llvm/test/CodeGen/RISCV/calls.ll Outdated Show resolved Hide resolved

jrtc27 reviewed Sep 20, 2024

View reviewed changes

llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp Outdated Show resolved Hide resolved

jaidTw added 3 commits September 22, 2024 13:05

[RISCV] Software-guard direct calls in large code model

8363da5

!fixup address comments

6b15a3f

!fixup update test

d0a1734

jaidTw force-pushed the jaidtw/riscv-large-sw-guarded branch from fa2801c to d0a1734 Compare September 22, 2024 05:05

topperc approved these changes Sep 24, 2024

View reviewed changes

jaidTw merged commit 9bdcf7a into llvm:main Sep 27, 2024
8 checks passed

jaidTw deleted the jaidtw/riscv-large-sw-guarded branch September 27, 2024 05:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RISCV] Software guard direct calls in large code model #109377

[RISCV] Software guard direct calls in large code model #109377

jaidTw commented Sep 20, 2024 •

edited

Loading

llvmbot commented Sep 20, 2024

topperc left a comment

[RISCV] Software guard direct calls in large code model #109377

[RISCV] Software guard direct calls in large code model #109377

Conversation

jaidTw commented Sep 20, 2024 • edited Loading

llvmbot commented Sep 20, 2024

topperc left a comment

Choose a reason for hiding this comment

jaidTw commented Sep 20, 2024 •

edited

Loading