[RISCV] Software guard direct calls in large code model #109377
Merged
@llvm/pr-subscribers-backend-risc-v
Author: Jesse Huang (jaidTw)
Changes
Patch is 49.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109377.diff
6 Files Affected:
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index 75323632dd5333..12ee6705fc4366 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -125,11 +125,12 @@ void RISCVMCCodeEmitter::expandFunctionCall(const MCInst &MI,
MCRegister Ra;
if (MI.getOpcode() == RISCV::PseudoTAIL) {
Func = MI.getOperand(0);
- Ra = RISCV::X6;
// For Zicfilp, PseudoTAIL should be expanded to a software guarded branch.
// It means to use t2(x7) as rs1 of JALR to expand PseudoTAIL.
if (STI.hasFeature(RISCV::FeatureStdExtZicfilp))
Ra = RISCV::X7;
+ else
+ Ra = RISCV::X6;
} else if (MI.getOpcode() == RISCV::PseudoCALLReg) {
Func = MI.getOperand(1);
Ra = MI.getOperand(0).getReg();
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index c4458b14f36ece..b6d61aad639960 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -19696,11 +19696,14 @@ SDValue RISCVTargetLowering::LowerCall(CallLoweringInfo &CLI,
// If the callee is a GlobalAddress/ExternalSymbol node, turn it into a
// TargetGlobalAddress/TargetExternalSymbol node so that legalize won't
// split it and then direct call can be matched by PseudoCALL.
+ bool CalleeIsLargeExternalSymbol = false;
if (getTargetMachine().getCodeModel() == CodeModel::Large) {
if (auto *S = dyn_cast<GlobalAddressSDNode>(Callee))
Callee = getLargeGlobalAddress(S, DL, PtrVT, DAG);
- else if (auto *S = dyn_cast<ExternalSymbolSDNode>(Callee))
+ else if (auto *S = dyn_cast<ExternalSymbolSDNode>(Callee)) {
Callee = getLargeExternalSymbol(S, DL, PtrVT, DAG);
+ CalleeIsLargeExternalSymbol = true;
+ }
} else if (GlobalAddressSDNode *S = dyn_cast<GlobalAddressSDNode>(Callee)) {
const GlobalValue *GV = S->getGlobal();
Callee = DAG.getTargetGlobalAddress(GV, DL, PtrVT, 0, RISCVII::MO_CALL);
@@ -19736,16 +19739,32 @@ SDValue RISCVTargetLowering::LowerCall(CallLoweringInfo &CLI,
// Emit the call.
SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Glue);
+ // Use software guarded branch for large code model non-indirect calls
+ // Tail call to external symbol will have a null CLI.CB and we need another
+ // way to determine the callsite type
+ bool NeedSWGuarded = false;
+ if (getTargetMachine().getCodeModel() == CodeModel::Large &&
+ Subtarget.hasStdExtZicfilp() &&
+ ((CLI.CB && !CLI.CB->isIndirectCall()) || CalleeIsLargeExternalSymbol))
+ NeedSWGuarded = true;
+
if (IsTailCall) {
MF.getFrameInfo().setHasTailCall();
- SDValue Ret = DAG.getNode(RISCVISD::TAIL, DL, NodeTys, Ops);
+ SDValue Ret;
+ if (NeedSWGuarded)
+ Ret = DAG.getNode(RISCVISD::SW_GUARDED_TAIL, DL, NodeTys, Ops);
+ else
+ Ret = DAG.getNode(RISCVISD::TAIL, DL, NodeTys, Ops);
if (CLI.CFIType)
Ret.getNode()->setCFIType(CLI.CFIType->getZExtValue());
DAG.addNoMergeSiteInfo(Ret.getNode(), CLI.NoMerge);
return Ret;
}
- Chain = DAG.getNode(RISCVISD::CALL, DL, NodeTys, Ops);
+ if (NeedSWGuarded)
+ Chain = DAG.getNode(RISCVISD::SW_GUARDED_CALL, DL, NodeTys, Ops);
+ else
+ Chain = DAG.getNode(RISCVISD::CALL, DL, NodeTys, Ops);
if (CLI.CFIType)
Chain.getNode()->setCFIType(CLI.CFIType->getZExtValue());
DAG.addNoMergeSiteInfo(Chain.getNode(), CLI.NoMerge);
@@ -20193,6 +20212,8 @@ const char *RISCVTargetLowering::getTargetNodeName(unsigned Opcode) const {
NODE_NAME_CASE(CZERO_EQZ)
NODE_NAME_CASE(CZERO_NEZ)
NODE_NAME_CASE(SW_GUARDED_BRIND)
+ NODE_NAME_CASE(SW_GUARDED_CALL)
+ NODE_NAME_CASE(SW_GUARDED_TAIL)
NODE_NAME_CASE(TUPLE_INSERT)
NODE_NAME_CASE(TUPLE_EXTRACT)
NODE_NAME_CASE(SF_VC_XV_SE)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index ceb9d499002846..05581552ab6041 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -411,9 +411,12 @@ enum NodeType : unsigned {
CZERO_EQZ, // vt.maskc for XVentanaCondOps.
CZERO_NEZ, // vt.maskcn for XVentanaCondOps.
- /// Software guarded BRIND node. Operand 0 is the chain operand and
- /// operand 1 is the target address.
+ // Software guarded BRIND node. Operand 0 is the chain operand and
+ // operand 1 is the target address.
SW_GUARDED_BRIND,
+ // Software guarded calls for large code model
+ SW_GUARDED_CALL,
+ SW_GUARDED_TAIL,
SF_VC_XV_SE,
SF_VC_IV_SE,
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index fe5623e2920e22..c4e192e5b35790 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -57,6 +57,9 @@ def callseq_end : SDNode<"ISD::CALLSEQ_END", SDT_CallSeqEnd,
def riscv_call : SDNode<"RISCVISD::CALL", SDT_RISCVCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;
+def riscv_sw_guarded_call : SDNode<"RISCVISD::SW_GUARDED_CALL", SDT_RISCVCall,
+ [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+ SDNPVariadic]>;
def riscv_ret_glue : SDNode<"RISCVISD::RET_GLUE", SDTNone,
[SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
def riscv_sret_glue : SDNode<"RISCVISD::SRET_GLUE", SDTNone,
@@ -69,6 +72,9 @@ def riscv_brcc : SDNode<"RISCVISD::BR_CC", SDT_RISCVBrCC,
def riscv_tail : SDNode<"RISCVISD::TAIL", SDT_RISCVCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;
+def riscv_sw_guarded_tail : SDNode<"RISCVISD::SW_GUARDED_TAIL", SDT_RISCVCall,
+ [SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
+ SDNPVariadic]>;
def riscv_sw_guarded_brind : SDNode<"RISCVISD::SW_GUARDED_BRIND",
SDTBrind, [SDNPHasChain]>;
def riscv_sllw : SDNode<"RISCVISD::SLLW", SDT_RISCVIntBinOpW>;
@@ -1555,10 +1561,15 @@ let Predicates = [NoStdExtZicfilp] in
def PseudoCALLIndirect : Pseudo<(outs), (ins GPRJALR:$rs1),
[(riscv_call GPRJALR:$rs1)]>,
PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
-let Predicates = [HasStdExtZicfilp] in
+let Predicates = [HasStdExtZicfilp] in {
def PseudoCALLIndirectNonX7 : Pseudo<(outs), (ins GPRJALRNonX7:$rs1),
- [(riscv_call GPRJALRNonX7:$rs1)]>,
+ [(riscv_call GPRJALRNonX7:$rs1)]>,
PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
+// For large code model, non-indirect calls could be software-guarded
+def PseudoCALLIndirectX7 : Pseudo<(outs), (ins GPRX7:$rs1),
+ [(riscv_sw_guarded_call GPRX7:$rs1)]>,
+ PseudoInstExpansion<(JALR X1, GPR:$rs1, 0)>;
+}
}
let isBarrier = 1, isReturn = 1, isTerminator = 1 in
@@ -1579,10 +1590,15 @@ let Predicates = [NoStdExtZicfilp] in
def PseudoTAILIndirect : Pseudo<(outs), (ins GPRTC:$rs1),
[(riscv_tail GPRTC:$rs1)]>,
PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
-let Predicates = [HasStdExtZicfilp] in
+let Predicates = [HasStdExtZicfilp] in {
def PseudoTAILIndirectNonX7 : Pseudo<(outs), (ins GPRTCNonX7:$rs1),
[(riscv_tail GPRTCNonX7:$rs1)]>,
PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
+// For large code model, non-indirect calls could be software-guarded
+def PseudoTAILIndirectX7 : Pseudo<(outs), (ins GPRX7:$rs1),
+ [(riscv_sw_guarded_tail GPRX7:$rs1)]>,
+ PseudoInstExpansion<(JALR X0, GPR:$rs1, 0)>;
+}
}
def : Pat<(riscv_tail (iPTR tglobaladdr:$dst)),
diff --git a/llvm/test/CodeGen/RISCV/calls.ll b/llvm/test/CodeGen/RISCV/calls.ll
index 598a026fb95526..48dfe453664a90 100644
--- a/llvm/test/CodeGen/RISCV/calls.ll
+++ b/llvm/test/CodeGen/RISCV/calls.ll
@@ -11,18 +11,29 @@
; RUN: | FileCheck -check-prefix=RV64I-MEDIUM %s
; RUN: llc -code-model=large -mtriple=riscv64 -verify-machineinstrs < %s \
; RUN: | FileCheck -check-prefix=RV64I-LARGE %s
+; RUN: llc -code-model=large -mtriple=riscv64 -mattr=experimental-zicfilp -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefix=RV64I-LARGE-ZICFILP %s
declare i32 @external_function(i32)
define i32 @test_call_external(i32 %a) nounwind {
-; CHECK-LABEL: test_call_external:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi sp, sp, -16
-; CHECK-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT: call external_function
-; CHECK-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT: addi sp, sp, 16
-; CHECK-NEXT: ret
+; RV32I-LABEL: test_call_external:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: call external_function
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: test_call_external:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi sp, sp, -16
+; RV32I-PIC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT: call external_function
+; RV32I-PIC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT: addi sp, sp, 16
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: test_call_external:
; RV64I: # %bb.0:
@@ -62,6 +73,19 @@ define i32 @test_call_external(i32 %a) nounwind {
; RV64I-LARGE-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64I-LARGE-NEXT: addi sp, sp, 16
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_external:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT: .Lpcrel_hi0:
+; RV64I-LARGE-ZICFILP-NEXT: auipc a1, %pcrel_hi(.LCPI0_0)
+; RV64I-LARGE-ZICFILP-NEXT: ld t2, %pcrel_lo(.Lpcrel_hi0)(a1)
+; RV64I-LARGE-ZICFILP-NEXT: jalr t2
+; RV64I-LARGE-ZICFILP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT: ret
%1 = call i32 @external_function(i32 %a)
ret i32 %1
}
@@ -69,14 +93,23 @@ define i32 @test_call_external(i32 %a) nounwind {
declare dso_local i32 @dso_local_function(i32)
define i32 @test_call_dso_local(i32 %a) nounwind {
-; CHECK-LABEL: test_call_dso_local:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi sp, sp, -16
-; CHECK-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT: call dso_local_function
-; CHECK-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT: addi sp, sp, 16
-; CHECK-NEXT: ret
+; RV32I-LABEL: test_call_dso_local:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: call dso_local_function
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: test_call_dso_local:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi sp, sp, -16
+; RV32I-PIC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT: call dso_local_function
+; RV32I-PIC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT: addi sp, sp, 16
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: test_call_dso_local:
; RV64I: # %bb.0:
@@ -116,15 +149,33 @@ define i32 @test_call_dso_local(i32 %a) nounwind {
; RV64I-LARGE-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64I-LARGE-NEXT: addi sp, sp, 16
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_dso_local:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT: .Lpcrel_hi1:
+; RV64I-LARGE-ZICFILP-NEXT: auipc a1, %pcrel_hi(.LCPI1_0)
+; RV64I-LARGE-ZICFILP-NEXT: ld t2, %pcrel_lo(.Lpcrel_hi1)(a1)
+; RV64I-LARGE-ZICFILP-NEXT: jalr t2
+; RV64I-LARGE-ZICFILP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT: ret
%1 = call i32 @dso_local_function(i32 %a)
ret i32 %1
}
define i32 @defined_function(i32 %a) nounwind {
-; CHECK-LABEL: defined_function:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi a0, a0, 1
-; CHECK-NEXT: ret
+; RV32I-LABEL: defined_function:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi a0, a0, 1
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: defined_function:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi a0, a0, 1
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: defined_function:
; RV64I: # %bb.0:
@@ -145,19 +196,34 @@ define i32 @defined_function(i32 %a) nounwind {
; RV64I-LARGE: # %bb.0:
; RV64I-LARGE-NEXT: addiw a0, a0, 1
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: defined_function:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addiw a0, a0, 1
+; RV64I-LARGE-ZICFILP-NEXT: ret
%1 = add i32 %a, 1
ret i32 %1
}
define i32 @test_call_defined(i32 %a) nounwind {
-; CHECK-LABEL: test_call_defined:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi sp, sp, -16
-; CHECK-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT: call defined_function
-; CHECK-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT: addi sp, sp, 16
-; CHECK-NEXT: ret
+; RV32I-LABEL: test_call_defined:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: call defined_function
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: test_call_defined:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi sp, sp, -16
+; RV32I-PIC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT: call defined_function
+; RV32I-PIC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT: addi sp, sp, 16
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: test_call_defined:
; RV64I: # %bb.0:
@@ -197,21 +263,45 @@ define i32 @test_call_defined(i32 %a) nounwind {
; RV64I-LARGE-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64I-LARGE-NEXT: addi sp, sp, 16
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_defined:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT: .Lpcrel_hi2:
+; RV64I-LARGE-ZICFILP-NEXT: auipc a1, %pcrel_hi(.LCPI3_0)
+; RV64I-LARGE-ZICFILP-NEXT: ld t2, %pcrel_lo(.Lpcrel_hi2)(a1)
+; RV64I-LARGE-ZICFILP-NEXT: jalr t2
+; RV64I-LARGE-ZICFILP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT: ret
%1 = call i32 @defined_function(i32 %a)
ret i32 %1
}
define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
-; CHECK-LABEL: test_call_indirect:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi sp, sp, -16
-; CHECK-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT: mv a2, a0
-; CHECK-NEXT: mv a0, a1
-; CHECK-NEXT: jalr a2
-; CHECK-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT: addi sp, sp, 16
-; CHECK-NEXT: ret
+; RV32I-LABEL: test_call_indirect:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: mv a2, a0
+; RV32I-NEXT: mv a0, a1
+; RV32I-NEXT: jalr a2
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: test_call_indirect:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi sp, sp, -16
+; RV32I-PIC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT: mv a2, a0
+; RV32I-PIC-NEXT: mv a0, a1
+; RV32I-PIC-NEXT: jalr a2
+; RV32I-PIC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT: addi sp, sp, 16
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: test_call_indirect:
; RV64I: # %bb.0:
@@ -256,6 +346,18 @@ define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
; RV64I-LARGE-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64I-LARGE-NEXT: addi sp, sp, 16
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_indirect:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT: mv a2, a0
+; RV64I-LARGE-ZICFILP-NEXT: mv a0, a1
+; RV64I-LARGE-ZICFILP-NEXT: jalr a2
+; RV64I-LARGE-ZICFILP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-NEXT: ret
%1 = call i32 %a(i32 %b)
ret i32 %1
}
@@ -263,22 +365,39 @@ define i32 @test_call_indirect(ptr %a, i32 %b) nounwind {
; Make sure we don't use t0 as the source for jalr as that is a hint to pop the
; return address stack on some microarchitectures.
define i32 @test_call_indirect_no_t0(ptr %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f, i32 %g, i32 %h) nounwind {
-; CHECK-LABEL: test_call_indirect_no_t0:
-; CHECK: # %bb.0:
-; CHECK-NEXT: addi sp, sp, -16
-; CHECK-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-NEXT: mv t1, a0
-; CHECK-NEXT: mv a0, a1
-; CHECK-NEXT: mv a1, a2
-; CHECK-NEXT: mv a2, a3
-; CHECK-NEXT: mv a3, a4
-; CHECK-NEXT: mv a4, a5
-; CHECK-NEXT: mv a5, a6
-; CHECK-NEXT: mv a6, a7
-; CHECK-NEXT: jalr t1
-; CHECK-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-NEXT: addi sp, sp, 16
-; CHECK-NEXT: ret
+; RV32I-LABEL: test_call_indirect_no_t0:
+; RV32I: # %bb.0:
+; RV32I-NEXT: addi sp, sp, -16
+; RV32I-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-NEXT: mv t1, a0
+; RV32I-NEXT: mv a0, a1
+; RV32I-NEXT: mv a1, a2
+; RV32I-NEXT: mv a2, a3
+; RV32I-NEXT: mv a3, a4
+; RV32I-NEXT: mv a4, a5
+; RV32I-NEXT: mv a5, a6
+; RV32I-NEXT: mv a6, a7
+; RV32I-NEXT: jalr t1
+; RV32I-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-NEXT: addi sp, sp, 16
+; RV32I-NEXT: ret
+;
+; RV32I-PIC-LABEL: test_call_indirect_no_t0:
+; RV32I-PIC: # %bb.0:
+; RV32I-PIC-NEXT: addi sp, sp, -16
+; RV32I-PIC-NEXT: sw ra, 12(sp) # 4-byte Folded Spill
+; RV32I-PIC-NEXT: mv t1, a0
+; RV32I-PIC-NEXT: mv a0, a1
+; RV32I-PIC-NEXT: mv a1, a2
+; RV32I-PIC-NEXT: mv a2, a3
+; RV32I-PIC-NEXT: mv a3, a4
+; RV32I-PIC-NEXT: mv a4, a5
+; RV32I-PIC-NEXT: mv a5, a6
+; RV32I-PIC-NEXT: mv a6, a7
+; RV32I-PIC-NEXT: jalr t1
+; RV32I-PIC-NEXT: lw ra, 12(sp) # 4-byte Folded Reload
+; RV32I-PIC-NEXT: addi sp, sp, 16
+; RV32I-PIC-NEXT: ret
;
; RV64I-LABEL: test_call_indirect_no_t0:
; RV64I: # %bb.0:
@@ -347,6 +466,24 @@ define i32 @test_call_indirect_no_t0(ptr %a, i32 %b, i32 %c, i32 %d, i32 %e, i32
; RV64I-LARGE-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
; RV64I-LARGE-NEXT: addi sp, sp, 16
; RV64I-LARGE-NEXT: ret
+;
+; RV64I-LARGE-ZICFILP-LABEL: test_call_indirect_no_t0:
+; RV64I-LARGE-ZICFILP: # %bb.0:
+; RV64I-LARGE-ZICFILP-NEXT: lpad 0
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, -16
+; RV64I-LARGE-ZICFILP-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64I-LARGE-ZICFILP-NEXT: mv t1, a0
+; RV64I-LARGE-ZICFILP-NEXT: mv a0, a1
+; RV64I-LARGE-ZICFILP-NEXT: mv a1, a2
+; RV64I-LARGE-ZICFILP-NEXT: mv a2, a3
+; RV64I-LARGE-ZICFILP-NEXT: mv a3, a4
+; RV64I-LARGE-ZICFILP-NEXT: mv a4, a5
+; RV64I-LARGE-ZICFILP-NEXT: mv a5, a6
+; RV64I-LARGE-ZICFILP-NEXT: mv a6, a7
+; RV64I-LARGE-ZICFILP-NEXT: jalr t1
+; RV64I-LARGE-ZICFILP-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64I-LARGE-ZICFILP-NEXT: addi sp, sp, 16
+; RV64I-LARGE-ZICFILP-...
[truncated]
jaidTw force-pushed the jaidtw/riscv-large-sw-guarded branch 2 times, most recently from a5aaed1 to 280efae (September 20, 2024 06:15)
topperc reviewed Sep 20, 2024
jrtc27 reviewed Sep 20, 2024
jrtc27 reviewed Sep 20, 2024
jaidTw force-pushed the jaidtw/riscv-large-sw-guarded branch from fa2801c to d0a1734 (September 22, 2024 05:05)
topperc approved these changes Sep 24, 2024
LGTM
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Sep 27, 2024
Support for the large code model was added recently, and semantically direct calls are lowered to an indirect branch with a constant pool target. By default this branch does not use the x7 register, which is suboptimal with Zicfilp because it introduces a landing pad check. The check is unnecessary since the constant pool is read-only and unlikely to be tampered with. Change direct calls and tail calls to use x7 as the scratch register (a.k.a. software guarded branch in the CFI spec).
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
#70308 recently added support for the large code model. Semantically, direct calls in this model are lowered to an indirect branch whose target is loaded from the constant pool. By default that branch does not use the x7 register, which is suboptimal with Zicfilp enabled because it introduces a landing pad check. The check is unnecessary here, since the constant pool is read-only and unlikely to be tampered with.
This patch changes direct calls and tail calls to use x7 as the scratch register (a.k.a. a software guarded branch in the CFI spec).
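For readers skimming the diff, here is a minimal standalone sketch of the decision that `LowerCall` makes for `NeedSWGuarded`, together with the scratch-register choice in `expandFunctionCall`. It is only a model of the conditions visible in the patch: the `CodeModel` enum, the `CallSite` struct, and the functions `needSWGuarded` and `tailScratchReg` are hypothetical names made up for illustration and are not LLVM APIs.

```cpp
// Simplified model of the conditions in this patch. All types and function
// names below are illustrative assumptions, not part of LLVM.

enum class CodeModel { Small, Medium, Large };

struct CallSite {
  bool HasCallBase;                 // models CLI.CB being non-null
  bool IsIndirectCall;              // models CLI.CB->isIndirectCall()
  bool CalleeIsLargeExternalSymbol; // callee was an ExternalSymbol node
};

// True when the call or tail call should be lowered as a software guarded
// branch, i.e. a JALR whose rs1 is x7 (t2). A tail call to an external symbol
// can have a null CLI.CB, so the external-symbol flag is checked separately.
constexpr bool needSWGuarded(CodeModel CM, bool HasZicfilp, CallSite CS) {
  return CM == CodeModel::Large && HasZicfilp &&
         ((CS.HasCallBase && !CS.IsIndirectCall) ||
          CS.CalleeIsLargeExternalSymbol);
}

// Register number used as rs1 of JALR when expanding PseudoTAIL:
// x7 (t2) with Zicfilp, otherwise x6 (t1).
constexpr unsigned tailScratchReg(bool HasZicfilp) {
  return HasZicfilp ? 7u : 6u;
}

// Direct call, large code model, Zicfilp enabled: software guarded.
static_assert(needSWGuarded(CodeModel::Large, true, CallSite{true, false, false}),
              "direct call should be guarded");
// Genuinely indirect call: keeps the normal landing pad check.
static_assert(!needSWGuarded(CodeModel::Large, true, CallSite{true, true, false}),
              "indirect call should not be guarded");
// Tail call to an external symbol with no call base: still guarded.
static_assert(needSWGuarded(CodeModel::Large, true, CallSite{false, false, true}),
              "external symbol tail call should be guarded");
// Without Zicfilp nothing changes.
static_assert(!needSWGuarded(CodeModel::Large, false, CallSite{true, false, false}),
              "no Zicfilp, no guard");
```

This matches what the new `RV64I-LARGE-ZICFILP` checks show: `test_call_external` loads the target into `t2` before `jalr t2`, while `test_call_indirect` still uses `jalr a2`.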