Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIE2] Support for allowing direct VEXTRACT to 20-bit registers #233

Merged
merged 3 commits into from
Feb 27, 2025

Conversation

abhinay-anubola
Copy link
Collaborator

@abhinay-anubola abhinay-anubola commented Nov 8, 2024

  • This update introduces a new generic combiner that simplifies the sequence sext(trunc x) directly to x when applicable.
  • Added VExtract combiner that enables above generic combiner, thus we have 20-bit vextract.
  • The MachineVerifier has been updated to allow G_AIE_SEXT_EXTRACT_VECTOR_ELT and G_AIE_ZEXT_EXTRACT_VECTOR_ELT to accept 20-bit outputs.
  • Additionally, tests have been added and updated to reflect these functional changes.

@krishnamtibrewala
Copy link
Collaborator

Given that you mentioned there are no QoR gain, I would recommend you to re look at the instruction that consume S20 type reg.
Because for the optimization starts to trace back from an instruction that consumes S20 type which might not be captured in isNativeS20Consumer function.

@abhinay-anubola abhinay-anubola force-pushed the sanubola.support.20bit.VEXTRACT branch from 650d8a9 to d5d7cf0 Compare November 12, 2024 11:21
@abhinay-anubola abhinay-anubola force-pushed the sanubola.support.20bit.VEXTRACT branch from d5d7cf0 to f12f1a4 Compare November 14, 2024 09:13
MIRBuilder.buildAssertInstr(AssertExtOpcode, ExtReg20Bit, DstReg20Bit,
SrcEltSize);
MIRBuilder.buildInstr(ExtOpcode, {DstReg}, {ExtReg20Bit});
MI.eraseFromParent();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we are safe ;-)

@abhinay-anubola abhinay-anubola force-pushed the sanubola.support.20bit.VEXTRACT branch from 000b09e to e7731e6 Compare December 17, 2024 09:08
@@ -192,7 +190,6 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
; RV64I-PIC-LABEL: above_threshold:
; RV64I-PIC: # %bb.0: # %entry
; RV64I-PIC-NEXT: li a2, 5
; RV64I-PIC-NEXT: sext.w a0, a0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Did you check all targets?

Copy link
Collaborator Author

@abhinay-anubola abhinay-anubola Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have checked all targets.

/// To : %9:_(s20) = G_AIE_SEXT_EXTRACT_VECTOR_ELT %2(<32 x s16>), %0(s32)
/// %10:_(s20) = G_ASSERT_[S/Z]EXT %9, 16
/// %4:_(s16) = G_TRUNC %10(s20)
/// %5:_(s20) = G_[S/Z]EXT %4(s16)
Copy link
Collaborator

@gbossu gbossu Dec 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to change the return types? I would expect that we only need to add a %10:_(s32) = G_ASSERT_[S/Z]EXT %9, 16 and keep the rest intact thanks to the new sext(trunc x) combiner you added previously.

Copy link
Collaborator Author

@abhinay-anubola abhinay-anubola Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to change the return types, because the pattern that is written in new sext(trunc x) combiner will not match in this case as m_SpecificType is trying to match s20 but return type here is s32.
mi_match(SrcReg, MRI, m_GTrunc(m_all_of(m_Reg(Reg), m_SpecificType(DstTy))))

@abhinay-anubola abhinay-anubola force-pushed the sanubola.support.20bit.VEXTRACT branch from e7731e6 to 467672d Compare January 2, 2025 10:54
andcarminati
andcarminati previously approved these changes Jan 2, 2025
Copy link
Collaborator

@andcarminati andcarminati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

%var:_(s32) = COPY $vgpr0
%assert:_(s32) = G_ASSERT_SEXT %var, 16
%trunc:_(s16) = G_TRUNC %assert(s32)
%sext:_(s64) = G_SEXT %trunc(s16)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I guess we could still eliminate the G_TRUNC, and change G_SEXT to extend from 32 -> 64?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This story requires a (sext (trunc x)) → x combiner; we can create a new ticket for the G_TRUNC elimination

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed 👍

@abhinay-anubola abhinay-anubola force-pushed the sanubola.support.20bit.VEXTRACT branch from c3cba3e to fcf4285 Compare February 24, 2025 10:04
@abhinay-anubola abhinay-anubola changed the title Support for allowing direct VEXTRACT to 20-bit registers [AIE2] Support for allowing direct VEXTRACT to 20-bit registers Feb 24, 2025
@@ -2465,6 +2465,20 @@ bool CombinerHelper::matchCombineZextTrunc(MachineInstr &MI, Register &Reg) {
return false;
}

bool CombinerHelper::matchCombineSextTrunc(MachineInstr &MI, Register &Reg) {
assert(MI.getOpcode() == TargetOpcode::G_SEXT && "Expected a G_SEXT");
const Register DstReg = MI.getOperand(0).getReg();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could use const auto [DstReg, DstTy, SrcReg, SrcTy] = MI.getFirst2RegLLTs()

%var:_(s32) = COPY $vgpr0
%assert:_(s32) = G_ASSERT_SEXT %var, 16
%trunc:_(s16) = G_TRUNC %assert(s32)
%sext:_(s64) = G_SEXT %trunc(s16)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed 👍

@abhinay-anubola abhinay-anubola merged commit 9a67e50 into aie-public Feb 27, 2025
6 checks passed
@abhinay-anubola abhinay-anubola deleted the sanubola.support.20bit.VEXTRACT branch February 27, 2025 05:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants