AMDGPU: Handle gfx950 change in mfma_f64_16x16x4 + valu hazard #117262

arsenm · 2024-11-21T23:06:57Z

On gfx950, increase the wait states required between an mfma_f64_16x16x4 VGPR write and a VALU read from 11 to 19.
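
As a rough illustration of the selection this change introduces, the sketch below mirrors the ternary in `checkMAIVALUHazards` for the 16x16 DGEMM cases. The constant names and values are copied from the diff. The free function `dmfma16x16WaitStates` and its two boolean parameters are hypothetical stand-ins for `IsMemOrExport` and `ST.hasGFX950Insts()`, not part of LLVM.

```cpp
// Values copied from the diff to GCNHazardRecognizer::checkMAIVALUHazards.
constexpr int DMFMA16x16WriteVgprMemExpReadWaitStates = 18;
constexpr int DMFMA16x16WriteVgprVALUReadWaitStates = 11;
constexpr int GFX950_DMFMA16x16WriteVgprVALUReadWaitStates = 19;

// Hypothetical helper (not an LLVM API): picks the wait-state count for a
// 16x16 DMFMA VGPR write followed by a dependent read.
constexpr int dmfma16x16WaitStates(bool IsMemOrExport, bool HasGFX950Insts) {
  // Memory/export readers keep the existing count on every subtarget.
  if (IsMemOrExport)
    return DMFMA16x16WriteVgprMemExpReadWaitStates;
  // VALU readers need the larger count only when gfx950 instructions exist.
  return HasGFX950Insts ? GFX950_DMFMA16x16WriteVgprVALUReadWaitStates
                        : DMFMA16x16WriteVgprVALUReadWaitStates;
}
```

Only the VALU-read path depends on the subtarget; the memory/export path returns 18 in both cases, which matches the diff leaving that branch untouched.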

llvmbot · 2024-11-21T23:08:56Z

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Increase from 11 wait states to 19

Full diff: https://github.com/llvm/llvm-project/pull/117262.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+7-3)
(modified) llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir (+21-7)

diff --git a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
index 44afccb0690d0d..99a176731599cc 100644
--- a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -2603,6 +2603,7 @@ int GCNHazardRecognizer::checkMAIVALUHazards(MachineInstr *MI) {
     const int DMFMA16x16WriteVgprMemExpReadWaitStates = 18;
     const int DMFMA4x4WriteVgprVALUReadWaitStates = 6;
     const int DMFMA16x16WriteVgprVALUReadWaitStates = 11;
+    const int GFX950_DMFMA16x16WriteVgprVALUReadWaitStates = 19;
     const int DotWriteSameDotReadSrcAB = 3;
     const int DotWriteDifferentVALURead = 3;
     const int DMFMABetweenVALUWriteVMEMRead = 2;
@@ -2663,9 +2664,12 @@ int GCNHazardRecognizer::checkMAIVALUHazards(MachineInstr *MI) {
           break;
         case 8:
         case 16:
-          NeedWaitStates = IsMemOrExport
-                               ? DMFMA16x16WriteVgprMemExpReadWaitStates
-                               : DMFMA16x16WriteVgprVALUReadWaitStates;
+          NeedWaitStates =
+              IsMemOrExport
+                  ? DMFMA16x16WriteVgprMemExpReadWaitStates
+                  : (ST.hasGFX950Insts()
+                         ? GFX950_DMFMA16x16WriteVgprVALUReadWaitStates
+                         : DMFMA16x16WriteVgprVALUReadWaitStates);
           break;
         default:
           llvm_unreachable("unexpected dgemm");
diff --git a/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir b/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir
index 9681b01f334f9a..d2b2f226404da8 100644
--- a/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir
+++ b/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx940.mir
@@ -1,4 +1,5 @@
-# RUN: llc -mtriple=amdgcn -mcpu=gfx940 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefix=GCN %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx940 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX940 %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx950 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX950 %s
 
 # GCN-LABEL: name: valu_write_vgpr_sgemm_mfma_read
 # GCN:      V_MOV_B32
@@ -803,8 +804,12 @@ body:             |
 ...
 # GCN-LABEL: name: dmfma16x16_write_vgpr_valu_read
 # GCN:      V_MFMA
-# GCN-NEXT: S_NOP 7
-# GCN-NEXT: S_NOP 2
+# GFX940-NEXT: S_NOP 7
+# GFX940-NEXT: S_NOP 2
+
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 2
 # GCN-NEXT: V_MOV_B32
 name:            dmfma16x16_write_vgpr_valu_read
 body:             |
@@ -867,8 +872,13 @@ body:             |
 ...
 # GCN-LABEL: name: dmfma16x16_write_vgpr_dot_read
 # GCN:      V_MFMA
-# GCN-NEXT: S_NOP 7
-# GCN-NEXT: S_NOP 2
+# GFX940-NEXT: S_NOP 7
+# GFX940-NEXT: S_NOP 2
+
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 2
+
 # GCN-NEXT: V_DOT
 name:            dmfma16x16_write_vgpr_dot_read
 body:             |
@@ -1505,8 +1515,12 @@ body:             |
 ...
 # GCN-LABEL: name: dmfma16x16_write_agpr_valu_read
 # GCN:      V_MFMA
-# GCN-NEXT: S_NOP 7
-# GCN-NEXT: S_NOP 2
+# GFX940-NEXT: S_NOP 7
+# GFX940-NEXT: S_NOP 2
+
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 7
+# GFX950-NEXT: S_NOP 2
 # GCN-NEXT: V_ACCVGPR_READ_B32_e64
 name:            dmfma16x16_write_agpr_valu_read
 body:             |
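
The updated check lines follow from how a wait-state count is split into `S_NOP` instructions. The sketch below is a minimal model of that split, not the backend's code. It assumes that an `S_NOP` with immediate N provides N + 1 wait states, that a single `S_NOP` covers at most 8, and that the test bodies (not shown in this excerpt) place the reader directly after the `V_MFMA`. The helper name `splitIntoSNopImms` is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not an LLVM API). Assumption: S_NOP with immediate N
// supplies N + 1 wait states, and one S_NOP covers at most 8 wait states.
std::vector<int> splitIntoSNopImms(int WaitStates) {
  constexpr int MaxPerSNop = 8;
  std::vector<int> Imms;
  while (WaitStates > 0) {
    // Take as many wait states as one S_NOP can cover.
    int Chunk = std::min(WaitStates, MaxPerSNop);
    Imms.push_back(Chunk - 1); // the immediate encodes "wait states - 1"
    WaitStates -= Chunk;
  }
  return Imms;
}
```

Under these assumptions, 11 wait states split into immediates 7 and 2 (8 + 3), matching the `GFX940` check lines, and 19 wait states split into 7, 7 and 2 (8 + 8 + 3), matching the `GFX950` check lines.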

arsenm · 2024-11-23T04:08:34Z

Merge activity

Nov 22, 11:08 PM EST: A user started a stack merge that includes this pull request via Graphite.
Nov 22, 11:18 PM EST: Graphite rebased this pull request as part of a merge.
Nov 22, 11:20 PM EST: A user merged this pull request with Graphite.

arsenm mentioned this pull request Nov 21, 2024

AMDGPU: Define new sched model for gfx950 #117261

Merged

arsenm added the backend:AMDGPU label Nov 21, 2024 — with Graphite App

arsenm requested review from jayfoad, pravinjagtap, rampitec, shiltian, Sisyph and srpande November 21, 2024 23:07

arsenm marked this pull request as ready for review November 21, 2024 23:07

arsenm force-pushed the users/arsenm/gfx950/mfma_f64_16x16x4_valu_hazard branch from 0f57e4c to a444594 Compare November 22, 2024 21:10

shiltian approved these changes Nov 23, 2024

arsenm force-pushed the users/arsenm/gfx950/sched-model branch from 6863958 to 5e2a2ec Compare November 23, 2024 04:13

Base automatically changed from users/arsenm/gfx950/sched-model to main November 23, 2024 04:17

AMDGPU: Handle gfx950 change in mfma_f64_16x16x4 + valu hazard

979aa0b

Increase from 11 wait states to 19

arsenm force-pushed the users/arsenm/gfx950/mfma_f64_16x16x4_valu_hazard branch from a444594 to 979aa0b Compare November 23, 2024 04:18

arsenm merged commit b078b88 into main Nov 23, 2024
5 of 8 checks passed

arsenm deleted the users/arsenm/gfx950/mfma_f64_16x16x4_valu_hazard branch November 23, 2024 04:20
