Convert Loop Metadata to Asserts to help Loop rotation #227

F-Stuckmann · 2024-10-28T16:55:15Z

Note: the first commit will be removed once #225 is merged.

Please note: I am still going through QoR to make sure I am not introducing any functional errors.

llvm/include/llvm/Transforms/Utils/LoopMetadata.h

llvm/lib/Transforms/Utils/LoopMetadata.cpp

llvm/test/Transforms/Util/loop-metadata.ll

llvm/lib/Passes/PassBuilderPipelines.cpp

andcarminati · 2024-11-22T09:41:05Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+    return std::nullopt;
+  }
+
+  return std::optional(BackedgeCount);


If the loop is already rotated, is there any benefit for us? I mean, we can influence changes in the CFG as you mentioned before (fusing BBs). In this way, I am curious about practical effects (QoR).

I am just pointing this out because loop rotation is a premise in several parts of this PR. Could we have a command-line option to control whether we touch rotated loops, so that we can evaluate the practical results?

We do not have any do-while loops that have loop iteration count pragmas attached. Therefore, we do not have any regressions.

I am just not sure we need this: if our goal is "to help loop rotation", we are inserting unnecessary information. Even if we have such loops, the assume will only add information that is already known.

gbossu · 2024-11-26T12:40:33Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+  // expression can be simplified in later passes
+  SCEVCouldNotCompute::NoWrapFlags NWF = BunldedSCEV.AddRecExpr->getNoWrapFlags(
+      static_cast<SCEVCouldNotCompute::NoWrapFlags>(SCEV::FlagNUW |
+                                                    SCEV::FlagNSW));


Curious: SCEV also mentions "self-wrap", do you know what it means?

It is the no-self-wrap (NW) flag. It is implied whenever either NUW or NSW is set, so on its own it does not say which kind of overflow is excluded.

In our pass we need to know whether we have a signed or an unsigned no-overflow flag, so that we can duplicate it for the Bound arithmetic. The NW flag therefore does not provide any useful information.

/// AddRec expressions may have a no-self-wraparound property if, in
/// the integer domain, abs(step) * max-iteration(loop) <=
/// unsigned-max(bitwidth). This means that the recurrence will never reach
/// its start value if the step is non-zero.

/// Note that NUW and NSW are also valid properties of a recurrence, and
/// either implies NW. For convenience, NW will be set for a recurrence
/// whenever either NUW or NSW are set.
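
To make the quoted comment concrete, here is a minimal standalone sketch of the three properties. It does not use LLVM; the names `ToyFlags`, `normalizeFlags`, `signedOrUnsignedOnly`, and `hasNoSelfWrap` are made up for illustration, and the check is restricted to bit widths of at most 32 so the product cannot overflow 64-bit arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for the NW / NUW / NSW properties (not LLVM's enum).
enum ToyFlags : unsigned { ToyNW = 1, ToyNUW = 2, ToyNSW = 4 };

// Either NUW or NSW implies NW, so NW is set whenever one of them is present.
unsigned normalizeFlags(unsigned Flags) {
  if (Flags & (ToyNUW | ToyNSW))
    Flags |= ToyNW;
  return Flags;
}

// Keeps only the signed/unsigned bits, mirroring the NUW | NSW mask in the
// hunk above: NW alone does not tell us which kind of overflow is excluded.
unsigned signedOrUnsignedOnly(unsigned Flags) {
  return Flags & (ToyNUW | ToyNSW);
}

// Integer-domain check: abs(Step) * MaxIter <= unsigned-max(BitWidth).
bool hasNoSelfWrap(int32_t Step, uint32_t MaxIter, unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 32 && "toy model handles up to 32 bits");
  uint64_t AbsStep = Step < 0 ? uint64_t(-int64_t(Step)) : uint64_t(Step);
  uint64_t UnsignedMax = (uint64_t(1) << BitWidth) - 1;
  return AbsStep * uint64_t(MaxIter) <= UnsignedMax;
}
```

For an 8-bit recurrence, a step of 1 over 255 iterations satisfies the condition, while a step of 2 over 200 iterations does not.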

gbossu · 2024-11-26T12:41:03Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+
+  // Expansion of MinIterSCEV will result in                                   \
+  // InitialValue + StepSize * Minimum Iteration Count
+  if (!Expander.isSafeToExpand(MinIterSCEV)) {


Curious: in which condition does it happen?

This happens in decrementing loops:

for (unsigned i = Bound; i > 0; i--)

or in loops whose initial value is not a constant:
for (unsigned i = InitialValue; i < Bound; i++)
In the case of constants, MinIterSCEV evaluates to a constant that is simply inserted.

The corresponding unit tests are:

incrementByOne, decrementByOne, decrementGEZero

The counterexample, where MinIterSCEV evaluates to a constant, is
incrementInclusiveBound
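
The difference between the constant and the non-constant case can be sketched outside LLVM. The toy model below is an assumption-laden illustration, not the pass's code: `ToyAddRec` and its `evaluateAtIteration` are hypothetical stand-ins for an affine recurrence {Start,+,Step} evaluated as Start + Step * N.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Toy affine recurrence {Start,+,Step}. Start is either a known constant
// or a symbolic name such as "InitialValue" or "Bound".
struct ToyAddRec {
  std::optional<int64_t> ConstStart; // set when the initial value is constant
  std::string SymbolicStart;         // used when ConstStart is empty
  int64_t Step;
};

// Value at iteration N: Start + Step * N.
// A constant start folds to a plain constant; a symbolic start leaves an
// expression that would have to be materialized as instructions.
std::string evaluateAtIteration(const ToyAddRec &AR, int64_t N) {
  int64_t Offset = AR.Step * N;
  if (AR.ConstStart)
    return std::to_string(*AR.ConstStart + Offset);
  return "(" + AR.SymbolicStart + " + " + std::to_string(Offset) + ")";
}
```

With a constant start of 0 and step 1, iteration 4 folds to `4`. A decrementing loop starting at `Bound` leaves the symbolic expression `(Bound + -4)`, which is the kind of value that needs expansion.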

gbossu · 2024-11-26T13:12:00Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+                                                    SCEV::FlagNSW));
+  auto *AE = dyn_cast<SCEVAddExpr>(MinIterSCEV);
+  if (AE && NWF) {
+    MinIterSCEV = SE.getAddExpr(AE->getOperand(0), AE->getOperand(1), NWF);


Essentially this is taking the same expression and adding flags? Isn't there a helper to add them without re-creating the SCEV? Do you also know why those flags aren't automatically propagated from BunldedSCEV.AddRecExpr? After all, MinIterSCEV is that same expression, but evaluated at a different point.

Yeah, I can add the overflow flags without recreating the expression. I generalized it to add the flags for any 'SCEVCommutativeExpr'.

Unfortunately, SCEVAddRecExpr::evaluateAtIteration does not have a way to automatically reuse the no-overflow flags.

gbossu · 2024-11-26T13:12:13Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+  SCEVCouldNotCompute::NoWrapFlags NWF = BunldedSCEV.AddRecExpr->getNoWrapFlags(
+      static_cast<SCEVCouldNotCompute::NoWrapFlags>(SCEV::FlagNUW |
+                                                    SCEV::FlagNSW));
+  auto *AE = dyn_cast<SCEVAddExpr>(MinIterSCEV);


Could it be something else than SCEVAddExpr?

Yes, I generalized it: it can be any SCEVCommutativeExpr, which can be an AddExpr, a MulExpr, or a SCEVMinMaxExpr.

And for those it does not make sense to add nowrap flags?

But the base expression is an AddExpr; how can it become e.g. a MulExpr or a SCEVMinMaxExpr?

Clarified offline and "confirmed" in https://www.npopov.com/2023/10/03/LLVM-Scalar-evolution.html: We should just get a new AddRec with a different initial value.

gbossu · 2024-11-26T14:17:26Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+      dyn_cast<SCEVCommutativeExpr>(MinIterSCEV);
+  if (ConstCE && NWF) {
+    SCEVCommutativeExpr *CE = const_cast<SCEVCommutativeExpr *>(ConstCE);
+    CE->setNoWrapFlags(NWF);


Hmm, it might be dangerous to const_cast an expression and change its properties (it might be cached somewhere and reused). Is there a way to get a non-const SCEV from evaluateAtIteration()?

Unfortunately, I get a const SCEV expression from evaluateAtIteration().
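
The caching concern can be shown with a toy uniquing table. This is only a sketch under the assumption that expressions are interned, so identical operands yield the same node; `ToyExpr`, `ToyUniquer`, and `flagsSeenByOtherUser` are hypothetical names and not LLVM APIs.

```cpp
#include <map>
#include <memory>
#include <utility>

// Toy expression node: two operand ids plus no-wrap flags.
struct ToyExpr {
  int LHS, RHS;
  unsigned Flags = 0; // 0 means no no-wrap facts are known
};

// Toy uniquer: asking twice for the same operands returns the same node.
class ToyUniquer {
  std::map<std::pair<int, int>, std::unique_ptr<ToyExpr>> Table;

public:
  const ToyExpr *getAdd(int LHS, int RHS) {
    std::unique_ptr<ToyExpr> &Slot = Table[{LHS, RHS}];
    if (!Slot)
      Slot = std::make_unique<ToyExpr>(ToyExpr{LHS, RHS});
    return Slot.get();
  }
};

// One user strengthens the flags through const_cast; a second, unrelated
// user of the same operands silently observes the changed flags.
unsigned flagsSeenByOtherUser() {
  ToyUniquer U;
  const ToyExpr *Mine = U.getAdd(1, 2);
  const ToyExpr *Theirs = U.getAdd(1, 2); // same node as Mine
  const_cast<ToyExpr *>(Mine)->Flags = 0x2;
  return Theirs->Flags;
}
```

The second user never asked for the flag, yet it reads `0x2`, which is why mutating a shared expression in place is risky if the fact only holds in one context.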

F-Stuckmann · 2024-11-28T14:49:55Z

Changed the logic to evaluate the compare instruction at iteration 0 and assume it to be true, which solves all of our issues.

Moved the assumption into the preheader (with the respective clones) to guarantee that the assumption is loop invariant and does not change, even if both operands of the compare are loop variant.

Added more unit tests to cover more corner cases.
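
At the source level, the effect of this change can be pictured roughly as follows. This is a hand-written analogy, not output of the pass: `TOY_ASSUME` and `sumRange` are hypothetical names, `__builtin_assume` is the Clang builtin, and on other compilers the macro expands to nothing.

```cpp
#if defined(__clang__)
#define TOY_ASSUME(Cond) __builtin_assume(Cond)
#else
#define TOY_ASSUME(Cond) ((void)0) // no-op fallback for other compilers
#endif

// for (I = Start; I < Bound; ++I) with a promise of at least one iteration.
// Callers must guarantee Start < Bound; otherwise the assumption is violated.
long sumRange(long Start, long Bound) {
  // The loop compare evaluated at iteration 0 (I == Start), stated once
  // before the loop, where both operands are trivially loop invariant.
  TOY_ASSUME(Start < Bound);
  long Sum = 0;
  for (long I = Start; I < Bound; ++I)
    Sum += I;
  return Sum;
}
```

Placing the assumption before the loop, rather than inside it, corresponds to the preheader placement described above: the condition is stated once, on values that cannot change during the loop.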

gbossu · 2024-12-02T09:40:43Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

@@ -107,6 +107,11 @@ Value *recursivlyCloneInBB(Value *Op, BasicBlock &BB, ValueToValueMapTy &VMap) {
                      << *Op << "\n");
    return nullptr;
  }
+  if (!I->isSafeToRemove()) {


What is this checking?

This is a general check of whether we can clone the instruction into a different BB without introducing behavior that could cause issues (it should be used in conjunction with the volatile memory check). The checks include:

whether it is a call instruction

whether it may write to memory

whether an exception is thrown, or whether it is an exception handler

Basically, combined with the volatile check, this is a very conservative hoisting check.

gbossu · 2024-12-02T09:50:19Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+  if (!AddRec) {
+    LLVM_DEBUG(dbgs() << "Could not extract AddRecExpr, will reuse " << *Op
+                      << "\n");
+    return Op;


I think this is dangerous: we might return a LoadInst that isn't loop-invariant. What we return here must be loop-invariant because we will later clone it.

E.g. we could have

loop.header:
  store ...
  %iv = load ...

The store location could alias with the load location.

So we might want to run LICM beforehand, or use analysis like MemorySSA or LoopAccessAnalysis to help us say what is loop invariant.

Also note LoopUtils has a bunch of things to help you. In particular:

/// Returns true if is legal to hoist or sink this instruction disregarding the
/// possible introduction of faults. Reasoning about potential faulting
/// instructions is the responsibility of the caller since it is challenging to
/// do efficiently from within this routine.
/// \p TargetExecutesOncePerLoop is true only when it is guaranteed that the
/// target executes at most once per execution of the loop body. This is used
/// to assess the legality of duplicating atomic loads. Generally, this is
/// true when moving out of loop and not true when moving into loops.
/// If \p ORE is set use it to emit optimization remarks.
bool canSinkOrHoistInst(Instruction &I, AAResults *AA, DominatorTree *DT,
                        Loop *CurLoop, MemorySSAUpdater &MSSAU,
                        bool TargetExecutesOncePerLoop,
                        SinkAndHoistLICMFlags &LICMFlags,
                        OptimizationRemarkEmitter *ORE = nullptr);

I switched to a canSinkOrHoistInst check.
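
The aliasing hazard raised in this thread can be reproduced in plain C++. The functions below are illustrative only (`countIterations`, `countIterationsHoisted`, and `runWithAliasing` are made-up names): the loop header reloads the bound on every iteration while the body stores through a pointer that may alias it.

```cpp
// The header reloads *Bound each iteration; the body stores through Out.
int countIterations(int *Bound, int *Out) {
  int I = 0;
  while (I < *Bound) {
    *Out = 2; // may overwrite the bound if Out aliases Bound
    ++I;
  }
  return I;
}

// Same loop with the load cloned in front of the loop, as if it were
// loop invariant. This is only equivalent when Out does not alias Bound.
int countIterationsHoisted(int *Bound, int *Out) {
  int Limit = *Bound;
  int I = 0;
  while (I < Limit) {
    *Out = 2;
    ++I;
  }
  return I;
}

// Runs either variant with one cell used as both bound and store target.
int runWithAliasing(bool Hoisted) {
  int Cell = 5;
  return Hoisted ? countIterationsHoisted(&Cell, &Cell)
                 : countIterations(&Cell, &Cell);
}
```

With aliasing, the original loop stops after 2 iterations because the store lowers the bound, while the hoisted variant runs 5 times. An assumption built from the cloned load would therefore describe a value the loop never actually compares against.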

gbossu · 2024-12-02T09:52:04Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+  if (I->isVolatile()) {
+    LLVM_DEBUG(dbgs() << "Volatile Instruction: Aborting! Could not clone "
+                      << *Op << "\n");
+    return nullptr;


As said before, I think we need to check more than just volatile, we cannot just clone any instruction without checking it is loop-invariant. Or at least we need to prove that the instructions that precede it will have no effects on its value.

gbossu · 2024-12-02T09:53:55Z

llvm/lib/Transforms/Utils/LoopIterCountAssumptions.cpp

+    LLVM_DEBUG(dbgs() << "Loop already in rotated form. Will not add Loop "
+                         "Iteration Count assumptions.\n");
+    return nullptr;
+  }


Checking: that could be an already-rotated loop with a guard? I assume we could potentially remove the guard if we add a builtin_assume? Nothing to do for now, I think it's good to focus on standard for loops :)

No.
The case to check against is a do-while loop with iteration count metadata.
Since this is already rotated, we don't need to add any assumptions, because our pass is redundant.

We are so early in the optimization pipeline that we should never see an already rotated loop

…for AIE targets
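
For readers less familiar with the terminology, the loop shapes discussed here look roughly like this at the source level. This is a hand-written sketch with hypothetical function names, not compiler output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unrotated form: the condition is tested before every iteration.
long sumFor(const std::vector<int> &A) {
  long Sum = 0;
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum += A[I];
  return Sum;
}

// Rotated form: a guard protects a do-while whose test sits at the bottom.
long sumRotatedGuarded(const std::vector<int> &A) {
  long Sum = 0;
  if (0 < A.size()) { // guard: the loop compare at iteration 0
    std::size_t I = 0;
    do {
      Sum += A[I];
      ++I;
    } while (I < A.size());
  }
  return Sum;
}

// Rotated form without the guard. Only valid when at least one iteration
// is known to run, which is what an iteration-count assumption provides.
long sumRotatedNoGuard(const std::vector<int> &A) {
  assert(!A.empty() && "requires a minimum iteration count of 1");
  long Sum = 0;
  std::size_t I = 0;
  do {
    Sum += A[I];
    ++I;
  } while (I < A.size());
  return Sum;
}
```

A source-level do-while already has the last shape, which is why adding an assumption to it gives loop rotation nothing new to work with.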

F-Stuckmann · 2024-12-03T10:57:59Z

Only add assumption if the assumption at the first iteration can be guaranteed to be loop invariant.

F-Stuckmann · 2024-12-03T10:59:17Z

rebased to aie-public

Reviews on Oct 29, 2024 (all threads outdated and resolved):

gbossu: llvm/include/llvm/Transforms/Utils/LoopMetadata.h, llvm/lib/Transforms/Utils/LoopMetadata.cpp, llvm/test/Transforms/Util/loop-metadata.ll

andcarminati: llvm/include/llvm/Transforms/Utils/LoopMetadata.h, llvm/lib/Transforms/Utils/LoopMetadata.cpp

martien-de-jong reviewed Oct 29, 2024