[InstCombine] Fold align assume into load's !align metadata if possible. #108958

fhahn · 2024-09-17T11:02:22Z

If an alignment assumption is valid in the context of a corresponding load of the pointer the assumption applies to, the assumption can be replaced with !align metadata on the load.

The benefits of folding it into !align are that existing code makes better use of !align and it allows removing the now-redundant call instructions.

llvmbot · 2024-09-17T11:02:56Z

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

If an alignment assumption is valid in the context of a corresponding load of the pointer the assumption applies to, the assumption can be replaced with !align metadata on the load.

The benefits of folding it into !align are that existing code makes better use of !align and it allows removing the now-redundant call instructions.

Full diff: https://github.com/llvm/llvm-project/pull/108958.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+21-4)
(modified) llvm/test/Transforms/InstCombine/assume-align.ll (+4-3)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index 61011d55227e7b..596de10e20b6de 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -3076,12 +3076,13 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
       // TODO: apply range metadata for range check patterns?
     }
 
-    // Separate storage assumptions apply to the underlying allocations, not any
-    // particular pointer within them. When evaluating the hints for AA purposes
-    // we getUnderlyingObject them; by precomputing the answers here we can
-    // avoid having to do so repeatedly there.
     for (unsigned Idx = 0; Idx < II->getNumOperandBundles(); Idx++) {
       OperandBundleUse OBU = II->getOperandBundleAt(Idx);
+
+      // Separate storage assumptions apply to the underlying allocations, not any
+      // particular pointer within them. When evaluating the hints for AA purposes
+      // we getUnderlyingObject them; by precomputing the answers here we can
+      // avoid having to do so repeatedly there.
       if (OBU.getTagName() == "separate_storage") {
         assert(OBU.Inputs.size() == 2);
         auto MaybeSimplifyHint = [&](const Use &U) {
@@ -3095,6 +3096,22 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
         MaybeSimplifyHint(OBU.Inputs[0]);
         MaybeSimplifyHint(OBU.Inputs[1]);
       }
+
+      // Try to fold alignment assumption into a load's !align metadata, if the assumption is valid in the load's context.
+      if (OBU.getTagName() == "align" && OBU.Inputs.size() == 2) {
+        auto *LI = dyn_cast<LoadInst>(OBU.Inputs[0]);
+        if (!LI || !isValidAssumeForContext(II, LI, &DT, /*AllowEphemerals=*/true))
+          continue;
+        auto *Align = cast<ConstantInt>(OBU.Inputs[1]);
+        if (!isPowerOf2_64(Align->getZExtValue()))
+          continue;
+        LI->setMetadata(LLVMContext::MD_align,
+                        MDNode::get(II->getContext(),
+                                    ValueAsMetadata::getConstant(
+                                        Align)));
+        auto *New = CallBase::removeOperandBundle(II, OBU.getTagID());
+        return New;
+      }
     }
 
     // Convert nonnull assume like:
diff --git a/llvm/test/Transforms/InstCombine/assume-align.ll b/llvm/test/Transforms/InstCombine/assume-align.ll
index 2b8ca5d25fd1a8..65256377696a59 100644
--- a/llvm/test/Transforms/InstCombine/assume-align.ll
+++ b/llvm/test/Transforms/InstCombine/assume-align.ll
@@ -123,11 +123,9 @@ define i8 @assume_align_non_pow2(ptr %p) {
   ret i8 %v
 }
 
-; TODO: Can fold alignment assumption into !align metadata on load.
 define ptr @fold_assume_align_pow2_of_loaded_pointer_into_align_metadata(ptr %p) {
 ; CHECK-LABEL: @fold_assume_align_pow2_of_loaded_pointer_into_align_metadata(
-; CHECK-NEXT:    [[P2:%.*]] = load ptr, ptr [[P:%.*]], align 8
-; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(ptr [[P2]], i64 8) ]
+; CHECK-NEXT:    [[P2:%.*]] = load ptr, ptr [[P:%.*]], align 8, !align [[META0:![0-9]+]]
 ; CHECK-NEXT:    ret ptr [[P2]]
 ;
   %p2 = load ptr, ptr %p
@@ -171,3 +169,6 @@ define ptr @dont_fold_assume_align_zero_of_loaded_pointer_into_align_metadata(pt
   call void @llvm.assume(i1 true) [ "align"(ptr %p2, i64 0) ]
   ret ptr %p2
 }
+;.
+; CHECK: [[META0]] = !{i64 8}
+;.

github-actions · 2024-09-17T11:05:59Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Missing information about begin and end pointers of std::vector can lead to missed optimizations in LLVM. See llvm#101372 for a discussion of missed range check optimizations in hardened mode. Once llvm#108958 lands, the created `llvm.assume` calls for the alignment should be folded into the `load` instructions, resulting in no extra instructions after InstCombine.

dtcxzyw · 2024-09-17T11:55:20Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

+        auto *Align = cast<ConstantInt>(OBU.Inputs[1]);
+        if (!isPowerOf2_64(Align->getZExtValue()))


Suggested change

-        auto *Align = cast<ConstantInt>(OBU.Inputs[1]);
-        if (!isPowerOf2_64(Align->getZExtValue()))
+        auto *Align = dyn_cast<ConstantInt>(OBU.Inputs[1]);
+        if (!Align || !isPowerOf2_64(Align->getZExtValue()))

While attributes expect constant arguments, assume operand bundles may be provided a dynamic value, for example:

call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]

Can we use getKnowledgeFromBundle here?

Yes thanks! Didn't check that they also allow dynamic values.

(also added a check of the type size + test)

> Can we use getKnowledgeFromBundle here?

Missed this earlier, updated now, thanks!
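The checks discussed in this thread can be sketched in standalone C++, without any LLVM headers. This is only an illustration of the guard logic, not the code in the patch: `isPowerOfTwo` and `foldableAlignment` are hypothetical names, and a null pointer stands in for the case where `dyn_cast<ConstantInt>` fails because the bundle carries a dynamic alignment value.

```cpp
#include <cassert>
#include <cstdint>

// Same contract as the power-of-two check used in the patch: true only for
// values with exactly one bit set, so zero is rejected.
constexpr bool isPowerOfTwo(uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}

// Hypothetical helper. ConstantAlign is null when the bundle's alignment
// operand is not a compile-time constant (e.g. `i32 %align`).
// Returns the alignment that could be recorded as !align metadata, or 0 if
// the assumption has to stay as an llvm.assume call.
inline uint64_t foldableAlignment(const uint64_t *ConstantAlign) {
  if (!ConstantAlign || !isPowerOfTwo(*ConstantAlign))
    return 0;
  return *ConstantAlign;
}
```

Passing a pointer to `8` yields `8`, while a null pointer, `0`, or `12` all yield `0`, which corresponds to the `continue` paths in the diff.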

nikic · 2024-09-17T12:16:49Z

Based on https://github.com/dtcxzyw/llvm-opt-benchmark/pull/1320/files it looks like we end up losing the alignment information in ~most cases? Presumably because SROA later comes along and removes the load.

Note that for the existing non-null handling, SROA will actually rematerialize the nonnull assumption. But I'm reasonably confident that doing that for !align would have terrible effects, at least for frontends that use !align a lot.

nikic · 2024-09-17T12:17:34Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

+          continue;
+        LI->setMetadata(
+            LLVMContext::MD_align,
+            MDNode::get(II->getContext(), ValueAsMetadata::getConstant(Align)));


Should also add !noundef to preserve the fact that this is immediate UB.

dtcxzyw · 2024-09-17T15:39:25Z

> Based on https://github.com/dtcxzyw/llvm-opt-benchmark/pull/1320/files it looks like we end up losing the alignment information in ~most cases? Presumably because SROA later comes along and removes the load.
>
> Note that for the existing non-null handling, SROA will actually rematerialize the nonnull assumption. But I'm reasonably confident that doing that for !align would have terrible effects, at least for frontends that use !align a lot.

PhaseOrdering reproducer:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write)
declare void @llvm.assume(i1 noundef) #0

define i32 @_ZN4llvm7support6endian8readNextIjLNS_10endiannessE1ELm0EhEET_RPKT2_(ptr %0) {
  %2 = call i32 @_ZN4llvm7support6endian8readNextIjLm0EhEET_RPKT1_NS_10endiannessE(ptr %0)
  ret i32 0
}

define i32 @_ZN4llvm7support6endian8readNextIjLm0EhEET_RPKT1_NS_10endiannessE(ptr %0) {
  %2 = load ptr, ptr %0, align 8
  %3 = call i32 @_ZN4llvm7support6endian4readIjLm0EEET_PKvNS_10endiannessE(ptr %2)
  %4 = load ptr, ptr %0, align 8
  %5 = getelementptr i8, ptr %4, i64 4
  store ptr %5, ptr %0, align 8
  ret i32 0
}

define i32 @_ZN4llvm7support6endian4readIjLm0EEET_PKvNS_10endiannessE(ptr %0) {
  call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 4) ]
  %.0.copyload = load i32, ptr %0, align 1
  %2 = call i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32 %.0.copyload)
  ret i32 0
}

declare i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32)

define i64 @_ZN4llvm22OnDiskChainedHashTableIN12_GLOBAL__N_126IdentifierIndexReaderTraitEE24readNumBucketsAndEntriesERPKh(ptr %0) {
  %2 = call i32 @_ZN4llvm7support6endian8readNextIjLNS_10endiannessE1ELm0EhEET_RPKT2_(ptr %0)
  %3 = call i32 @_ZN4llvm7support6endian8readNextIjLNS_10endiannessE1ELm0EhEET_RPKT2_(ptr %0)
  ret i64 0
}

attributes #0 = { nocallback nofree nosync nounwind willreturn memory(inaccessiblemem: write) }

Before:

define noundef i64 @_ZN4llvm22OnDiskChainedHashTableIN12_GLOBAL__N_126IdentifierIndexReaderTraitEE24readNumBucketsAndEntriesERPKh(ptr nocapture %0) local_unnamed_addr {
  %2 = load ptr, ptr %0, align 8
  call void @llvm.assume(i1 true) [ "align"(ptr %2, i64 4) ]
  %.0.copyload.i.i.i = load i32, ptr %2, align 4
  %3 = tail call i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32 %.0.copyload.i.i.i)
  %4 = load ptr, ptr %0, align 8
  %5 = getelementptr i8, ptr %4, i64 4
  store ptr %5, ptr %0, align 8
  call void @llvm.assume(i1 true) [ "align"(ptr %5, i64 4) ]
  %.0.copyload.i.i.i1 = load i32, ptr %5, align 4
  %6 = tail call i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32 %.0.copyload.i.i.i1)
  %7 = load ptr, ptr %0, align 8
  %8 = getelementptr i8, ptr %7, i64 4
  store ptr %8, ptr %0, align 8
  ret i64 0
}

After:

define noundef i64 @_ZN4llvm22OnDiskChainedHashTableIN12_GLOBAL__N_126IdentifierIndexReaderTraitEE24readNumBucketsAndEntriesERPKh(ptr nocapture %0) local_unnamed_addr {
  %2 = load ptr, ptr %0, align 8, !align !0
  %.0.copyload.i.i.i = load i32, ptr %2, align 4
  %3 = tail call i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32 %.0.copyload.i.i.i)
  %4 = load ptr, ptr %0, align 8
  %5 = getelementptr i8, ptr %4, i64 4
  store ptr %5, ptr %0, align 8
  %.0.copyload.i.i.i1 = load i32, ptr %5, align 1
  %6 = tail call i32 @_ZN4llvm7support6endian9byte_swapIjEET_S3_NS_10endiannessE(i32 %.0.copyload.i.i.i1)
  %7 = load ptr, ptr %0, align 8
  %8 = getelementptr i8, ptr %7, i64 4
  store ptr %8, ptr %0, align 8
  ret i64 0
}

fhahn · 2024-09-17T20:27:26Z

> Based on https://github.com/dtcxzyw/llvm-opt-benchmark/pull/1320/files it looks like we end up losing the alignment information in ~most cases? Presumably because SROA later comes along and removes the load.
>
> Note that for the existing non-null handling, SROA will actually rematerialize the nonnull assumption. But I'm reasonably confident that doing that for !align would have terrible effects, at least for frontends that use !align a lot.

Ah, that is unfortunate! For the particular case @dtcxzyw shared, it looks like the issue is early-cse: https://llvm.godbolt.org/z/KfacPbzf7

Maybe for that case it would be sufficient to have early-cse re-materialize the assumption when CSE'ing an instruction with !align metadata that cannot be preserved.

fhahn · 2024-09-17T20:28:28Z

(other alternative would be to just keep the assumptions, but I was hoping this patch would help to avoid introducing too many new instructions when doing something like #108961)

dtcxzyw · 2024-09-18T00:34:44Z

> (other alternative would be to just keep the assumptions, but I was hoping this patch would help to avoid introducing too many new instructions when doing something like #108961)

Agree. Removing assumptions reduces compilation time :)

File                        Before       After        Change
llvm/Interp.cpp.ll          36858214354  33892745069  -8.05%
llvm/Disasm.cpp.ll          3399768227   3263050730   -4.02%
llvm/APINotesReader.cpp.ll  7143171077   7021418904   -1.70%
llvm/COFFObjectFile.cpp.ll  3399105265   3356267000   -1.26%
darktable/histogram.c.ll    1007474874   995703620    -1.17%

fhahn · 2024-09-18T10:41:04Z

@dtcxzyw I updated this PR to include a change to EarlyCSE to materialize alignment assumptions for !align. Could you re-run your analysis for the latest version of the PR?

dtcxzyw · 2024-09-18T11:09:25Z

> @dtcxzyw I updated this PR to include a change to EarlyCSE to materialize alignment assumptions for !align. Could you re-run your analysis for the latest version of the PR?

See dtcxzyw/llvm-opt-benchmark#1312. You can request a new run yourself :)
I just added this feature yesterday. Hopefully it works well.

nikic · 2024-09-18T12:29:10Z

New results show about what I'd expect. Improvements for C++ which doesn't use !align, big regressions for Rust where ~all ptr loads are !align.

fhahn · 2024-09-18T12:37:07Z

> New results show about what I'd expect. Improvements for C++ which doesn't use !align, big regressions for Rust where ~all ptr loads are !align.

Yep, updating the patch to avoid adding assume() if the new pointer already has !align
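The rule described here can be modeled with a small standalone C++ sketch. It is a toy model of the decision only, with hypothetical names (`ToyLoad`, `alignToRematerialize`), and it does not use or reflect the actual EarlyCSE code. It follows the rule as worded in the thread: rematerialize an assumption when the removed load carried !align, unless the load that replaces it already has !align. Whether the real patch also compares the two alignment values is not stated here, so the sketch does not.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a pointer load. Align == 0 means "no !align metadata".
struct ToyLoad {
  uint64_t Align;
};

// Models CSE replacing Removed with Kept. Returns the alignment for which
// an llvm.assume would be rematerialized, or 0 if none is needed.
inline uint64_t alignToRematerialize(const ToyLoad &Kept,
                                     const ToyLoad &Removed) {
  if (Removed.Align == 0)
    return 0; // The removed load had no alignment fact to lose.
  if (Kept.Align != 0)
    return 0; // The kept load already has !align: skip the assume.
  return Removed.Align;
}
```

Under this rule, code where nearly every pointer load already carries !align (the Rust case mentioned above) would get no extra assumes, while a load without !align that replaces one with `!align 8` would get a single rematerialized assumption.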


goldsteinn · 2024-09-18T15:26:49Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

+
+      // Try to fold alignment assumption into a load's !align metadata, if the
+      // assumption is valid in the load's context.
+      if (OBU.getTagName() == "align" && OBU.Inputs.size() == 2) {


Is there a reason to only use the alignment assumption and not just add support for this in computeKnownBits and use that?


#108961) Missing information about begin and end pointers of std::vector can lead to missed optimizations in LLVM. This patch adds alignment assumptions at the point where the begin and end pointers are loaded. If the pointers would not have the same alignment, end might never get hit when incrementing begin. See #101372 for a discussion of missed range check optimizations in hardened mode. Once #108958 lands, the created `llvm.assume` calls for the alignment should be folded into the `load` instructions, resulting in no extra instructions after InstCombine. Co-authored-by: Louis Dionne <[email protected]>


fhahn requested review from dtcxzyw and goldsteinn September 17, 2024 11:02

fhahn requested a review from nikic as a code owner September 17, 2024 11:02

llvmbot added the llvm:transforms label Sep 17, 2024

fhahn mentioned this pull request Sep 17, 2024

[libc++] Add assumption for align of begin and end pointers of vector. #108961

Merged

This was referenced Sep 17, 2024

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

pre-commit: PR108958 dtcxzyw/llvm-opt-benchmark#1320

Closed

dtcxzyw requested changes Sep 17, 2024

nikic reviewed Sep 17, 2024

fhahn force-pushed the ic-fold-alignment-assumption-into-load branch from 2de8174 to a7ad32d Compare September 18, 2024 10:38

dtcxzyw mentioned this pull request Sep 18, 2024

pre-commit: PR108958 dtcxzyw/llvm-opt-benchmark#1325

Closed

fhahn added 7 commits September 18, 2024 14:37

!fix formatting.

e33e425

!fixup check alignment is ConstantInt.

b050881

!fixup check align type size.

6f514d0

!fixup use getKnowledgeFromBundle

5c1a12d

!fixup update test check

415a83a

[EarlyCSE] Rematerialize alignment assumption.

ca75fc2

fhahn force-pushed the ic-fold-alignment-assumption-into-load branch from a7ad32d to ca75fc2 Compare September 18, 2024 13:38

dtcxzyw mentioned this pull request Sep 18, 2024

pre-commit: PR108958 dtcxzyw/llvm-opt-benchmark#1330

Closed

goldsteinn reviewed Sep 18, 2024

fhahn mentioned this pull request Sep 20, 2024

[EarlyCSE] Rematerialize alignment assumption. #109131

Draft

fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 15, 2025

[InstCombine] Fold align assume into load's !align metadata if possible

8239838

llvm#108958

fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 15, 2025

[InstCombine] Fold align assume into load's !align metadata if possible

4125a04

llvm#108958

fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 16, 2025

[InstCombine] Fold align assume into load's !align metadata if possible

107a57d

llvm#108958

fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 16, 2025

[InstCombine] Fold align assume into load's !align metadata if possible

1e2b44f

llvm#108958
