Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ValueLattice] Reset the count of range extensions when merging an undef with a constant range. #77307

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Jan 8, 2024

This patch resets NumRangeExtensions to zero when merging an undef with a constant range.
This change also matches the behavior of markConstantRange:

assert(isUnknown() || isUndef());
NumRangeExtensions = 0;
Tag = NewTag;
new (&Range) ConstantRange(std::move(NewR));
return true;

Fixes a regression caused by #76295.
See also dtcxzyw/llvm-opt-benchmark#36 (comment).

Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=0e4a38018a7228d93d72a31d9fae6855f866dded&to=b599e2e59104ef7c3d230d9f7b7ebf7fed56d48d&stat=instructions%3Au

stage1-O3 stage1-ReleaseThinLTO stage1-ReleaseLTO-g stage1-O0-g stage2-O3 stage2-O0-g stage2-clang
-0.01% +0.01% +0.02% +0.03% +0.02% -0.03% -0.01%

@llvmbot
Copy link
Member

llvmbot commented Jan 8, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch resets NumRangeExtensions to zero when merging an undef with a constant range.
This change also matches the behavior of markConstantRange:

assert(isUnknown() || isUndef());
NumRangeExtensions = 0;
Tag = NewTag;
new (&Range) ConstantRange(std::move(NewR));
return true;

Fixes a regression caused by #76295.
See also dtcxzyw/llvm-opt-benchmark#36 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/77307.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueLattice.h (+1)
  • (added) llvm/test/Transforms/SCCP/widen-steps-limit.ll (+91)
diff --git a/llvm/include/llvm/Analysis/ValueLattice.h b/llvm/include/llvm/Analysis/ValueLattice.h
index 2898cdd3d7b0ca..196d7d11e13da4 100644
--- a/llvm/include/llvm/Analysis/ValueLattice.h
+++ b/llvm/include/llvm/Analysis/ValueLattice.h
@@ -406,6 +406,7 @@ class ValueLatticeElement {
     if (isUnknown()) {
       assert(!RHS.isUnknown() && "Unknow RHS should be handled earlier");
       *this = RHS;
+      NumRangeExtensions = 0;
       return true;
     }
 
diff --git a/llvm/test/Transforms/SCCP/widen-steps-limit.ll b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
new file mode 100644
index 00000000000000..98224c4d495705
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
@@ -0,0 +1,91 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=sccp < %s | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: define void @test() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = tail call i8 @geti8()
+; CHECK-NEXT:    [[CONV_I:%.*]] = zext i8 [[TMP0]] to i32
+; CHECK-NEXT:    [[CMP19_I_NOT:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP19_I_NOT]], label [[WHILE_END_I_THREAD:%.*]], label [[EXIT2:%.*]]
+; CHECK:       while.end.i.thread:
+; CHECK-NEXT:    [[TMP1:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
+; CHECK-NEXT:    br label [[EXIT2]]
+; CHECK:       exit2:
+; CHECK-NEXT:    [[CMP_I12_I4:%.*]] = phi i32 [ [[TMP2]], [[WHILE_END_I_THREAD]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[CMP:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN6:%.*]], label [[IF_END_I_I:%.*]]
+; CHECK:       if.then6:
+; CHECK-NEXT:    [[CMP_I32:%.*]] = icmp ugt i32 [[CMP_I12_I4]], [[CONV_I]]
+; CHECK-NEXT:    br i1 [[CMP_I32]], label [[EXIT:%.*]], label [[COND_FALSE_I:%.*]]
+; CHECK:       cond.false.i:
+; CHECK-NEXT:    [[CMP_NOT6_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP_NOT6_I_I]], label [[EXIT]], label [[FOR_BODY_PREHEADER_I_I:%.*]]
+; CHECK:       for.body.preheader.i.i:
+; CHECK-NEXT:    [[CMP_NOT_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br label [[EXIT]]
+; CHECK:       exit:
+; CHECK-NEXT:    [[RETVAL_SROA_3_0_I:%.*]] = phi i32 [ 0, [[IF_THEN6]] ], [ 1, [[COND_FALSE_I]] ], [ 1, [[FOR_BODY_PREHEADER_I_I]] ]
+; CHECK-NEXT:    br i1 [[CMP19_I_NOT]], label [[IF_END_I_I]], label [[DO_BODY_I_I:%.*]]
+; CHECK:       do.body.i.i:
+; CHECK-NEXT:    br label [[IF_END_I_I]]
+; CHECK:       if.end.i.i:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %0 = tail call i8 @geti8()
+  %conv.i = zext i8 %0 to i32
+  %cmp19.i.not = tail call i1 @geti1()
+  br i1 %cmp19.i.not, label %while.end.i.thread, label %exit2
+
+while.end.i.thread:                               ; preds = %entry
+  %1 = tail call i1 @geti1()
+  %2 = zext i1 %1 to i32
+  br label %exit2
+
+exit2:                                            ; preds = %while.end.i.thread, %entry
+  %cmp.i12.i4 = phi i32 [ %2, %while.end.i.thread ], [ 0, %entry ]
+  %cmp = tail call i1 @geti1()
+  br i1 %cmp, label %if.then6, label %if.end.i.i
+
+if.then6:                                         ; preds = %exit2
+  %cmp.i32 = icmp ugt i32 %cmp.i12.i4, %conv.i
+  br i1 %cmp.i32, label %exit, label %cond.false.i
+
+cond.false.i:                                     ; preds = %if.then6
+  %cmp.not6.i.i = tail call i1 @geti1()
+  br i1 %cmp.not6.i.i, label %exit, label %for.body.preheader.i.i
+
+for.body.preheader.i.i:                           ; preds = %cond.false.i
+  %cmp.not.i.i = tail call i1 @geti1()
+  br label %exit
+
+exit:                                             ; preds = %for.body.preheader.i.i, %cond.false.i, %if.then6
+  %retval.sroa.3.0.i = phi i32 [ 0, %if.then6 ], [ 1, %cond.false.i ], [ 1, %for.body.preheader.i.i ]
+  br i1 %cmp19.i.not, label %if.end.i.i, label %do.body.i.i
+
+do.body.i.i:                                      ; preds = %exit, %do.cond.i.i
+  %result.sroa.7.0.i.i = phi i32 [ %result.sroa.7.1.i.i, %do.cond.i.i ], [ %retval.sroa.3.0.i, %exit ]
+  switch i32 %result.sroa.7.0.i.i, label %do.cond.i.i [
+  i32 2, label %sw.bb.i.i
+  i32 1, label %if.end.i.i
+  i32 0, label %if.end.i.i
+  ]
+
+sw.bb.i.i:                                        ; preds = %do.body.i.i
+  %3 = tail call i32 @geti32()
+  br label %do.cond.i.i
+
+do.cond.i.i:                                      ; preds = %sw.bb.i.i, %do.body.i.i
+  %result.sroa.7.1.i.i = phi i32 [ %result.sroa.7.0.i.i, %do.body.i.i ], [ %3, %sw.bb.i.i ]
+  %cmp22.i.i = tail call i1 @geti1()
+  br i1 %cmp22.i.i, label %do.body.i.i, label %if.end.i.i
+
+if.end.i.i:                                       ; preds = %do.cond.i.i, %do.body.i.i, %do.body.i.i, %exit, %exit2
+  ret void
+}
+
+declare i8 @geti8()
+declare i32 @geti32()
+declare i1 @geti1()

@llvmbot
Copy link
Member

llvmbot commented Jan 8, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch resets NumRangeExtensions to zero when merging an undef with a constant range.
This change also matches the behavior of markConstantRange:

assert(isUnknown() || isUndef());
NumRangeExtensions = 0;
Tag = NewTag;
new (&Range) ConstantRange(std::move(NewR));
return true;

Fixes a regression caused by #76295.
See also dtcxzyw/llvm-opt-benchmark#36 (comment).


Full diff: https://github.com/llvm/llvm-project/pull/77307.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueLattice.h (+1)
  • (added) llvm/test/Transforms/SCCP/widen-steps-limit.ll (+91)
diff --git a/llvm/include/llvm/Analysis/ValueLattice.h b/llvm/include/llvm/Analysis/ValueLattice.h
index 2898cdd3d7b0ca..196d7d11e13da4 100644
--- a/llvm/include/llvm/Analysis/ValueLattice.h
+++ b/llvm/include/llvm/Analysis/ValueLattice.h
@@ -406,6 +406,7 @@ class ValueLatticeElement {
     if (isUnknown()) {
       assert(!RHS.isUnknown() && "Unknow RHS should be handled earlier");
       *this = RHS;
+      NumRangeExtensions = 0;
       return true;
     }
 
diff --git a/llvm/test/Transforms/SCCP/widen-steps-limit.ll b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
new file mode 100644
index 00000000000000..98224c4d495705
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/widen-steps-limit.ll
@@ -0,0 +1,91 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=sccp < %s | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: define void @test() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TMP0:%.*]] = tail call i8 @geti8()
+; CHECK-NEXT:    [[CONV_I:%.*]] = zext i8 [[TMP0]] to i32
+; CHECK-NEXT:    [[CMP19_I_NOT:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP19_I_NOT]], label [[WHILE_END_I_THREAD:%.*]], label [[EXIT2:%.*]]
+; CHECK:       while.end.i.thread:
+; CHECK-NEXT:    [[TMP1:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
+; CHECK-NEXT:    br label [[EXIT2]]
+; CHECK:       exit2:
+; CHECK-NEXT:    [[CMP_I12_I4:%.*]] = phi i32 [ [[TMP2]], [[WHILE_END_I_THREAD]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-NEXT:    [[CMP:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN6:%.*]], label [[IF_END_I_I:%.*]]
+; CHECK:       if.then6:
+; CHECK-NEXT:    [[CMP_I32:%.*]] = icmp ugt i32 [[CMP_I12_I4]], [[CONV_I]]
+; CHECK-NEXT:    br i1 [[CMP_I32]], label [[EXIT:%.*]], label [[COND_FALSE_I:%.*]]
+; CHECK:       cond.false.i:
+; CHECK-NEXT:    [[CMP_NOT6_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br i1 [[CMP_NOT6_I_I]], label [[EXIT]], label [[FOR_BODY_PREHEADER_I_I:%.*]]
+; CHECK:       for.body.preheader.i.i:
+; CHECK-NEXT:    [[CMP_NOT_I_I:%.*]] = tail call i1 @geti1()
+; CHECK-NEXT:    br label [[EXIT]]
+; CHECK:       exit:
+; CHECK-NEXT:    [[RETVAL_SROA_3_0_I:%.*]] = phi i32 [ 0, [[IF_THEN6]] ], [ 1, [[COND_FALSE_I]] ], [ 1, [[FOR_BODY_PREHEADER_I_I]] ]
+; CHECK-NEXT:    br i1 [[CMP19_I_NOT]], label [[IF_END_I_I]], label [[DO_BODY_I_I:%.*]]
+; CHECK:       do.body.i.i:
+; CHECK-NEXT:    br label [[IF_END_I_I]]
+; CHECK:       if.end.i.i:
+; CHECK-NEXT:    ret void
+;
+entry:
+  %0 = tail call i8 @geti8()
+  %conv.i = zext i8 %0 to i32
+  %cmp19.i.not = tail call i1 @geti1()
+  br i1 %cmp19.i.not, label %while.end.i.thread, label %exit2
+
+while.end.i.thread:                               ; preds = %entry
+  %1 = tail call i1 @geti1()
+  %2 = zext i1 %1 to i32
+  br label %exit2
+
+exit2:                                            ; preds = %while.end.i.thread, %entry
+  %cmp.i12.i4 = phi i32 [ %2, %while.end.i.thread ], [ 0, %entry ]
+  %cmp = tail call i1 @geti1()
+  br i1 %cmp, label %if.then6, label %if.end.i.i
+
+if.then6:                                         ; preds = %exit2
+  %cmp.i32 = icmp ugt i32 %cmp.i12.i4, %conv.i
+  br i1 %cmp.i32, label %exit, label %cond.false.i
+
+cond.false.i:                                     ; preds = %if.then6
+  %cmp.not6.i.i = tail call i1 @geti1()
+  br i1 %cmp.not6.i.i, label %exit, label %for.body.preheader.i.i
+
+for.body.preheader.i.i:                           ; preds = %cond.false.i
+  %cmp.not.i.i = tail call i1 @geti1()
+  br label %exit
+
+exit:                                             ; preds = %for.body.preheader.i.i, %cond.false.i, %if.then6
+  %retval.sroa.3.0.i = phi i32 [ 0, %if.then6 ], [ 1, %cond.false.i ], [ 1, %for.body.preheader.i.i ]
+  br i1 %cmp19.i.not, label %if.end.i.i, label %do.body.i.i
+
+do.body.i.i:                                      ; preds = %exit, %do.cond.i.i
+  %result.sroa.7.0.i.i = phi i32 [ %result.sroa.7.1.i.i, %do.cond.i.i ], [ %retval.sroa.3.0.i, %exit ]
+  switch i32 %result.sroa.7.0.i.i, label %do.cond.i.i [
+  i32 2, label %sw.bb.i.i
+  i32 1, label %if.end.i.i
+  i32 0, label %if.end.i.i
+  ]
+
+sw.bb.i.i:                                        ; preds = %do.body.i.i
+  %3 = tail call i32 @geti32()
+  br label %do.cond.i.i
+
+do.cond.i.i:                                      ; preds = %sw.bb.i.i, %do.body.i.i
+  %result.sroa.7.1.i.i = phi i32 [ %result.sroa.7.0.i.i, %do.body.i.i ], [ %3, %sw.bb.i.i ]
+  %cmp22.i.i = tail call i1 @geti1()
+  br i1 %cmp22.i.i, label %do.body.i.i, label %if.end.i.i
+
+if.end.i.i:                                       ; preds = %do.cond.i.i, %do.body.i.i, %do.body.i.i, %exit, %exit2
+  ret void
+}
+
+declare i8 @geti8()
+declare i32 @geti32()
+declare i1 @geti1()

@dtcxzyw dtcxzyw linked an issue Jan 8, 2024 that may be closed by this pull request
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jan 8, 2024
@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jan 21, 2024

Ping.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really clear to me that this change is right. How NumRangeExtensions should behave when merging lattice values isn't clear to me. The proposed behavior is asymmetric -- but maybe that's okay?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jan 27, 2024

@fhahn Any comments?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jan 31, 2024

Ping @fhahn

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So previously when merging in the first other lattice value, it would re-use that lattice value's NumRangeExtension, whereas now it gets set to 0. This should guarantee that we still converge as it's only set when merging with Unknown, but possibly slower.

In particular consumer-typeset's compile-time seems to noticeably increase. Do you have any insight why that is and if we could do better? It looks like there's also a notable size change for consumer-typeset, do you know if those are an improvement?

; CHECK: if.end.i.i:
; CHECK-NEXT: ret void
;
entry:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to reduce the test a bit further?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update diff January 8th 2024, 1:11:37 pm
4 participants