Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What if we always run ReorderBasicBlocks? #139162

Closed
wants to merge 1 commit into from

Conversation

scottmcm
Copy link
Member

Obviously we don't need to run this, but I keep being sad when I look at MIR on the playground that it's really awkward to read because you have to jump around all over because whenever something's inlined it's put at the end.

So what if we just do it when we're optimizing anyway? Can we afford it?

@rustbot
Copy link
Collaborator

rustbot commented Mar 31, 2025

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 31, 2025
@scottmcm
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 31, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2025
What if we always run `ReorderBasicBlocks`?

Obviously we don't *need* to run this, but I keep being sad when I look at MIR on the playground that it's really awkward to read because you have to jump around all over because whenever something's inlined it's put at the end.

So what if we just do it when we're optimizing anyway?  Can we afford it?
@bors
Copy link
Collaborator

bors commented Mar 31, 2025

⌛ Trying commit 701d9fa with merge c180af9...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Mar 31, 2025

☀️ Try build successful - checks-actions
Build commit: c180af9 (c180af9fa83e619e3e7b97a8145403557f8e9c0e)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (c180af9): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
0.4% [0.1%, 0.5%] 16
Regressions ❌
(secondary)
0.1% [0.1%, 0.1%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [0.1%, 0.5%] 16

Max RSS (memory usage)

Results (primary -1.7%, secondary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.4% [2.4%, 2.4%] 1
Improvements ✅
(primary)
-1.7% [-1.7%, -1.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.7% [-1.7%, -1.7%] 1

Cycles

Results (secondary 2.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.3% [2.1%, 2.4%] 2
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

Results (primary -0.0%, secondary 0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 8
Regressions ❌
(secondary)
0.5% [0.2%, 0.8%] 6
Improvements ✅
(primary)
-0.0% [-0.1%, -0.0%] 10
Improvements ✅
(secondary)
-0.0% [-0.1%, -0.0%] 4
All ❌✅ (primary) -0.0% [-0.1%, 0.0%] 18

Bootstrap: 776.099s -> 775.714s (-0.05%)
Artifact size: 365.92 MiB -> 365.95 MiB (0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 31, 2025
@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2025
}

fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
let rpo: IndexVec<BasicBlock, BasicBlock> =
body.basic_blocks.reverse_postorder().iter().copied().collect();
let rpo = IndexSlice::from_raw(body.basic_blocks.reverse_postorder());
if rpo.iter().is_sorted() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "is_sorted" method also exists on slices which might be faster

@scottmcm
Copy link
Member Author

scottmcm commented Apr 1, 2025

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 1, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 1, 2025
What if we always run `ReorderBasicBlocks`?

Obviously we don't *need* to run this, but I keep being sad when I look at MIR on the playground that it's really awkward to read because you have to jump around all over because whenever something's inlined it's put at the end.

So what if we just do it when we're optimizing anyway?  Can we afford it?
@bors
Copy link
Collaborator

bors commented Apr 1, 2025

⌛ Trying commit 2bf23e3 with merge f4b899e...

@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-18 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
#19 exporting to docker image format
#19 sending tarball 19.6s done
#19 DONE 70.3s
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-18]
[CI_JOB_NAME=x86_64-gnu-llvm-18]
debug: `DISABLE_CI_RUSTC_IF_INCOMPATIBLE` configured.
---
sccache: Listening on address 127.0.0.1:4226
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-18', '--enable-llvm-link-shared', '--set', 'rust.randomize-layout=true', '--set', 'rust.thin-lto-import-instr-limit=10', '--set', 'build.print-step-timings', '--enable-verbose-tests', '--set', 'build.metrics', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'rust.lld=false', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: build.build          := x86_64-unknown-linux-gnu
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-18/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.randomize-layout := True
configure: rust.thin-lto-import-instr-limit := 10
---
  Number of decisions:   4447
  longest path:          1159 (code:    152)
  longest backtrack:       66 (code:    428)
Shared 86733 out of 152951 states by creating 14756 new states, saving 71977
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/expmed.cc: In function ‘rtx_def* extract_bit_field_1(rtx, poly_uint64, poly_uint64, int, rtx, machine_mode, machine_mode, bool, bool, rtx_def**)’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/expmed.cc:1864:45: warning: ‘*(unsigned int*)((char*)&imode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ may be used uninitialized [-Wmaybe-uninitialized]
 1864 |       rtx sub = extract_bit_field_as_subreg (mode1, op0, imode,
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
 1865 |                                              bitsize, bitnum);
      |                                              ~~~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/expmed.cc:1824:19: note: ‘*(unsigned int*)((char*)&imode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))’ was declared here
 1824 |   scalar_int_mode imode;
      |                   ^~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/gimple-range-gori.cc: In member function ‘void range_def_chain::dump(FILE*, basic_block, const char*)’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/gimple-range-gori.cc:319:19: warning: format not a string literal and no format arguments [-Wformat-security]
  319 |           fprintf (f, prefix);
      |           ~~~~~~~~^~~~~~~~~~~
---
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/gcc.cc:7930:9: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 7930 |   write (fd, "\n\n", 2);
      |   ~~~~~~^~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/gcc.cc: In member function ‘void driver::final_actions() const’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/gcc.cc:9307:13: warning: ignoring return value of ‘int truncate(const char*, __off_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 9307 |     truncate(totruncate_file, 0);
      |     ~~~~~~~~^~~~~~~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto-wrapper.cc: In function ‘bool find_and_merge_options(int, off_t, const char*, vec<cl_decoded_option>, bool, vec<cl_decoded_option>*, const char*)’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto-wrapper.cc:1165:8: warning: ignoring return value of ‘ssize_t read(int, void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 1165 |   read (fd, data, length);
      |   ~~~~~^~~~~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto/lto-common.cc: In function ‘void lto_resolution_read(splay_tree, FILE*, lto_file*)’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto/lto-common.cc:2091:10: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 2091 |   fscanf (resolution, " ");   /* Read white space.  */
      |   ~~~~~~~^~~~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto/lto-common.cc:2093:9: warning: ignoring return value of ‘size_t fread(void*, size_t, size_t, FILE*)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 2093 |   fread (obj_name, sizeof (char), name_len, resolution);
      |   ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/lto/lto-common.cc:2113:10: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
 2113 |   fscanf (resolution, "%u", &num_symbols);
      |   ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/gcc/jit/jit-recording.cc:32:
---
Applying io_quotes_use            to linux/blkzoned.h
Applying io_quotes_use            to linux/ipmi.h
Applying io_quotes_use            to linux/psp-dbc.h
Applying io_quotes_use            to linux/bt-bmc.h
Applying io_quotes_use            to linux/tps6594_pfsm.h
Applying io_quotes_use            to linux/cxl_mem.h
Applying io_quotes_use            to linux/wmi.h
Applying io_quotes_use            to linux/auto_fs.h
Applying io_quotes_use            to linux/mmtimer.h
Applying io_quotes_use            to linux/f2fs.h
Applying io_quotes_use            to linux/vhost.h
---
Applying machine_name             to x86_64-linux-gnu/bits/unistd_ext.h
Applying io_quotes_use            to x86_64-linux-gnu/asm/mtrr.h
Applying io_quotes_use            to x86_64-linux-gnu/asm/amd_hsmp.h
Applying machine_name             to openssl/e_os2.h
Applying io_quotes_use            to drm/xe_drm.h
Applying io_quotes_use            to drm/radeon_drm.h
Applying io_quotes_use            to drm/panfrost_drm.h
Applying io_quotes_use            to drm/etnaviv_drm.h
Applying io_quotes_use            to drm/lima_drm.h
Applying io_quotes_use            to drm/qaic_accel.h
Applying io_quotes_use            to drm/vc4_drm.h
Applying io_quotes_use            to drm/i915_drm.h
Applying io_quotes_use            to drm/omap_drm.h
Applying io_quotes_use            to drm/pvr_drm.h
Applying io_quotes_use            to drm/amdgpu_drm.h
Applying io_quotes_use            to drm/vgem_drm.h
Applying io_quotes_use            to drm/msm_drm.h
Applying io_quotes_use            to drm/v3d_drm.h
Applying io_quotes_use            to drm/exynos_drm.h
Applying io_quotes_use            to drm/nouveau_drm.h
Applying io_quotes_use            to drm/drm.h
Applying io_quotes_use            to drm/habanalabs_accel.h
Applying io_quotes_use            to drm/tegra_drm.h
Applying io_quotes_use            to rdma/rdma_user_ioctl.h
cc1: note: self-tests are not enabled in this build
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/c++tools/server.cc: In function ‘void server(bool, int, module_resolver*)’:
/checkout/obj/build/x86_64-unknown-linux-gnu/gcc/src/c++tools/server.cc:620:10: warning: ignoring return value of ‘int pipe(int*)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
---
---- [ui] tests/ui/stable-mir-print/basic_function.rs stdout ----
Saved the actual stdout to "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/stable-mir-print/basic_function/basic_function.stdout"
diff of stdout:

44     let mut _0: u8;
45     debug input => _1;
46     bb0: {
-         switchInt(_1) -> [0: bb4, 1: bb3, 2: bb2, otherwise: bb1];
+         switchInt(_1) -> [0: bb1, 1: bb2, 2: bb3, otherwise: bb4];
48     }
49     bb1: {
-         _0 = 0_u8;
+         _0 = 10_u8;
51         goto -> bb5;
---
To only update this specific test, also pass `--test-args stable-mir-print/basic_function.rs`

error: 1 errors occurred comparing output.
status: exit status: 0
command: env -u RUSTC_LOG_COLOR RUSTC_ICE="0" RUST_BACKTRACE="short" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/tests/ui/stable-mir-print/basic_function.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=/cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=/checkout/vendor" "--sysroot" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2" "--target=x86_64-unknown-linux-gnu" "--check-cfg" "cfg(test,FALSE)" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/stable-mir-print/basic_function" "-A" "unused" "-A" "internal_features" "-Crpath" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-Z" "unpretty=stable-mir" "-Z" "mir-opt-level=3"
--- stdout -------------------------------
// WARNING: This is highly experimental output it's intended for stable-mir developers only.
// If you find a bug or want to improve the output open a issue at https://github.com/rust-lang/project-stable-mir.
fn foo(_1: i32) -> i32 {
    let mut _0: i32;
    let mut _2: (i32, bool);
    debug i => _1;
    bb0: {
        _2 = CheckedAdd(_1, 1_i32);
        assert(!move (_2.1: bool), "attempt to compute `{} + {}`, which would overflow", _1, 1_i32) -> [success: bb1, unwind continue];
    }
    bb1: {
        _0 = move (_2.0: i32);
        return;
    }
}
fn bar(_1: &mut Vec<i32>) -> Vec<i32> {
    let mut _0: Vec<i32>;
    let mut _2: Vec<i32>;
    let mut _3: &Vec<i32>;
    let  _4: ();
    let mut _5: &mut Vec<i32>;
    debug vec => _1;
    debug new_vec => _2;
    bb0: {
        _3 = &(*_1);
        _2 = <Vec<i32> as Clone>::clone(move _3) -> [return: bb1, unwind continue];
    }
    bb1: {
        _5 = &mut _2;
        _4 = Vec::<i32>::push(move _5, 1_i32) -> [return: bb2, unwind: bb3];
    }
    bb2: {
        _0 = move _2;
        return;
    }
    bb3: {
        drop(_2) -> [return: bb4, unwind terminate];
    }
    bb4: {
        resume;
    }
}
fn demux(_1: u8) -> u8 {
    let mut _0: u8;
    debug input => _1;
    bb0: {
        switchInt(_1) -> [0: bb1, 1: bb2, 2: bb3, otherwise: bb4];
    }
    bb1: {
        _0 = 10_u8;
        goto -> bb5;
    }

@bors
Copy link
Collaborator

bors commented Apr 1, 2025

☀️ Try build successful - checks-actions
Build commit: f4b899e (f4b899e97e01615ccb215b023ded28012bffbd5e)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f4b899e): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

mean range count
Regressions ❌
(primary)
0.4% [0.1%, 1.0%] 15
Regressions ❌
(secondary)
0.3% [0.1%, 0.8%] 7
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [0.1%, 1.0%] 15

Max RSS (memory usage)

Results (secondary -2.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.6% [-2.6%, -2.6%] 1
All ❌✅ (primary) - - 0

Cycles

Results (primary 0.7%, secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.7% [0.5%, 0.7%] 3
Regressions ❌
(secondary)
2.2% [2.2%, 2.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.7% [0.5%, 0.7%] 3

Binary size

Results (primary -0.0%, secondary 0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.0% [0.0%, 0.0%] 8
Regressions ❌
(secondary)
0.5% [0.2%, 0.8%] 6
Improvements ✅
(primary)
-0.1% [-0.1%, -0.0%] 10
Improvements ✅
(secondary)
-0.0% [-0.1%, -0.0%] 4
All ❌✅ (primary) -0.0% [-0.1%, 0.0%] 18

Bootstrap: 773.955s -> 774.03s (0.01%)
Artifact size: 365.96 MiB -> 366.01 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 1, 2025
@scottmcm
Copy link
Member Author

scottmcm commented Apr 1, 2025

Yeah, I can't really fight for that perf hit when it's just to make stuff pretty.

@scottmcm scottmcm closed this Apr 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants