New MIR opt pass simplify_pow_of_two #114254

Closed
wants to merge 5 commits into from

Centri3
Member

@Centri3 Centri3 commented Jul 30, 2023

This detects calls to x.pow where x is an integer power of two. Such calls can use shl instead, with some nuances. Unfortunately, modifying pow itself results in this check being done at runtime, along with computing the power used, which is even slower than leaving pow unchanged.

Supersedes the clippy lint rust-lang/rust-clippy#11057

I haven't benchmarked this yet, but from my testing, without debug assertions the output is branchless and only a couple of instructions, so I'm 99% confident this is faster. With debug assertions I'm not sure; custom_mir doesn't really like assertions 😅

For calls where x is a power of two other than 2, the rhs for shl can be exp * <power used to get x>.
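
To make the transformation above concrete, here is a minimal sketch in plain Rust. It is not the MIR pass itself; the helper name `pow_of_two_via_shl` is hypothetical, and checked arithmetic is used only to make the overflow nuance explicit. For x = 2^k, x.pow(exp) equals 1 << (k * exp) whenever that shift fits in the type.

```rust
/// Hypothetical helper: computes `base.pow(exp)` with a shift when `base`
/// is a power of two. Returns `None` if `base` is not a power of two or
/// if the result would overflow `u32`.
fn pow_of_two_via_shl(base: u32, exp: u32) -> Option<u32> {
    if !base.is_power_of_two() {
        return None;
    }
    // base == 1 << k, so base.pow(exp) == 1 << (k * exp).
    let k = base.trailing_zeros();
    // The shift amount itself can overflow for very large exponents.
    let shift = k.checked_mul(exp)?;
    // `checked_shl` returns `None` when the shift is >= the bit width (32).
    1u32.checked_shl(shift)
}
```

With this sketch, `pow_of_two_via_shl(8, 3)` agrees with `8u32.checked_pow(3)`, and an overflowing call such as `pow_of_two_via_shl(2, 32)` yields `None` just as `checked_pow` does.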

Hope I did everything right ^^

@rustbot
Collaborator

rustbot commented Jul 30, 2023

r? @petrochenkov

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 30, 2023
@rustbot
Collaborator

rustbot commented Jul 30, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rust-cloud-vms rust-cloud-vms bot force-pushed the simplify-pow-of-two branch from 0dd63be to 100171c Compare July 30, 2023 14:35
@SkiFire13
Contributor

Unfortunately modifying pow instead results in this check being done at runtime

This isn't the first time I feel like we need some sort of "if this condition is always true after optimizations, always pick this branch; otherwise discard it completely".

@rust-cloud-vms rust-cloud-vms bot force-pushed the simplify-pow-of-two branch from 939f2bf to 6a490b0 Compare July 30, 2023 15:22
@saethlin
Member

Is the goal of this MIR transform to improve final codegen? If it is, we should have something under tests/codegen/ that tests for the improved codegen. Also if it is to improve post-LLVM codegen, why wouldn't we have LLVM handle this?

@Centri3
Member Author

Centri3 commented Jul 30, 2023

There's no reason it can't be, but there are many ways to implement pow on an integer. It'd be specialized to the point where it probably just makes most sense to be in rustc

I don't know much about LLVM though. Maybe it already optimizes its own pow intrinsic like this, I'm not sure. If it does, we can switch the implementation of pow to just use that, alongside a new pow intrinsic.

@petrochenkov
Contributor

r? @Compiler

@rustbot rustbot assigned jackh726 and unassigned petrochenkov Jul 30, 2023
@rust-cloud-vms rust-cloud-vms bot force-pushed the simplify-pow-of-two branch from 6db3f53 to bfa1165 Compare July 30, 2023 17:29
&& let Some(def_id) = func.const_fn_def().map(|def| def.0)
&& let def_path = tcx.def_path(def_id)
&& tcx.crate_name(def_path.krate) == sym::core
// FIXME(Centri3): I feel like we should do this differently...
Member

Perhaps check that it's an associated method of an integer primitive type?

Using names is usually the wrong way to identify things.

(Arguably maybe it should be a lang item if it's recognized semantically, but that would probably need one lang item per type, which would be a mess...)

Member Author

Lang items would make sense, though yeah, there'd be tons of them... I tried with diagnostic items before and it was a bit of a pain due to that.

Member

InstSimplify looks at the declaring type for one of its things, so maybe you can use something like that?

// Only bother looking more if it's easy to know what we're calling
let Some((fn_def_id, fn_args)) = func.const_fn_def() else { return };
// Clone needs one subst, so we can cheaply rule out other stuff
if fn_args.len() != 1 {
    return;
}
// These types are easily available from locals, so check that before
// doing DefId lookups to figure out what we're actually calling.
let arg_ty = args[0].ty(self.local_decls, self.tcx);
let ty::Ref(_region, inner_ty, Mutability::Not) = *arg_ty.kind() else { return };
if !inner_ty.is_trivially_pure_clone_copy() {
    return;
}

Member Author

We'd still need to make sure the method name is pow, though that should be a lot better.

Member

Contributor

Does lang require it to not be const?

No, but we could have a single generic lang item that uses trait bounds for Add, Mul and such. Then all the integer impls could call the generic lang item, and we could make our optimization work on the generic lang item. Though that would still have to happen post-inlining
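
As a rough illustration of the "single generic item" idea, here is a sketch of one generic exponentiation routine that per-type impls could forward to. Everything here is hypothetical: the `One` trait and `generic_pow` are made up for this example, and an actual lang item would need compiler support that this sketch does not show.

```rust
use std::ops::Mul;

/// Hypothetical helper trait supplying the multiplicative identity.
trait One {
    const ONE: Self;
}
impl One for u32 {
    const ONE: Self = 1;
}
impl One for u64 {
    const ONE: Self = 1;
}

/// Hypothetical generic routine (exponentiation by squaring) that every
/// integer `pow` could call, giving an optimization one item to recognize.
fn generic_pow<T>(mut base: T, mut exp: u32) -> T
where
    T: Copy + Mul<Output = T> + One,
{
    let mut acc = T::ONE;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base;
        }
        exp >>= 1;
        // Only square while bits remain, to avoid a needless overflow.
        if exp > 0 {
            base = base * base;
        }
    }
    acc
}
```

As noted in the comment above, an optimization keyed on such an item would still have to run after inlining, since callers would reach it through the per-type `pow` methods.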

Contributor

LLVM has an llvm.is.constant intrinsic, which basically does what you want is_known_to_be_const to do. If you export that intrinsic and add a code path for handling power-of-two guarded by it to the pow implementation, that should get optimized fine on the LLVM level.

Member Author

@Centri3 Centri3 Jul 31, 2023

That would still compute the power used at runtime though, afaict, since you can't use a non-constant in a const block (here that would be self, which as far as Rust is concerned isn't a constant even though it is one in practice). I believe this requires at least a pow call anyway, so if that happens at runtime it'll be slower regardless.

Contributor

I think the trick is that if you add a

if llvm_is_constant(self) {
    if self.is_power_of_two() {
         ...
    }
}

then, since LLVM knows self is a constant, it will be able to optimize self.is_power_of_two() into true or false, and then either optimize out your logic or the rest of pow depending on the result.
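
A slightly fuller sketch of that shape, under loud assumptions: `is_known_constant` below is a hypothetical stand-in for an exported llvm.is.constant intrinsic (plain Rust cannot ask this question, so the stub always returns false), and `pow2_fast_path` and `pow_guarded` are names invented for this example. Wrapping semantics are used so that both paths agree on overflow.

```rust
/// Hypothetical stand-in for an exported `llvm.is.constant` intrinsic.
/// This stub always answers `false`, so the fast path is never taken here.
#[inline(always)]
fn is_known_constant(_value: u32) -> bool {
    false
}

/// Shift-based path; only meaningful when `base` is a power of two.
/// Matches `base.wrapping_pow(exp)` for such bases.
fn pow2_fast_path(base: u32, exp: u32) -> u32 {
    debug_assert!(base.is_power_of_two());
    let k = base.trailing_zeros(); // base == 1 << k
    match k.checked_mul(exp) {
        // The result 2^(k * exp) fits in 32 bits.
        Some(shift) if shift < u32::BITS => 1u32 << shift,
        // 2^(k * exp) with k * exp >= 32 wraps to 0.
        _ => 0,
    }
}

/// Sketch of a `pow` with the guarded fast path in front of the general one.
fn pow_guarded(base: u32, exp: u32) -> u32 {
    if is_known_constant(base) && base.is_power_of_two() {
        return pow2_fast_path(base, exp);
    }
    base.wrapping_pow(exp)
}
```

The point of the guard is that, with a real intrinsic, the optimizer could fold both conditions when the base is a constant and keep only one of the two paths; with the stub above, only the general path ever runs.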

Member Author

Good news: I did that and the generated code is even better with it, and it 100% works! Will open a separate PR for that and close this one (though not rn as the intrinsic will cause rustc to segfault if called on non-i32s, will need some more work 🙃)

@rust-cloud-vms rust-cloud-vms bot force-pushed the simplify-pow-of-two branch from bfa1165 to 9b29907 Compare July 30, 2023 19:13
@cjgillot cjgillot self-assigned this Jul 30, 2023
@rust-log-analyzer
Collaborator

The job x86_64-gnu-llvm-15 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
##[group]Run git config --global core.autocrlf false
git config --global core.autocrlf false
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
  PR_CI_JOB: 1
  CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
  HEAD_SHA: f944ebadabdb2145ed436edcf4f15a392ce64d90
  SCCACHE_BUCKET: rust-lang-ci-sccache2
  TOOLSTATE_REPO: https://github.com/rust-lang-nursery/rust-toolstate
---
i................i.i....ii...........i....i...i.................iii............i.i......  88/496
......i......................ii.................i...............i.....i................. 176/496
.iiii.ii.i.....i..iiiii.......i.i.i....i.i......iiii........ii....i.ii......i..i........ 264/496
................i...ii..i.i.....ii...iii..............ii.i............................ii 352/496
........................ii...ii.iiiiii.i.....F.....F....i....i....i..iii...........i.... 440/496

failures:

---- [codegen] tests/codegen/simplify-pow-of-two.rs stdout ----

error: verification with 'FileCheck' failed
Build completed unsuccessfully in 0:11:11
command: "/usr/lib/llvm-15/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two/simplify-pow-of-two.ll" "/checkout/tests/codegen/simplify-pow-of-two.rs" "--allow-unused-prefixes" "--check-prefixes" "CHECK,NONMSVC" "--dump-input-context" "100"
--- stderr -------------------------------
/checkout/tests/codegen/simplify-pow-of-two.rs:9:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: %_01 = shl nuw i32 %_5, %0
                ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two/simplify-pow-of-two.ll:62:21: note: scanning from here
 %0 = and i32 %a, 31
                    ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two/simplify-pow-of-two.ll:63:2: note: possible intended match here
 %_0 = shl nuw i32 %_5, %0


Input file: /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two/simplify-pow-of-two.ll
Check file: /checkout/tests/codegen/simplify-pow-of-two.rs

-dump-input=help explains the following input dump.
Input was:
<<<<<<
<<<<<<
          1: ; ModuleID = 'simplify_pow_of_two.4441dec6edbdb3c8-cgu.0' 
          2: source_filename = "simplify_pow_of_two.4441dec6edbdb3c8-cgu.0" 
          3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" 
          4: target triple = "x86_64-unknown-linux-gnu" 
          5:  
          6: @vtable.0 = private unnamed_addr constant <{ ptr, [16 x i8], ptr, ptr, ptr }> <{ ptr @"_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17hac606065f3540a5eE", [16 x i8] c"\08\00\00\00\00\00\00\00\08\00\00\00\00\00\00\00", ptr @"_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h79ed192c4faaaebdE", ptr @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hfcd82d8c95673aadE", ptr @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hfcd82d8c95673aadE" }>, align 8 
          8: ; std::sys_common::backtrace::__rust_begin_short_backtrace 
          9: ; Function Attrs: noinline nonlazybind uwtable 
         10: define internal fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h495fb31c495962d1E(ptr nocapture noundef nonnull readonly %f) unnamed_addr #0 { 
         11: start: 
         12:  tail call void %f() 
         13:  tail call void asm sideeffect "", "~{memory}"() #7, !srcloc !4 
         14:  ret void 
         15: } 
         17: ; std::rt::lang_start 
         18: ; Function Attrs: nonlazybind uwtable 
         19: define hidden noundef i64 @_ZN3std2rt10lang_start17hecfdccb06885adf2E(ptr noundef nonnull %main, i64 noundef %argc, ptr noundef %argv, i8 noundef %sigpipe) unnamed_addr #1 { 
         20: start: 
         21:  %_8 = alloca ptr, align 8 
         22:  call void @llvm.lifetime.start.p0(i64 8, ptr nonnull %_8) 
         23:  store ptr %main, ptr %_8, align 8 
         24: ; call std::rt::lang_start_internal 
         25:  %0 = call noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1 %_8, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @vtable.0, i64 noundef %argc, ptr noundef %argv, i8 noundef %sigpipe) 
         26:  call void @llvm.lifetime.end.p0(i64 8, ptr nonnull %_8) 
         27:  ret i64 %0 
         28: } 
         30: ; std::rt::lang_start::{{closure}} 
         31: ; Function Attrs: inlinehint nonlazybind uwtable 
         32: define internal noundef i32 @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hfcd82d8c95673aadE"(ptr noalias nocapture noundef readonly align 8 dereferenceable(8) %_1) unnamed_addr #2 { 
         33: start: 
         34:  %_4 = load ptr, ptr %_1, align 8, !nonnull !5, !noundef !5 
         35: ; call std::sys_common::backtrace::__rust_begin_short_backtrace 
         36:  tail call fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h495fb31c495962d1E(ptr noundef nonnull %_4) 
         37:  ret i32 0 
         38: } 
         40: ; core::ops::function::FnOnce::call_once{{vtable.shim}} 
         41: ; Function Attrs: inlinehint nonlazybind uwtable 
         42: define internal noundef i32 @"_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h79ed192c4faaaebdE"(ptr nocapture noundef readonly %_1) unnamed_addr #2 personality ptr @rust_eh_personality { 
         43: start: 
         44:  %0 = load ptr, ptr %_1, align 8, !nonnull !5, !noundef !5 
         45: ; call std::sys_common::backtrace::__rust_begin_short_backtrace 
         46:  tail call fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h495fb31c495962d1E(ptr noundef nonnull %0), !noalias !6 
         47:  ret i32 0 
         48: } 
         49:  
         50: ; core::ptr::drop_in_place<std::rt::lang_start<()>::{{closure}}> 
         51: ; Function Attrs: inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable 
         52: define internal void @"_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17hac606065f3540a5eE"(ptr noalias nocapture readnone align 8 %_1) unnamed_addr #3 { 
         54:  ret void 
         55: } 
         56:  
         57: ; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable 
         58: define dso_local noundef i32 @slow_2_u(i32 noundef %a) unnamed_addr #4 { 
         59: start: 
         60:  %_3 = icmp ult i32 %a, 32 
         61:  %_5 = zext i1 %_3 to i32 
         62:  %0 = and i32 %a, 31 
next:9'0                         X error: no match found
         63:  %_0 = shl nuw i32 %_5, %0 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:9'1      ?                          possible intended match
         64:  ret i32 %_0 
next:9'0     ~~~~~~~~~~~~~
         65: } 
next:9'0     ~~
         66:  
next:9'0     ~
         67: ; simplify_pow_of_two::main 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         68: ; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         69: define internal void @_ZN19simplify_pow_of_two4main17hf9c2c69c8550fc4eE() unnamed_addr #4 { 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         70: start: 
next:9'0     ~~~~~~~
         71:  ret void 
next:9'0     ~~~~~~~~~~
         72: } 
next:9'0     ~~
         73:  
next:9'0     ~
         74: ; std::rt::lang_start_internal 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         75: ; Function Attrs: nonlazybind uwtable 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         76: declare noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1, ptr noalias noundef readonly align 8 dereferenceable(24), i64 noundef, ptr noundef, i8 noundef) unnamed_addr #1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         77:  
next:9'0     ~
         78: ; Function Attrs: nonlazybind uwtable 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         79: declare noundef i32 @rust_eh_personality(i32 noundef, i32 noundef, i64 noundef, ptr noundef, ptr noundef) unnamed_addr #1 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         80:  
next:9'0     ~
         81: ; Function Attrs: nonlazybind 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         82: define i32 @main(i32 %0, ptr %1) unnamed_addr #5 { 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         83: top: 
next:9'0     ~~~~~
         84:  %_8.i = alloca ptr, align 8 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         85:  %2 = sext i32 %0 to i64 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
         86:  call void @llvm.lifetime.start.p0(i64 8, ptr nonnull %_8.i) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         87:  store ptr @_ZN19simplify_pow_of_two4main17hf9c2c69c8550fc4eE, ptr %_8.i, align 8 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         88: ; call std::rt::lang_start_internal 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         89:  %3 = call noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1 %_8.i, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @vtable.0, i64 noundef %2, ptr noundef %1, i8 noundef 0) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         90:  call void @llvm.lifetime.end.p0(i64 8, ptr nonnull %_8.i) 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         91:  %4 = trunc i64 %3 to i32 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
         92:  ret i32 %4 
next:9'0     ~~~~~~~~~~~~
         93: } 
next:9'0     ~~
         94:  
next:9'0     ~
         95: ; Function Attrs: argmemonly mustprogress nocallback nofree nosync nounwind willreturn 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         96: declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #6 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         97:  
next:9'0     ~
         98: ; Function Attrs: argmemonly mustprogress nocallback nofree nosync nounwind willreturn 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         99: declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #6 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        100:  
next:9'0     ~
        101: attributes #0 = { noinline nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        102: attributes #1 = { nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        103: attributes #2 = { inlinehint nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        104: attributes #3 = { inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        105: attributes #4 = { mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        106: attributes #5 = { nonlazybind "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        107: attributes #6 = { argmemonly mustprogress nocallback nofree nosync nounwind willreturn } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        108: attributes #7 = { nounwind } 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        109:  
next:9'0     ~
        110: !llvm.module.flags = !{!0, !1, !2} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        111: !llvm.ident = !{!3} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~
        112:  
next:9'0     ~
        113: !0 = !{i32 7, !"PIC Level", i32 2} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        114: !1 = !{i32 7, !"PIE Level", i32 2} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        115: !2 = !{i32 2, !"RtLibUseGOT", i32 1} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        116: !3 = !{!"rustc version 1.73.0-nightly (228a5fa31 2023-07-30)"} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        117: !4 = !{i32 1254106} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~
        118: !5 = !{} 
next:9'0     ~~~~~~~~~
        119: !6 = !{!7} 
next:9'0     ~~~~~~~~~~~
        120: !7 = distinct !{!7, !8, !"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hfcd82d8c95673aadE: %_1"} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        121: !8 = distinct !{!8, !"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hfcd82d8c95673aadE"} 
next:9'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
------------------------------------------



---- [codegen] tests/codegen/simplify-pow-of-two-debug-assertions.rs stdout ----

error: verification with 'FileCheck' failed
status: exit status: 1
command: "/usr/lib/llvm-15/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two-debug-assertions/simplify-pow-of-two-debug-assertions.ll" "/checkout/tests/codegen/simplify-pow-of-two-debug-assertions.rs" "--allow-unused-prefixes" "--check-prefixes" "CHECK,NONMSVC" "--dump-input-context" "100"
--- stderr -------------------------------
/checkout/tests/codegen/simplify-pow-of-two-debug-assertions.rs:10:17: error: CHECK-NEXT: expected string not found in input
 // CHECK-NEXT: %_01 = shl nuw i32 1, %a
                ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two-debug-assertions/simplify-pow-of-two-debug-assertions.ll:66:5: note: scanning from here
bb1: ; preds = %start
    ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two-debug-assertions/simplify-pow-of-two-debug-assertions.ll:67:2: note: possible intended match here
 %_0 = shl nuw i32 1, %a


Input file: /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/simplify-pow-of-two-debug-assertions/simplify-pow-of-two-debug-assertions.ll
Check file: /checkout/tests/codegen/simplify-pow-of-two-debug-assertions.rs

-dump-input=help explains the following input dump.
Input was:
<<<<<<
           1: ; ModuleID = 'simplify_pow_of_two_debug_assertions.56b4e7946dc6454d-cgu.0' 
           2: source_filename = "simplify_pow_of_two_debug_assertions.56b4e7946dc6454d-cgu.0" 
           3: target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" 
           4: target triple = "x86_64-unknown-linux-gnu" 
           5:  
           6: @vtable.0 = private unnamed_addr constant <{ ptr, [16 x i8], ptr, ptr, ptr }> <{ ptr @"_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h12e2afd066c01604E", [16 x i8] c"\08\00\00\00\00\00\00\00\08\00\00\00\00\00\00\00", ptr @"_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h98dfb8cb00ea3cd2E", ptr @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hc266d34bcea1ca86E", ptr @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hc266d34bcea1ca86E" }>, align 8 
           7: @alloc_49b224c80e92665d667c35ff8791be32 = private unnamed_addr constant <{ [63 x i8] }> <{ [63 x i8] c"/checkout/tests/codegen/simplify-pow-of-two-debug-assertions.rs" }>, align 1 
           8: @alloc_e63a59136106790b997eb6d19694cd9b = private unnamed_addr constant <{ ptr, [16 x i8] }> <{ ptr @alloc_49b224c80e92665d667c35ff8791be32, [16 x i8] c"?\00\00\00\00\00\00\00\0E\00\00\00\05\00\00\00" }>, align 8 
           9: @str.1 = internal constant [33 x i8] c"attempt to multiply with overflow" 
          11: ; std::sys_common::backtrace::__rust_begin_short_backtrace 
          12: ; Function Attrs: noinline nonlazybind uwtable 
          13: define internal fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h0bbff0889484b1b9E(ptr nocapture noundef nonnull readonly %f) unnamed_addr #0 { 
          14: start: 
          15:  tail call void %f() 
          16:  tail call void asm sideeffect "", "~{memory}"() #8, !srcloc !4 
          17:  ret void 
          18: } 
          20: ; std::rt::lang_start 
          21: ; Function Attrs: nonlazybind uwtable 
          22: define hidden noundef i64 @_ZN3std2rt10lang_start17h2c12d5881901a214E(ptr noundef nonnull %main, i64 noundef %argc, ptr noundef %argv, i8 noundef %sigpipe) unnamed_addr #1 { 
          23: start: 
          24:  %_8 = alloca ptr, align 8 
          25:  call void @llvm.lifetime.start.p0(i64 8, ptr nonnull %_8) 
          26:  store ptr %main, ptr %_8, align 8 
          27: ; call std::rt::lang_start_internal 
          28:  %0 = call noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1 %_8, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @vtable.0, i64 noundef %argc, ptr noundef %argv, i8 noundef %sigpipe) 
          29:  call void @llvm.lifetime.end.p0(i64 8, ptr nonnull %_8) 
          30:  ret i64 %0 
          31: } 
          33: ; std::rt::lang_start::{{closure}} 
          34: ; Function Attrs: inlinehint nonlazybind uwtable 
          35: define internal noundef i32 @"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hc266d34bcea1ca86E"(ptr noalias nocapture noundef readonly align 8 dereferenceable(8) %_1) unnamed_addr #2 { 
          36: start: 
          37:  %_4 = load ptr, ptr %_1, align 8, !nonnull !5, !noundef !5 
          38: ; call std::sys_common::backtrace::__rust_begin_short_backtrace 
          39:  tail call fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h0bbff0889484b1b9E(ptr noundef nonnull %_4) 
          40:  ret i32 0 
          41: } 
          43: ; core::ops::function::FnOnce::call_once{{vtable.shim}} 
          44: ; Function Attrs: inlinehint nonlazybind uwtable 
          45: define internal noundef i32 @"_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h98dfb8cb00ea3cd2E"(ptr nocapture noundef readonly %_1) unnamed_addr #2 personality ptr @rust_eh_personality { 
          46: start: 
          47:  %0 = load ptr, ptr %_1, align 8, !nonnull !5, !noundef !5 
          48: ; call std::sys_common::backtrace::__rust_begin_short_backtrace 
          49:  tail call fastcc void @_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h0bbff0889484b1b9E(ptr noundef nonnull %0), !noalias !6 
          50:  ret i32 0 
          51: } 
          52:  
          53: ; core::ptr::drop_in_place<std::rt::lang_start<()>::{{closure}}> 
          54: ; Function Attrs: inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable 
          55: define internal void @"_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h12e2afd066c01604E"(ptr noalias nocapture readnone align 8 %_1) unnamed_addr #3 { 
          57:  ret void 
          58: } 
          59:  
          60: ; Function Attrs: nonlazybind uwtable 
          61: define dso_local noundef i32 @slow_2_u(i32 noundef %a) unnamed_addr #1 { 
          62: start: 
          63:  %_3 = icmp ult i32 %a, 32 
          64:  br i1 %_3, label %bb1, label %panic, !prof !9 
          65:  
          66: bb1: ; preds = %start 
next:10'0         X~~~~~~~~~~~~~~~~~ error: no match found
          67:  %_0 = shl nuw i32 1, %a 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
next:10'1      ?                        possible intended match
          68:  ret i32 %_0 
next:10'0     ~~~~~~~~~~~~~
          69:  
next:10'0     ~
          70: panic: ; preds = %start 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~
          71: ; call core::panicking::panic 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          72:  tail call void @_ZN4core9panicking5panic17hce55b48e45b2de89E(ptr noalias noundef nonnull readonly align 1 @str.1, i64 noundef 33, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @alloc_e63a59136106790b997eb6d19694cd9b) #9 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          73:  unreachable 
next:10'0     ~~~~~~~~~~~~~
          74: } 
next:10'0     ~~
          75:  
next:10'0     ~
          76: ; simplify_pow_of_two_debug_assertions::main 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          77: ; Function Attrs: mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          78: define internal void @_ZN36simplify_pow_of_two_debug_assertions4main17h4642517726c52497E() unnamed_addr #4 { 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          79: start: 
next:10'0     ~~~~~~~
          80:  ret void 
next:10'0     ~~~~~~~~~~
          81: } 
next:10'0     ~~
          82:  
next:10'0     ~
          83: ; std::rt::lang_start_internal 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          84: ; Function Attrs: nonlazybind uwtable 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          85: declare noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1, ptr noalias noundef readonly align 8 dereferenceable(24), i64 noundef, ptr noundef, i8 noundef) unnamed_addr #1 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          86:  
next:10'0     ~
          87: ; Function Attrs: nonlazybind uwtable 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          88: declare noundef i32 @rust_eh_personality(i32 noundef, i32 noundef, i64 noundef, ptr noundef, ptr noundef) unnamed_addr #1 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          89:  
next:10'0     ~
          90: ; core::panicking::panic 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          91: ; Function Attrs: cold noinline noreturn nonlazybind uwtable 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          92: declare void @_ZN4core9panicking5panic17hce55b48e45b2de89E(ptr noalias noundef nonnull readonly align 1, i64 noundef, ptr noalias noundef readonly align 8 dereferenceable(24)) unnamed_addr #5 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          93:  
next:10'0     ~
          94: ; Function Attrs: nonlazybind 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          95: define i32 @main(i32 %0, ptr %1) unnamed_addr #6 { 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          96: top: 
next:10'0     ~~~~~
          97:  %_8.i = alloca ptr, align 8 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          98:  %2 = sext i32 %0 to i64 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          99:  call void @llvm.lifetime.start.p0(i64 8, ptr nonnull %_8.i) 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         100:  store ptr @_ZN36simplify_pow_of_two_debug_assertions4main17h4642517726c52497E, ptr %_8.i, align 8 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         101: ; call std::rt::lang_start_internal 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         102:  %3 = call noundef i64 @_ZN3std2rt19lang_start_internal17h748417cf059f6439E(ptr noundef nonnull align 1 %_8.i, ptr noalias noundef nonnull readonly align 8 dereferenceable(24) @vtable.0, i64 noundef %2, ptr noundef %1, i8 noundef 0) 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         103:  call void @llvm.lifetime.end.p0(i64 8, ptr nonnull %_8.i) 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         104:  %4 = trunc i64 %3 to i32 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
         105:  ret i32 %4 
next:10'0     ~~~~~~~~~~~~
         106: } 
next:10'0     ~~
         107:  
next:10'0     ~
         108: ; Function Attrs: argmemonly mustprogress nocallback nofree nosync nounwind willreturn 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         109: declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #7 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         110:  
next:10'0     ~
         111: ; Function Attrs: argmemonly mustprogress nocallback nofree nosync nounwind willreturn 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         112: declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture) #7 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         113:  
next:10'0     ~
         114: attributes #0 = { noinline nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         115: attributes #1 = { nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         116: attributes #2 = { inlinehint nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         117: attributes #3 = { inlinehint mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         118: attributes #4 = { mustprogress nofree norecurse nosync nounwind nonlazybind readnone willreturn uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         119: attributes #5 = { cold noinline noreturn nonlazybind uwtable "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         120: attributes #6 = { nonlazybind "probe-stack"="__rust_probestack" "target-cpu"="x86-64" } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         121: attributes #7 = { argmemonly mustprogress nocallback nofree nosync nounwind willreturn } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         122: attributes #8 = { nounwind } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         123: attributes #9 = { noreturn } 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         124:  
next:10'0     ~
         125: !llvm.module.flags = !{!0, !1, !2} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         126: !llvm.ident = !{!3} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~
         127:  
next:10'0     ~
         128: !0 = !{i32 7, !"PIC Level", i32 2} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         129: !1 = !{i32 7, !"PIE Level", i32 2} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         130: !2 = !{i32 2, !"RtLibUseGOT", i32 1} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         131: !3 = !{!"rustc version 1.73.0-nightly (228a5fa31 2023-07-30)"} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         132: !4 = !{i32 1254208} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~
         133: !5 = !{} 
next:10'0     ~~~~~~~~~
         134: !6 = !{!7} 
next:10'0     ~~~~~~~~~~~
         135: !7 = distinct !{!7, !8, !"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hc266d34bcea1ca86E: %_1"} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         136: !8 = distinct !{!8, !"_ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17hc266d34bcea1ca86E"} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         137: !9 = !{!"branch_weights", i32 2000, i32 1} 
next:10'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
------------------------------------------



@Centri3
Member Author

Centri3 commented Jul 30, 2023

Not entirely sure what's wrong there. It works fine with stage1, but the output seems to differ on stage2. Maybe the check needs to be more generic with regard to local names? (I see other codegen tests do this.)

The MIR tests have the same issue, but that can be worked around by emitting only `after` rather than `diff`.
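The FileCheck failure above comes down to the local's name: the test expects `%_01`, while the generated IR names the value `%_0`. The following is a minimal sketch of a name-agnostic check, assuming the test function looks roughly like the `slow_2_u` seen in the IR dump. The function body, the `CHECK-LABEL` line, and the attribute spelling are assumptions for illustration, not the PR's actual test file.

```rust
// Hypothetical excerpt of a codegen test. FileCheck treats `{{...}}` as a
// regular expression, so `%{{.+}}` accepts any SSA value name instead of
// pinning the check to `%_01`.

// CHECK-LABEL: @slow_2_u
// CHECK: %{{.+}} = shl nuw i32 1, %a
#[unsafe(no_mangle)] // plain `#[no_mangle]` on toolchains older than Rust 1.82
pub fn slow_2_u(a: u32) -> u32 {
    2u32.pow(a)
}
```

Whether `CHECK` or `CHECK-NEXT` is appropriate depends on the surrounding directives in the real test, which are not shown in the log.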

Contributor

@cjgillot cjgillot left a comment

Thanks. This is only a very cursory review; I'll get back to this PR in a few days.

@@ -1872,6 +1872,10 @@ impl<'tcx> Region<'tcx> {

/// Constructors for `Ty`
impl<'tcx> Ty<'tcx> {
pub fn new_bool(tcx: TyCtxt<'tcx>) -> Ty<'tcx> {
Ty::new(tcx, TyKind::Bool)
Contributor

This is pre-interned as tcx.types.bool

@@ -546,6 +547,7 @@ fn run_optimization_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
&lower_slice_len::LowerSliceLenCalls, // has to be done before inlining, otherwise actual call will be almost always inlined. Also simple, so can just do first
&unreachable_prop::UnreachablePropagation,
&uninhabited_enum_branching::UninhabitedEnumBranching,
&simplify_pow_of_two::SimplifyPowOfTwo,
Contributor

This should probably go after const prop, to increase the likelihood to have constants.

Member Author

That's after inlining, with the current way it works there's a chance the inliner will inline the call, and thus we won't detect it.

// already entirely optimized away
&& power_used != 0.0
// `-inf` would be `0.pow()`
&& power_used.is_finite()
Contributor

Could you split this huge chain into `let ... else { continue }` statements?
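As an illustration of the refactor being requested, here is a minimal, self-contained sketch of early-exit `let ... else { continue }` statements replacing one long condition chain. The `Term` enum and `collect_pow_bases` are hypothetical stand-ins, not rustc's MIR types or the PR's code.

```rust
// Hypothetical, simplified stand-in for a MIR terminator.
enum Term {
    Call { func: Option<&'static str>, arg: u64 },
    Goto,
}

/// Collects the constant bases of `pow` calls whose base is a power of two.
fn collect_pow_bases(terms: &[Term]) -> Vec<u64> {
    let mut out = Vec::new();
    for term in terms {
        // Each step bails out on its own line instead of extending one big chain.
        let Term::Call { func, arg } = term else { continue };
        let Some(name) = func else { continue };
        if *name != "pow" {
            continue;
        }
        if !arg.is_power_of_two() {
            continue;
        }
        out.push(*arg);
    }
    out
}
```

Each condition then reads top to bottom, and a failed match skips to the next terminator without adding nesting.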

..
} = &term.kind
&& let Some(def_id) = func.const_fn_def().map(|def| def.0)
&& let def_path = tcx.def_path(def_id)
Contributor

Why do you need the def path? The crate is already available as `def_id.krate`.

Member Author

Artifact from how it worked before

}
&& let power_used = f32::log2(recv_val as f32)
// Precision loss means it's not a power of two
&& power_used == (power_used as u32) as f32
Contributor

Why use floats instead of ilog2?

Member Author

@Centri3 Centri3 Jul 30, 2023

Unfortunately, ilog2 alone doesn't tell you whether the value is a power of two: i32::ilog2(4) and i32::ilog2(5) both return 2.

Oh nvm, this can almost certainly be followed by a call to 2.pow on the result, checking whether that matches the original value.
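The round-trip idea described above can be sketched as follows. This is an illustrative helper only; `exact_log2` is a hypothetical name, and the PR's actual pass may handle widths and signedness differently.

```rust
/// Returns `Some(k)` when `value == 2^k`, and `None` otherwise.
/// Hypothetical helper; not taken from the PR.
fn exact_log2(value: u128) -> Option<u32> {
    if value == 0 {
        // `ilog2` panics on zero, and zero is not a power of two anyway.
        return None;
    }
    // `ilog2` rounds down, so 4 and 5 both give 2 ...
    let k = value.ilog2();
    // ... but only an exact power of two survives the round trip.
    (1u128 << k == value).then_some(k)
}
```

For unsigned integers the standard library's `is_power_of_two` performs the same test directly; the round trip is shown here because it also yields the exponent.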

recv_ty,
) = recv_const.literal
&& recv_ty.is_integral()
&& tcx.item_name(def_id) == sym::pow
Contributor

This detection mechanism is brittle. The compiler should be as independent as possible from the exact paths in core. Can diagnostic items be used for this?

Member Author

@Centri3 Centri3 Jul 30, 2023

I'm ok with it, but if so we should use lang items instead, as pointed out by #114254 (comment). But this feels like it'd be very verbose and slow, as we'd need 12 lang items to cover them all and would then have to check whether the callee is any of them (rather than just checking whether the item name is pow).

@oli-obk
Contributor

oli-obk commented Jul 31, 2023

This is already performed by LLVM, both with debug assertions and without (even at -Copt-level=1): https://rust.godbolt.org/z/ssGz4Gz13

I don't believe the complexity is worth it in a MIR opt, unless we can show it significantly improves debug build performance. Considering pow opts are rare, I think this is probably not an issue. We may want to just remove the clippy lint altogether.

@Centri3
Member Author

Centri3 commented Jul 31, 2023

That's only if it's a constant: https://rust.godbolt.org/z/Pcfd8YPz4. If it's not, it won't be optimized away at all. So this would also majorly improve performance in release mode if a pow is used like this. Getting a power of two like this from a non-constant isn't all too uncommon.

The more realistic case of https://rust.godbolt.org/z/Eezfaoa7W is still quite a lot longer.

@oli-obk
Contributor

oli-obk commented Jul 31, 2023

In case it is not a constant, your optimization also does not apply, right?

@Centri3
Member Author

Centri3 commented Jul 31, 2023

It does apply if the rhs (the exponent) is not a constant. (It doesn't apply if the lhs, the base, is not a constant, but that's very rare, and it would need to call the default pow anyway if the base is not a power of two.)
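To make the rewrite concrete: for a constant base `2^k` and a runtime exponent `exp`, `(2^k).pow(exp)` equals `1 << (k * exp)` whenever the result fits in the type. The sketch below models this for `u32` with checked arithmetic. `pow2_base_via_shl` is a hypothetical name, and the checked form is only an analogue of the `icmp ult i32 %a, 32` guard in front of `shl nuw i32 1, %a` in the IR dump above, not the pass's implementation.

```rust
/// Computes `(2^base_log2)^exp` as a single shift.
/// Returns `None` when the result does not fit in a `u32`.
/// Hypothetical helper for illustration only.
fn pow2_base_via_shl(base_log2: u32, exp: u32) -> Option<u32> {
    // The shift amount is the exponent scaled by log2 of the base.
    let shift = base_log2.checked_mul(exp)?;
    // `checked_shl` returns `None` once the shift reaches the bit width (32).
    1u32.checked_shl(shift)
}
```

For example, with base 256 (`base_log2 = 8`) an exponent of 3 gives a shift of 24, while an exponent of 4 would need a shift of 32 and is reported as overflow, matching `256u32.checked_pow(4)`. Signed types need extra care around the sign bit, which this sketch does not cover.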

@oli-obk
Contributor

oli-obk commented Jul 31, 2023

🤦 ok, I completely misread this opt. Let me wake up and try again 😆 sorry

@oli-obk
Contributor

oli-obk commented Jul 31, 2023

I don't know much about LLVM though. Maybe it already optimizes its own pow intrinsic like this, I'm not sure. If it does we can switch the implementation of pow to just use that, alongside a new pow intrinsic

LLVM only has float pow intrinsics, no integer pow intrinsics. We'd have to finely tune our pow impl to match whatever LLVM expects, and as you noted that is very fragile.

For float math the peephole opts LLVM does can be found in https://llvm.org/doxygen/SimplifyLibCalls_8cpp_source.html (search for optimizePow).

https://stackoverflow.com/questions/2398442/why-isnt-int-powint-base-int-exponent-in-the-standard-c-libraries is very enlightening on why integer power ops don't exist in C/C++, so I'd assume LLVM doesn't really have any optimizations for it.

256u32.pow(a)
}

// EMIT_MIR simplify_pow_of_two_no_overflow_checks.slow_256_i.SimplifyPowOfTwo.after.mir
Contributor

Suggested change
// EMIT_MIR simplify_pow_of_two_no_overflow_checks.slow_256_i.SimplifyPowOfTwo.after.mir
// EMIT_MIR simplify_pow_of_two_no_overflow_checks.slow_256_i.SimplifyPowOfTwo.diff

makes it easier for us to see what the optimization does

Contributor

edit: ah, but it doesn't really optimize anything away, it just replaces a function call... oh well 🤷

Member Author

@Centri3 Centri3 Jul 31, 2023

I actually wanted to use diff as well, just for clarity's sake, but it seems that from stage1 to stage2 the output changes an unwind unreachable to unwind continue. That's the reason CI was failing before. Any idea why that happens or how it can be prevented? I'd definitely prefer diff, even just for the highlighting.

Contributor

That's probably not a stage1/stage2 difference, but a problem with the fact that we are running mir-opt tests for panic=unwind and panic=abort and are expecting the results to match. You can use // EMIT_MIR_FOR_EACH_PANIC_STRATEGY to generate separate output files
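A minimal sketch of how the directive might sit in the test file, assuming the `slow_256_i` test from the suggested change above. The function signature and body are assumptions, since the thread only shows the `EMIT_MIR` line.

```rust
// Hypothetical excerpt of the mir-opt test. With the directive below, the
// expected output is recorded separately per panic strategy instead of in
// one shared file that both strategies must match.

// EMIT_MIR_FOR_EACH_PANIC_STRATEGY

// EMIT_MIR simplify_pow_of_two_no_overflow_checks.slow_256_i.SimplifyPowOfTwo.diff
fn slow_256_i(a: u32) -> i32 {
    // Assumed body: a signed power-of-two base with a runtime exponent.
    256i32.pow(a)
}
```

The expected files would then presumably be regenerated by blessing the test, so that the `unwind unreachable` and `unwind continue` variants each get their own file.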

Member Author

Ohh, that makes sense, thanks!

@cjgillot cjgillot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. A-mir-opt Area: MIR optimizations and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 2, 2023
@Centri3 Centri3 closed this Aug 2, 2023
@Centri3 Centri3 deleted the simplify-pow-of-two branch August 2, 2023 21:00
Labels
A-mir-opt Area: MIR optimizations S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.