-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get LLVM to stop generating dead assembly in next_power_of_two #42556
Conversation
r? @BurntSushi (rust_highfive has picked a reviewer for you, use r? to override) |
src/librustc_trans/intrinsic.rs
Outdated
@@ -280,6 +280,11 @@ pub fn trans_intrinsic_call<'a, 'tcx>(bcx: &Builder<'a, 'tcx>, | |||
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", name, width)); | |||
bcx.call(llfn, &[llargs[0], y], None) | |||
} | |||
"ctlz_nonzero" | "cttz_nonzero" => { | |||
let y = C_bool(bcx.ccx, true); | |||
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", &name[..4], width)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[00:03:22] tidy error: /checkout/src/librustc_trans/intrinsic.rs:285: line longer than 100 chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup; currently waiting on local double-check for the fix. Thanks for the targeted comment!
LLVM currently doesn't remove the "bypass if argument is zero" assembly inside branches where the value is known to be non-zero, pessimizing code that uses uN::leading_zeros
LGTM. Thanks! @bors r+ |
📌 Commit 6d86f0c has been approved by |
Get LLVM to stop generating dead assembly in next_power_of_two It turns out that LLVM can turn `@llvm.ctlz.i64(_, true)` into `@llvm.ctlz.i64(_, false)` ([`ctlz`](http://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic)) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases. A contrived-but-clear example https://is.gd/VAIKuC: ```rust fn foo(x:u64) -> u32 { if x == 0 { return !0; } x.leading_zeros() } ``` Generates ```asm testq %rdi, %rdi je .LBB0_1 je .LBB0_3 ; <-- wha? bsrq %rdi, %rax xorq $63, %rax retq .LBB0_1: movl $-1, %eax retq .LBB0_3: movl $64, %eax ; <-- dead retq ``` I noticed this in `next_power_of_two`, which without this PR generates the following: ```asm cmpq $2, %rcx jae .LBB1_2 movl $1, %eax retq .LBB1_2: decq %rcx je .LBB1_3 bsrq %rcx, %rcx xorq $63, %rcx jmp .LBB1_5 .LBB1_3: movl $64, %ecx ; <-- dead .LBB1_5: movq $-1, %rax shrq %cl, %rax incq %rax retq ``` And with this PR becomes ```asm cmpq $2, %rcx jae .LBB0_2 movl $1, %eax retq .LBB0_2: decq %rcx bsrq %rcx, %rcx xorl $63, %ecx movq $-1, %rax shrq %cl, %rax incq %rax retq ```
☀️ Test successful - status-appveyor, status-travis |
It turns out that LLVM can turn
@llvm.ctlz.i64(_, true)
into@llvm.ctlz.i64(_, false)
(ctlz
) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases.A contrived-but-clear example https://is.gd/VAIKuC:
Generates
I noticed this in
next_power_of_two
, which without this PR generates the following:And with this PR becomes