Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get LLVM to stop generating dead assembly in next_power_of_two #42556

Merged
merged 2 commits into from
Jun 10, 2017

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Jun 9, 2017

It turns out that LLVM can turn @llvm.ctlz.i64(_, true) into @llvm.ctlz.i64(_, false) (ctlz) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases.

A contrived-but-clear example https://is.gd/VAIKuC:

fn foo(x:u64) -> u32 {
    if x == 0 { return !0; }
    x.leading_zeros()
}

Generates

	testq	%rdi, %rdi
	je	.LBB0_1
	je	.LBB0_3    ; <-- wha?
	bsrq	%rdi, %rax
	xorq	$63, %rax
	retq
.LBB0_1:
	movl	$-1, %eax
	retq
.LBB0_3:
	movl	$64, %eax  ; <-- dead
	retq

I noticed this in next_power_of_two, which without this PR generates the following:

	cmpq	$2, %rcx
	jae	.LBB1_2
	movl	$1, %eax
	retq
.LBB1_2:
	decq	%rcx
	je	.LBB1_3
	bsrq	%rcx, %rcx
	xorq	$63, %rcx
	jmp	.LBB1_5
.LBB1_3:
	movl	$64, %ecx  ; <-- dead
.LBB1_5:
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq

And with this PR becomes

	cmpq	$2, %rcx
	jae	.LBB0_2
	movl	$1, %eax
	retq
.LBB0_2:
	decq	%rcx
	bsrq	%rcx, %rcx
	xorl	$63, %ecx
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq

@rust-highfive
Copy link
Contributor

r? @BurntSushi

(rust_highfive has picked a reviewer for you, use r? to override)

@@ -280,6 +280,11 @@ pub fn trans_intrinsic_call<'a, 'tcx>(bcx: &Builder<'a, 'tcx>,
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", name, width));
bcx.call(llfn, &[llargs[0], y], None)
}
"ctlz_nonzero" | "cttz_nonzero" => {
let y = C_bool(bcx.ccx, true);
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", &name[..4], width));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[00:03:22] tidy error: /checkout/src/librustc_trans/intrinsic.rs:285: line longer than 100 chars

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup; currently waiting on local double-check for the fix. Thanks for the targeted comment!

scottmcm added 2 commits June 8, 2017 23:01
LLVM currently doesn't remove the "bypass if argument is zero" assembly inside branches where the value is known to be non-zero, pessimizing code that uses uN::leading_zeros
@BurntSushi
Copy link
Member

LGTM. Thanks! @bors r+

@bors
Copy link
Collaborator

bors commented Jun 9, 2017

📌 Commit 6d86f0c has been approved by BurntSushi

@shepmaster shepmaster added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jun 9, 2017
@bors
Copy link
Collaborator

bors commented Jun 10, 2017

⌛ Testing commit 6d86f0c with merge 995f741...

bors added a commit that referenced this pull request Jun 10, 2017
Get LLVM to stop generating dead assembly in next_power_of_two

It turns out that LLVM can turn `@llvm.ctlz.i64(_, true)` into `@llvm.ctlz.i64(_, false)` ([`ctlz`](http://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic)) where valuable, but never does the opposite.  That leads to some silly assembly getting generated in certain cases.

A contrived-but-clear example https://is.gd/VAIKuC:
```rust
fn foo(x:u64) -> u32 {
    if x == 0 { return !0; }
    x.leading_zeros()
}
```
Generates
```asm
	testq	%rdi, %rdi
	je	.LBB0_1
	je	.LBB0_3    ; <-- wha?
	bsrq	%rdi, %rax
	xorq	$63, %rax
	retq
.LBB0_1:
	movl	$-1, %eax
	retq
.LBB0_3:
	movl	$64, %eax  ; <-- dead
	retq
```

I noticed this in `next_power_of_two`, which without this PR generates the following:
```asm
	cmpq	$2, %rcx
	jae	.LBB1_2
	movl	$1, %eax
	retq
.LBB1_2:
	decq	%rcx
	je	.LBB1_3
	bsrq	%rcx, %rcx
	xorq	$63, %rcx
	jmp	.LBB1_5
.LBB1_3:
	movl	$64, %ecx  ; <-- dead
.LBB1_5:
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq
```

And with this PR becomes
```asm
	cmpq	$2, %rcx
	jae	.LBB0_2
	movl	$1, %eax
	retq
.LBB0_2:
	decq	%rcx
	bsrq	%rcx, %rcx
	xorl	$63, %ecx
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq
```
@bors
Copy link
Collaborator

bors commented Jun 10, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing 995f741 to master...

@bors bors merged commit 6d86f0c into rust-lang:master Jun 10, 2017
@scottmcm scottmcm deleted the ctz-nz branch June 21, 2017 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants