Also get `add nuw` from `uN::checked_add` by scottmcm · Pull Request #126852 · rust-lang/rust

scottmcm · 2024-06-23T06:09:02Z

When I was doing this for checked_{sub,shl,shr}, it was mentioned #124114 (comment) that it'd be worth trying for checked_add too.

It makes a particularly-big difference for x.checked_add(C), as doing this means that LLVM removes the intrinsic and does it as a normal x <= MAX - C instead.

cc @dianqk who had commented about checked_add related to rust-lang/hashbrown#509 before

cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself

rustbot · 2024-06-23T06:09:10Z

r? @Amanieu

rustbot has assigned @Amanieu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

scottmcm · 2024-06-23T06:20:09Z

@bors try @rust-timer queue

bors · 2024-06-23T06:21:19Z

⌛ Trying commit f5db46c with merge ffcb98e...

bors · 2024-06-23T08:02:53Z

☀️ Try build successful - checks-actions
Build commit: ffcb98e (ffcb98e5fcbb87cf17319741ada04fa274e087f5)

rust-timer · 2024-06-23T09:22:39Z

Finished benchmarking commit (ffcb98e): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.9%	[0.5%, 1.4%]	3
Regressions ❌ (secondary)	0.4%	[0.4%, 0.5%]	6
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-0.9%	[-0.9%, -0.8%]	6
All ❌✅ (primary)	0.6%	[-0.3%, 1.4%]	4

Max RSS (memory usage)

Results (primary -0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	7.9%	[7.9%, 7.9%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-3.2%	[-6.9%, -0.7%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.5%	[-6.9%, 7.9%]	4

Cycles

Results (primary 1.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[1.2%, 2.4%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.8%	[1.2%, 2.4%]	2

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.1%, 0.8%]	3
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.2%	[-0.6%, -0.0%]	61
Improvements ✅ (secondary)	-0.1%	[-1.0%, -0.0%]	42
All ❌✅ (primary)	-0.1%	[-0.6%, 0.8%]	64

Bootstrap: 692.678s -> 694.501s (0.26%)
Artifact size: 326.70 MiB -> 326.94 MiB (0.07%)

scottmcm · 2024-06-23T17:29:07Z

@bors try @rust-timer queue

bors · 2024-06-23T17:30:18Z

⌛ Trying commit a53dd2f with merge 3c488df...

bors · 2024-06-23T19:11:21Z

☀️ Try build successful - checks-actions
Build commit: 3c488df (3c488dfa532496633c17af863407e9023e834af4)

rust-timer · 2024-06-23T20:26:42Z

Finished benchmarking commit (3c488df): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.3%, 0.9%]	3
Regressions ❌ (secondary)	0.4%	[0.4%, 0.5%]	6
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-0.9%	[-0.9%, -0.8%]	6
All ❌✅ (primary)	0.4%	[-0.3%, 0.9%]	4

Max RSS (memory usage)

Results (primary 0.2%, secondary -4.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	8.6%	[2.8%, 14.4%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-5.4%	[-8.7%, -2.5%]	3
Improvements ✅ (secondary)	-4.8%	[-4.8%, -4.8%]	1
All ❌✅ (primary)	0.2%	[-8.7%, 14.4%]	5

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.6%]	6
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.1%	[-0.6%, -0.0%]	56
Improvements ✅ (secondary)	-0.1%	[-0.8%, -0.0%]	42
All ❌✅ (primary)	-0.1%	[-0.6%, 0.6%]	62

Bootstrap: 691.47s -> 694.315s (0.41%)
Artifact size: 326.79 MiB -> 326.84 MiB (0.02%)

scottmcm · 2024-06-23T20:33:54Z

(Last force-push didn't change anything, just cleaned up history, so the #126852 (comment) run is still accurate)

Ok, I think this is ready. It's maybe a small instruction regression, but shows neutral on cycles, while also being a broad size win. So overall I think that's completely acceptable for improved optimization potential.

scottmcm · 2024-06-23T20:35:38Z

+    // CHECK: %[[IS_MAX:.+]] = icmp eq i32 %x, -1
+    // CHECK: br i1 %[[IS_MAX]], label %[[NONE_BB:.+]], label %[[SOME_BB:.+]]
+    // CHECK: [[SOME_BB]]:
+    // CHECK: %[[R:.+]] = add nuw i32 %x, 1


Compare to the current codegen in https://rust.godbolt.org/z/vcPv8xse3 where this returns a wrapping addition, via llvm.uadd.with.overflow.

Amanieu · 2024-06-24T09:14:02Z

@bors r+

bors · 2024-06-24T09:14:04Z

📌 Commit ec9e356 has been approved by Amanieu

It is now in the queue for this repository.

bors · 2024-06-24T10:22:04Z

⌛ Testing commit ec9e356 with merge 1106161...

rust-log-analyzer · 2024-06-24T11:27:47Z

The job x86_64-mingw failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)

failures:

---- [ui] tests\ui\sepcomp\sepcomp-lib-lto.rs stdout ----

error: auxiliary build of "C:\\a\\rust\\rust\\tests\\ui\\sepcomp\\auxiliary\\sepcomp_lib.rs" failed to compile: 
status: exit code: 1
command: PATH="C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage2\bin;C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage0-bootstrap-tools\x86_64-pc-windows-gnu\release\deps;C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage0\bin;C:\a\_temp\msys64\mingw64\bin;C:\a\_temp\msys64\usr\local\bin;C:\a\_temp\msys64\usr\bin;C:\a\_temp\msys64\usr\bin;C:\a\rust\rust\ninja;C:\a\rust\rust\mingw64\bin;C:\a\rust\rust\sccache;C:\a\_temp\setup-msys2;C:\Program Files\MongoDB\Server\5.0\bin;C:\aliyun-cli;C:\vcpkg;C:\Program Files (x86)\NSIS;C:\tools\zstd;C:\Program Files\Mercurial;C:\hostedtoolcache\windows\stack\2.15.7\x64;C:\cabal\bin;C:\ghcup\bin;C:\mingw64\bin;C:\Program Files\dotnet;C:\Program Files\MySQL\MySQL Server 8.0\bin;C:\Program Files\R\R-4.4.1\bin\x64;C:\SeleniumWebDrivers\GeckoDriver;C:\SeleniumWebDrivers\EdgeDriver;C:\SeleniumWebDrivers\ChromeDriver;C:\Program Files (x86)\sbt\bin;C:\Program Files (x86)\GitHub CLI;C:\Program Files\Git\bin;C:\Program Files (x86)\pipx_bin;C:\npm\prefix;C:\hostedtoolcache\windows\go\1.21.11\x64\bin;C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts;C:\hostedtoolcache\windows\Python\3.9.13\x64;C:\hostedtoolcache\windows\Ruby\3.0.7\x64\bin;C:\Program Files\OpenSSL\bin;C:\tools\kotlinc\bin;C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\8.0.412-8\x64\bin;C:\Program Files\ImageMagick-7.1.1-Q16-HDRI;C:\Program Files\Microsoft SDKs\Azure\CLI2\wbin;C:\ProgramData\kind;C:\ProgramData\docker-compose;C:\ProgramData\Chocolatey\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\dotnet;C:\Program Files\PowerShell\7;C:\Program Files\Microsoft\Web Platform Installer;C:\Program Files\TortoiseSVN\bin;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn;C:\Program Files\Microsoft SQL Server\150\Tools\Binn;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;C:\Program Files (x86)\WiX Toolset v3.14\bin;C:\Program Files\Microsoft SQL Server\130\DTS\Binn;C:\Program Files\Microsoft SQL Server\140\DTS\Binn;C:\Program Files\Microsoft SQL Server\150\DTS\Binn;C:\Program Files\Microsoft SQL Server\160\DTS\Binn;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\ProgramData\chocolatey\lib\pulumi\tools\Pulumi\bin;C:\Program Files\CMake\bin;C:\ProgramData\chocolatey\lib\maven\apache-maven-3.8.7\bin;C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code;C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager;C:\Program Files\nodejs;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\Program Files\GitHub CLI;C:\tools\php;C:\Program Files (x86)\sbt\bin;C:\Program Files\Amazon\AWSCLIV2;C:\Program Files\Amazon\SessionManagerPlugin\bin;C:\Program Files\Amazon\AWSSAMCLI\bin;C:\Program Files\Microsoft SQL Server\130\Tools\Binn;C:\Program Files\LLVM\bin;C:\Users\runneradmin\.dotnet\tools;C:\Users\runneradmin\.cargo\bin;C:\Users\runneradmin\AppData\Local\Microsoft\WindowsApps;C:\a\_temp\msys64\usr\bin\site_perl;C:\a\_temp\msys64\usr\bin\vendor_perl;C:\a\_temp\msys64\usr\bin\core_perl" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\stage2\\bin\\rustc.exe" "C:\\a\\rust\\rust\\tests\\ui\\sepcomp\\auxiliary\\sepcomp_lib.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=C:\\Users\\runneradmin\\.cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=C:\\a\\rust\\rust\\vendor" "--sysroot" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\stage2" "--target=x86_64-pc-windows-gnu" "--check-cfg" "cfg(FALSE)" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "-C" "prefer-dynamic" "--out-dir" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\test\\ui\\sepcomp\\sepcomp-lib-lto\\auxiliary" "-A" "internal_features" "-Crpath" "-Lnative=C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\native\\rust-test-helpers" "-C" "codegen-units=3" "--crate-type=rlib,dylib" "-g" "--crate-type" "dylib" "-L" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\test\\ui\\sepcomp\\sepcomp-lib-lto\\auxiliary"
--- stderr -------------------------------
--- stderr -------------------------------
error: couldn't create a temp dir: Access is denied. (os error 5) at path "C:\\a\\_temp\\msys64\\tmp\\rustcPOlAdi"
error: aborting due to 1 previous error
------------------------------------------

bors · 2024-06-24T11:28:27Z

💔 Test failed - checks-actions

Amanieu · 2024-06-24T13:41:39Z

@bors retry

bors · 2024-06-25T02:50:40Z

⌛ Testing commit ec9e356 with merge fc555cd...

ivan-shrimp · 2024-06-25T04:40:38Z

It seems that this actually pessimizes the result; see https://rust.godbolt.org/z/c1qebWbxM.

LLVM is not good at recombining the icmp and add/sub. Notice that the original code correctly takes advantage of the zero flag from the inc instruction, while the new code needs to do the cmp and inc separately.

(The corresponding PR for checked_sub had a similar problem, and I submitted a fix for it.)

bors · 2024-06-25T05:08:22Z

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing fc555cd to master...

rust-timer · 2024-06-25T06:24:03Z

Finished benchmarking commit (fc555cd): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.3%, 0.9%]	4
Regressions ❌ (secondary)	0.4%	[0.3%, 0.4%]	2
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-1.3%	[-1.4%, -0.9%]	7
All ❌✅ (primary)	0.4%	[-0.3%, 0.9%]	5

Max RSS (memory usage)

Results (primary 1.4%, secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	9.6%	[5.6%, 13.6%]	2
Regressions ❌ (secondary)	2.2%	[2.2%, 2.2%]	1
Improvements ✅ (primary)	-6.8%	[-8.6%, -5.0%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.4%	[-8.6%, 13.6%]	4

Cycles

Results (primary 3.9%, secondary -3.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.9%	[3.9%, 3.9%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.5%	[-3.9%, -3.1%]	2
All ❌✅ (primary)	3.9%	[3.9%, 3.9%]	1

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.6%]	6
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	2
Improvements ✅ (primary)	-0.1%	[-0.6%, -0.0%]	53
Improvements ✅ (secondary)	-0.1%	[-0.8%, -0.0%]	42
All ❌✅ (primary)	-0.1%	[-0.6%, 0.6%]	59

Bootstrap: 693.576s -> 693.565s (-0.00%)
Artifact size: 326.73 MiB -> 326.74 MiB (0.00%)

pnkfelix · 2024-07-02T19:19:22Z

It seems that this actually pessimizes the result; see https://rust.godbolt.org/z/c1qebWbxM.

LLVM is not good at recombining the icmp and add/sub. Notice that the original code correctly takes advantage of the zero flag from the inc instruction, while the new code needs to do the cmp and inc separately.

(The corresponding PR for checked_sub had a similar problem, and I submitted a fix for it.)

Hey @scottmcm , did you see this comment about this PR? Do you have thoughts about the claim that this is a pessimization, at least on x86 it seems?

pnkfelix · 2024-07-02T19:21:08Z

visiting for weekly rustc-perf triage

PR was analyzed and thought to be a net win, despite the anticipated regression to compiler instruction-counts
but there was a bystander follow-up comment that the result here might be a pessimization, at least for Intel x86.
not marking as triaged, in hopes that follow-up comment gets addressed in some manner.

rustbot assigned Amanieu Jun 23, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 23, 2024

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024

This comment has been minimized.

Sign in to view

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 23, 2024

scottmcm marked this pull request as ready for review June 23, 2024 09:30

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024

Also get add nuw from uN::checked_add

ec9e356

scottmcm force-pushed the more-checked-math-tweaks branch from a53dd2f to ec9e356 Compare June 23, 2024 20:29

scottmcm commented Jun 23, 2024

View reviewed changes

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 24, 2024

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 24, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 24, 2024

bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 25, 2024

bors merged commit fc555cd into rust-lang:master Jun 25, 2024

rustbot added this to the 1.81.0 milestone Jun 25, 2024

scottmcm deleted the more-checked-math-tweaks branch June 29, 2024 08:40

moxian mentioned this pull request Mar 28, 2025

Multiple bounds checking elision failures #120433

Closed

Uh oh!

Conversation

scottmcm commented Jun 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Jun 23, 2024

Uh oh!

scottmcm commented Jun 23, 2024

Uh oh!

This comment has been minimized.

bors commented Jun 23, 2024

Uh oh!

bors commented Jun 23, 2024

Uh oh!

This comment has been minimized.

rust-timer commented Jun 23, 2024

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Uh oh!

scottmcm commented Jun 23, 2024

Uh oh!

This comment has been minimized.

bors commented Jun 23, 2024

Uh oh!

bors commented Jun 23, 2024

Uh oh!

This comment has been minimized.

rust-timer commented Jun 23, 2024

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Uh oh!

scottmcm commented Jun 23, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Amanieu commented Jun 24, 2024

Uh oh!

bors commented Jun 24, 2024

Uh oh!

bors commented Jun 24, 2024

Uh oh!

rust-log-analyzer commented Jun 24, 2024

Uh oh!

bors commented Jun 24, 2024

Uh oh!

Amanieu commented Jun 24, 2024

Uh oh!

bors commented Jun 25, 2024

Uh oh!

ivan-shrimp commented Jun 25, 2024

Uh oh!

bors commented Jun 25, 2024

Uh oh!

rust-timer commented Jun 25, 2024

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Uh oh!

pnkfelix commented Jul 2, 2024

Uh oh!

pnkfelix commented Jul 2, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

scottmcm commented Jun 23, 2024 •

edited

Loading