Skip to content

Also get add nuw from uN::checked_add#126852

Merged
bors merged 1 commit into
rust-lang:masterfrom
scottmcm:more-checked-math-tweaks
Jun 25, 2024
Merged

Also get add nuw from uN::checked_add#126852
bors merged 1 commit into
rust-lang:masterfrom
scottmcm:more-checked-math-tweaks

Conversation

@scottmcm
Copy link
Copy Markdown
Member

@scottmcm scottmcm commented Jun 23, 2024

When I was doing this for checked_{sub,shl,shr}, it was mentioned #124114 (comment) that it'd be worth trying for checked_add too.

It makes a particularly-big difference for x.checked_add(C), as doing this means that LLVM removes the intrinsic and does it as a normal x <= MAX - C instead.

cc @dianqk who had commented about checked_add related to rust-lang/hashbrown#509 before

cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself

@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Jun 23, 2024

r? @Amanieu

rustbot has assigned @Amanieu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 23, 2024
@scottmcm
Copy link
Copy Markdown
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024
@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 23, 2024

⌛ Trying commit f5db46c with merge ffcb98e...

@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 23, 2024

☀️ Try build successful - checks-actions
Build commit: ffcb98e (ffcb98e5fcbb87cf17319741ada04fa274e087f5)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (ffcb98e): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.5%, 1.4%] 3
Regressions ❌
(secondary)
0.4% [0.4%, 0.5%] 6
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
-0.9% [-0.9%, -0.8%] 6
All ❌✅ (primary) 0.6% [-0.3%, 1.4%] 4

Max RSS (memory usage)

Results (primary -0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
7.9% [7.9%, 7.9%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-3.2% [-6.9%, -0.7%] 3
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.5% [-6.9%, 7.9%] 4

Cycles

Results (primary 1.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.8% [1.2%, 2.4%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.8% [1.2%, 2.4%] 2

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.3% [0.1%, 0.8%] 3
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 1
Improvements ✅
(primary)
-0.2% [-0.6%, -0.0%] 61
Improvements ✅
(secondary)
-0.1% [-1.0%, -0.0%] 42
All ❌✅ (primary) -0.1% [-0.6%, 0.8%] 64

Bootstrap: 692.678s -> 694.501s (0.26%)
Artifact size: 326.70 MiB -> 326.94 MiB (0.07%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 23, 2024
@scottmcm scottmcm marked this pull request as ready for review June 23, 2024 09:30
@scottmcm
Copy link
Copy Markdown
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024
@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 23, 2024

⌛ Trying commit a53dd2f with merge 3c488df...

@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 23, 2024

☀️ Try build successful - checks-actions
Build commit: 3c488df (3c488dfa532496633c17af863407e9023e834af4)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (3c488df): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.3%, 0.9%] 3
Regressions ❌
(secondary)
0.4% [0.4%, 0.5%] 6
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
-0.9% [-0.9%, -0.8%] 6
All ❌✅ (primary) 0.4% [-0.3%, 0.9%] 4

Max RSS (memory usage)

Results (primary 0.2%, secondary -4.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
8.6% [2.8%, 14.4%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-5.4% [-8.7%, -2.5%] 3
Improvements ✅
(secondary)
-4.8% [-4.8%, -4.8%] 1
All ❌✅ (primary) 0.2% [-8.7%, 14.4%] 5

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.6%] 6
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 1
Improvements ✅
(primary)
-0.1% [-0.6%, -0.0%] 56
Improvements ✅
(secondary)
-0.1% [-0.8%, -0.0%] 42
All ❌✅ (primary) -0.1% [-0.6%, 0.6%] 62

Bootstrap: 691.47s -> 694.315s (0.41%)
Artifact size: 326.79 MiB -> 326.84 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 23, 2024
@scottmcm scottmcm force-pushed the more-checked-math-tweaks branch from a53dd2f to ec9e356 Compare June 23, 2024 20:29
@scottmcm
Copy link
Copy Markdown
Member Author

(Last force-push didn't change anything, just cleaned up history, so the #126852 (comment) run is still accurate)

Ok, I think this is ready. It's maybe a small instruction regression, but shows neutral on cycles, while also being a broad size win. So overall I think that's completely acceptable for improved optimization potential.

// CHECK: %[[IS_MAX:.+]] = icmp eq i32 %x, -1
// CHECK: br i1 %[[IS_MAX]], label %[[NONE_BB:.+]], label %[[SOME_BB:.+]]
// CHECK: [[SOME_BB]]:
// CHECK: %[[R:.+]] = add nuw i32 %x, 1
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare to the current codegen in https://rust.godbolt.org/z/vcPv8xse3 where this returns a wrapping addition, via llvm.uadd.with.overflow.

@Amanieu
Copy link
Copy Markdown
Member

Amanieu commented Jun 24, 2024

@bors r+

@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 24, 2024

📌 Commit ec9e356 has been approved by Amanieu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 24, 2024
@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 24, 2024

⌛ Testing commit ec9e356 with merge 1106161...

@rust-log-analyzer
Copy link
Copy Markdown
Collaborator

The job x86_64-mingw failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
failures:

---- [ui] tests\ui\sepcomp\sepcomp-lib-lto.rs stdout ----

error: auxiliary build of "C:\\a\\rust\\rust\\tests\\ui\\sepcomp\\auxiliary\\sepcomp_lib.rs" failed to compile: 
status: exit code: 1
command: PATH="C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage2\bin;C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage0-bootstrap-tools\x86_64-pc-windows-gnu\release\deps;C:\a\rust\rust\build\x86_64-pc-windows-gnu\stage0\bin;C:\a\_temp\msys64\mingw64\bin;C:\a\_temp\msys64\usr\local\bin;C:\a\_temp\msys64\usr\bin;C:\a\_temp\msys64\usr\bin;C:\a\rust\rust\ninja;C:\a\rust\rust\mingw64\bin;C:\a\rust\rust\sccache;C:\a\_temp\setup-msys2;C:\Program Files\MongoDB\Server\5.0\bin;C:\aliyun-cli;C:\vcpkg;C:\Program Files (x86)\NSIS;C:\tools\zstd;C:\Program Files\Mercurial;C:\hostedtoolcache\windows\stack\2.15.7\x64;C:\cabal\bin;C:\ghcup\bin;C:\mingw64\bin;C:\Program Files\dotnet;C:\Program Files\MySQL\MySQL Server 8.0\bin;C:\Program Files\R\R-4.4.1\bin\x64;C:\SeleniumWebDrivers\GeckoDriver;C:\SeleniumWebDrivers\EdgeDriver;C:\SeleniumWebDrivers\ChromeDriver;C:\Program Files (x86)\sbt\bin;C:\Program Files (x86)\GitHub CLI;C:\Program Files\Git\bin;C:\Program Files (x86)\pipx_bin;C:\npm\prefix;C:\hostedtoolcache\windows\go\1.21.11\x64\bin;C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts;C:\hostedtoolcache\windows\Python\3.9.13\x64;C:\hostedtoolcache\windows\Ruby\3.0.7\x64\bin;C:\Program Files\OpenSSL\bin;C:\tools\kotlinc\bin;C:\hostedtoolcache\windows\Java_Temurin-Hotspot_jdk\8.0.412-8\x64\bin;C:\Program Files\ImageMagick-7.1.1-Q16-HDRI;C:\Program Files\Microsoft SDKs\Azure\CLI2\wbin;C:\ProgramData\kind;C:\ProgramData\docker-compose;C:\ProgramData\Chocolatey\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\dotnet;C:\Program Files\PowerShell\7;C:\Program Files\Microsoft\Web Platform Installer;C:\Program Files\TortoiseSVN\bin;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\170\Tools\Binn;C:\Program Files\Microsoft SQL Server\150\Tools\Binn;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;C:\Program Files (x86)\WiX Toolset v3.14\bin;C:\Program Files\Microsoft SQL Server\130\DTS\Binn;C:\Program Files\Microsoft SQL Server\140\DTS\Binn;C:\Program Files\Microsoft SQL Server\150\DTS\Binn;C:\Program Files\Microsoft SQL Server\160\DTS\Binn;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\ProgramData\chocolatey\lib\pulumi\tools\Pulumi\bin;C:\Program Files\CMake\bin;C:\ProgramData\chocolatey\lib\maven\apache-maven-3.8.7\bin;C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code;C:\Program Files\Microsoft SDKs\Service Fabric\Tools\ServiceFabricLocalClusterManager;C:\Program Files\nodejs;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\Program Files\GitHub CLI;C:\tools\php;C:\Program Files (x86)\sbt\bin;C:\Program Files\Amazon\AWSCLIV2;C:\Program Files\Amazon\SessionManagerPlugin\bin;C:\Program Files\Amazon\AWSSAMCLI\bin;C:\Program Files\Microsoft SQL Server\130\Tools\Binn;C:\Program Files\LLVM\bin;C:\Users\runneradmin\.dotnet\tools;C:\Users\runneradmin\.cargo\bin;C:\Users\runneradmin\AppData\Local\Microsoft\WindowsApps;C:\a\_temp\msys64\usr\bin\site_perl;C:\a\_temp\msys64\usr\bin\vendor_perl;C:\a\_temp\msys64\usr\bin\core_perl" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\stage2\\bin\\rustc.exe" "C:\\a\\rust\\rust\\tests\\ui\\sepcomp\\auxiliary\\sepcomp_lib.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=C:\\Users\\runneradmin\\.cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=C:\\a\\rust\\rust\\vendor" "--sysroot" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\stage2" "--target=x86_64-pc-windows-gnu" "--check-cfg" "cfg(FALSE)" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "-C" "prefer-dynamic" "--out-dir" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\test\\ui\\sepcomp\\sepcomp-lib-lto\\auxiliary" "-A" "internal_features" "-Crpath" "-Lnative=C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\native\\rust-test-helpers" "-C" "codegen-units=3" "--crate-type=rlib,dylib" "-g" "--crate-type" "dylib" "-L" "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-gnu\\test\\ui\\sepcomp\\sepcomp-lib-lto\\auxiliary"
--- stderr -------------------------------
--- stderr -------------------------------
error: couldn't create a temp dir: Access is denied. (os error 5) at path "C:\\a\\_temp\\msys64\\tmp\\rustcPOlAdi"
error: aborting due to 1 previous error
------------------------------------------


@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 24, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 24, 2024
@Amanieu
Copy link
Copy Markdown
Member

Amanieu commented Jun 24, 2024

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 24, 2024
@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 25, 2024

⌛ Testing commit ec9e356 with merge fc555cd...

@ivan-shrimp
Copy link
Copy Markdown
Contributor

It seems that this actually pessimizes the result; see https://rust.godbolt.org/z/c1qebWbxM.

LLVM is not good at recombining the icmp and add/sub. Notice that the original code correctly takes advantage of the zero flag from the inc instruction, while the new code needs to do the cmp and inc separately.

(The corresponding PR for checked_sub had a similar problem, and I submitted a fix for it.)

@bors
Copy link
Copy Markdown
Collaborator

bors commented Jun 25, 2024

☀️ Test successful - checks-actions
Approved by: Amanieu
Pushing fc555cd to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 25, 2024
@bors bors merged commit fc555cd into rust-lang:master Jun 25, 2024
@rustbot rustbot added this to the 1.81.0 milestone Jun 25, 2024
@rust-timer
Copy link
Copy Markdown
Collaborator

Finished benchmarking commit (fc555cd): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.6% [0.3%, 0.9%] 4
Regressions ❌
(secondary)
0.4% [0.3%, 0.4%] 2
Improvements ✅
(primary)
-0.3% [-0.3%, -0.3%] 1
Improvements ✅
(secondary)
-1.3% [-1.4%, -0.9%] 7
All ❌✅ (primary) 0.4% [-0.3%, 0.9%] 5

Max RSS (memory usage)

Results (primary 1.4%, secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
9.6% [5.6%, 13.6%] 2
Regressions ❌
(secondary)
2.2% [2.2%, 2.2%] 1
Improvements ✅
(primary)
-6.8% [-8.6%, -5.0%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.4% [-8.6%, 13.6%] 4

Cycles

Results (primary 3.9%, secondary -3.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.9% [3.9%, 3.9%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.5% [-3.9%, -3.1%] 2
All ❌✅ (primary) 3.9% [3.9%, 3.9%] 1

Binary size

Results (primary -0.1%, secondary -0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.6%] 6
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 2
Improvements ✅
(primary)
-0.1% [-0.6%, -0.0%] 53
Improvements ✅
(secondary)
-0.1% [-0.8%, -0.0%] 42
All ❌✅ (primary) -0.1% [-0.6%, 0.6%] 59

Bootstrap: 693.576s -> 693.565s (-0.00%)
Artifact size: 326.73 MiB -> 326.74 MiB (0.00%)

@scottmcm scottmcm deleted the more-checked-math-tweaks branch June 29, 2024 08:40
@pnkfelix
Copy link
Copy Markdown
Contributor

pnkfelix commented Jul 2, 2024

It seems that this actually pessimizes the result; see https://rust.godbolt.org/z/c1qebWbxM.

LLVM is not good at recombining the icmp and add/sub. Notice that the original code correctly takes advantage of the zero flag from the inc instruction, while the new code needs to do the cmp and inc separately.

(The corresponding PR for checked_sub had a similar problem, and I submitted a fix for it.)

Hey @scottmcm , did you see this comment about this PR? Do you have thoughts about the claim that this is a pessimization, at least on x86 it seems?

@pnkfelix
Copy link
Copy Markdown
Contributor

pnkfelix commented Jul 2, 2024

visiting for weekly rustc-perf triage

  • PR was analyzed and thought to be a net win, despite the anticipated regression to compiler instruction-counts
  • but there was a bystander follow-up comment that the result here might be a pessimization, at least for Intel x86.
  • not marking as triaged, in hopes that follow-up comment gets addressed in some manner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants