AaronChen0 commented May 24, 2024

Lt is branchless and can be inlined, while Cmp has two if branches and cannot be inlined. So Lt is the better choice here.

To check whether a simple function can be inlined in Go, run:

go build -gcflags='-m -m' |& grep Lt

I learned this trick a few days ago.
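
To illustrate the difference between a branchless less-than and a three-way compare, here is a minimal sketch. The type `u256` and the functions `lt` and `cmp` are hypothetical stand-ins written for this example, not the library's actual code. Both run the same borrow chain with `bits.Sub64`; `lt` only inspects the final borrow, while `cmp` needs two `if` branches to pick between -1, 0, and 1. Whether the compiler actually inlines either one depends on the Go version, so it is worth confirming with the `-gcflags` command above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// u256 is a hypothetical 256-bit integer for this sketch:
// four 64-bit limbs, least significant limb first.
type u256 [4]uint64

// lt reports whether z < x. It propagates the borrow of z - x across
// all four limbs and returns whether the final borrow is set.
// There is no if statement in the body.
func lt(z, x *u256) bool {
	_, borrow := bits.Sub64(z[0], x[0], 0)
	_, borrow = bits.Sub64(z[1], x[1], borrow)
	_, borrow = bits.Sub64(z[2], x[2], borrow)
	_, borrow = bits.Sub64(z[3], x[3], borrow)
	return borrow != 0
}

// cmp returns -1 if z < x, 0 if z == x, and 1 if z > x.
// It uses the same borrow chain but needs two branches
// to distinguish the three outcomes.
func cmp(z, x *u256) int {
	d0, borrow := bits.Sub64(z[0], x[0], 0)
	d1, borrow := bits.Sub64(z[1], x[1], borrow)
	d2, borrow := bits.Sub64(z[2], x[2], borrow)
	d3, borrow := bits.Sub64(z[3], x[3], borrow)
	if borrow == 1 {
		return -1
	}
	if d0|d1|d2|d3 == 0 {
		return 0
	}
	return 1
}

func main() {
	a := u256{1, 0, 0, 0} // 1
	b := u256{0, 1, 0, 0} // 2^64

	fmt.Println(lt(&a, &b), cmp(&a, &b)) // true -1
	fmt.Println(lt(&b, &a), cmp(&b, &a)) // false 1
	fmt.Println(lt(&a, &a), cmp(&a, &a)) // false 0
}
```

A caller that only needs to know whether one value is smaller than another can use the `lt` form directly instead of testing `cmp(...) < 0`.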

Benchmark

goos: linux
goarch: amd64
pkg: github.com/holiman/uint256
cpu: AMD Ryzen 7 7735H with Radeon Graphics         
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Mod/small/uint256-16       4.041n ± 2%   2.865n ± 1%  -29.10% (p=0.000 n=10)
Mod/mod64/uint256-16       22.68n ± 1%   21.39n ± 2%   -5.69% (p=0.000 n=10)
Mod/mod128/uint256-16      39.82n ± 1%   38.23n ± 1%   -4.01% (p=0.000 n=10)
Mod/mod192/uint256-16      36.37n ± 1%   34.69n ± 2%   -4.62% (p=0.000 n=10)
Mod/mod256/uint256-16      28.33n ± 1%   27.08n ± 1%   -4.39% (p=0.000 n=10)
DivMod/small/uint256-16    4.445n ± 1%   3.121n ± 3%  -29.80% (p=0.000 n=10)
DivMod/mod64/uint256-16    22.95n ± 1%   22.12n ± 2%   -3.64% (p=0.000 n=10)
DivMod/mod128/uint256-16   40.21n ± 2%   39.35n ± 0%   -2.16% (p=0.001 n=10)
DivMod/mod192/uint256-16   36.58n ± 2%   35.75n ± 1%   -2.26% (p=0.000 n=10)
DivMod/mod256/uint256-16   28.75n ± 1%   27.68n ± 2%   -3.70% (p=0.000 n=10)
AddMod/small/uint256-16    6.158n ± 2%   4.748n ± 0%  -22.90% (p=0.000 n=10)
AddMod/mod64/uint256-16    9.024n ± 2%   7.413n ± 1%  -17.86% (p=0.000 n=10)
AddMod/mod128/uint256-16   18.00n ± 2%   15.86n ± 1%  -11.89% (p=0.000 n=10)
AddMod/mod192/uint256-16   19.78n ± 2%   17.61n ± 2%  -10.97% (p=0.000 n=10)
AddMod/mod256/uint256-16   6.416n ± 2%   6.422n ± 1%        ~ (p=0.928 n=10)
geomean                    16.63n        14.84n       -10.76%

AaronChen0 changed the title from "uint256: optimize mod, DivMod" to "uint256: optimize Mod, DivMod" May 24, 2024
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (70cbe2b) to head (1d336e5).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #173   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            5         5           
  Lines         1632      1628    -4     
=========================================
- Hits          1632      1628    -4     

AaronChen0 changed the title from "uint256: optimize Mod, DivMod" to "uint256: optimize Mod, DivMod, AddMod" May 25, 2024

holiman left a comment

LGTM

holiman merged commit 8dfcfde into holiman:master May 27, 2024

holiman commented May 27, 2024

Well done!

AaronChen0 deleted the mod branch May 27, 2024 08:18