Skip to content

Conversation

@fyfyrchik
Copy link
Contributor

Hello, thanks for the awesome library!

I was looking for something to optimize, so here it is:
Bit twiddling hacks page has simple algorithm for calculating log10 of an integer:
https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10

Implement it here.

   goos: linux
   goarch: amd64
   pkg: github.com/holiman/uint256
   cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                         │     old      │                 new                 │
                         │    sec/op    │   sec/op     vs base                │
   Log10/Log10/uint256-8   1016.5n ± 6%   762.9n ± 1%  -24.95% (p=0.000 n=10)
   Log10/Log10/big-8        27.66µ ± 5%   28.56µ ± 5%        ~ (p=0.393 n=10)
   geomean                  5.303µ        4.668µ       -11.97%

                         │      old       │                  new                  │
                         │      B/op      │     B/op      vs base                 │
   Log10/Log10/uint256-8     0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
   Log10/Log10/big-8       22.88Ki ± 0%     22.88Ki ± 0%       ~ (p=1.000 n=10) ¹
   geomean                              ²                 +0.00%                ²
   ¹ all samples are equal
   ² summaries must be >0 to compute geomean

                         │     old      │                 new                 │
                         │  allocs/op   │ allocs/op   vs base                 │
   Log10/Log10/uint256-8   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
   Log10/Log10/big-8       510.0 ± 0%     510.0 ± 0%       ~ (p=1.000 n=10) ¹
   geomean                            ²               +0.00%                ²
   ¹ all samples are equal
   ² summaries must be >0 to compute geomean

Signed-off-by: Evgenii Stratonikov <[email protected]>
@fyfyrchik fyfyrchik marked this pull request as draft August 23, 2023 18:40
@fyfyrchik fyfyrchik force-pushed the speedup-log10 branch 2 times, most recently from d8ce79b to da9e95c Compare August 23, 2023 18:42
@codecov
Copy link

codecov bot commented Aug 23, 2023

Codecov Report

Merging #141 (0330ce4) into master (e42b95a) will decrease coverage by 0.13%.
Report is 1 commits behind head on master.
The diff coverage is 71.42%.

❗ Current head 0330ce4 differs from pull request most recent head a074b56. Consider uploading reports for the commit a074b56 to get more accurate results

@@             Coverage Diff             @@
##            master     #141      +/-   ##
===========================================
- Coverage   100.00%   99.87%   -0.13%     
===========================================
  Files            5        5              
  Lines         1646     1642       -4     
===========================================
- Hits          1646     1640       -6     
- Misses           0        1       +1     
- Partials         0        1       +1     

@fyfyrchik fyfyrchik marked this pull request as ready for review August 23, 2023 18:43
@fyfyrchik
Copy link
Contributor Author

CI benchmarks agree (about the same ~20% improvement):

BenchmarkLog10/Log10/uint256-36                           969393              1241 ns/op               0 B/op          0 allocs/op

From #136

BenchmarkLog10/Log10/uint256-36                     	  705776	      1597 ns/op	       0 B/op	       0 allocs/op

Copy link
Owner

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nitpick, otherwise LGTM. Thanks!

Comment on lines +1368 to +1369
if bitlen == 0 {
return 0
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really need this special case, since the code below it works equally well for the zero-case.

On one hand, I don't see any improvement in the benchmark by removing it, but on the other hand: if we don't need nor benefit a lot from additional code/special casing, then it's better not to have it.

So I lean towards dropping it... (?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function comment says we return 0 for 0, not -Inf.
I tried to stay compatible and also think Log10(0) = 0 is a good thing because uint size is architecture dependent, so with 0 we have fixed predictable value on every architecture.
Added a test for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we may check t != 0 case in the main if, what do you think?

Copy link
Owner

@holiman holiman Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function comment says we return 0 for 0

Doesn't it? I thought it did. Ah right, it would fall into the clause below..

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really need this special case, since the code below it works equally well for the zero-case.

That statement was wrong. I don't know, might be better to put it back the way it was. That big clause is complicated enough as it is :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed.

@fyfyrchik fyfyrchik force-pushed the speedup-log10 branch 2 times, most recently from 0a76dce to 91b3c56 Compare August 25, 2023 07:08
Bit twiddling hacks page has simpler algorithm.
https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10

Also, add more testcases on boundary integers to catch possible
off-by-one errors.

```
goos: linux
goarch: amd64
pkg: github.com/holiman/uint256
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                      │     old      │                 new                 │
                      │    sec/op    │   sec/op     vs base                │
Log10/Log10/uint256-8   1016.5n ± 6%   762.9n ± 1%  -24.95% (p=0.000 n=10)
Log10/Log10/big-8        27.66µ ± 5%   28.56µ ± 5%        ~ (p=0.393 n=10)
geomean                  5.303µ        4.668µ       -11.97%

                      │      old       │                  new                  │
                      │      B/op      │     B/op      vs base                 │
Log10/Log10/uint256-8     0.000 ± 0%       0.000 ± 0%       ~ (p=1.000 n=10) ¹
Log10/Log10/big-8       22.88Ki ± 0%     22.88Ki ± 0%       ~ (p=1.000 n=10) ¹
geomean                              ²                 +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                      │     old      │                 new                 │
                      │  allocs/op   │ allocs/op   vs base                 │
Log10/Log10/uint256-8   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Log10/Log10/big-8       510.0 ± 0%     510.0 ± 0%       ~ (p=1.000 n=10) ¹
geomean                            ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```

Signed-off-by: Evgenii Stratonikov <[email protected]>
Copy link
Owner

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

@holiman holiman changed the title Speedup log10 Optimize Log10() Sep 2, 2023
@holiman holiman merged commit b4f79ca into holiman:master Sep 2, 2023
@holiman holiman mentioned this pull request Sep 2, 2023
@fyfyrchik fyfyrchik deleted the speedup-log10 branch September 5, 2023 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants