Skip to content

Conversation

@FidelSch
Copy link
Contributor

@FidelSch FidelSch commented Dec 3, 2025

Reading #6182 I noticed that most of the time spent running cargo run seq 4e4000003 4e4000003 was just BigUint::to_string()

Reading the code I found it is being called twice, once to get the actual string representation of the first number, and again on the last number, but only to get its length; and discarding the actual result.
This seems fine on fairly small numbers, but its efficiency degrades significantly on larger ones.

In my machine, this change resulted in a ~2x speedup on the mentioned case, and a marginal but seemingly better performance on smaller cases.

$ ~/coreutils$ hyperfine -L seq target/release/coreutils,target/release/coreutils_old "{seq} seq 4e4000003 4e4000003" 
Benchmark 1: target/release/coreutils seq 4e4000003 4e4000003
  Time (mean ± σ):     26.009 s ±  0.113 s    [User: 25.992 s, System: 0.015 s]
  Range (min … max):   25.909 s … 26.294 s    10 runs
 
Benchmark 2: target/release/coreutils_old seq 4e4000003 4e4000003
  Time (mean ± σ):     52.372 s ±  0.446 s    [User: 52.352 s, System: 0.017 s]
  Range (min … max):   51.815 s … 53.017 s    10 runs
 
Summary
  'target/release/coreutils seq 4e4000003 4e4000003' ran
    2.01 ± 0.02 times faster than 'target/release/coreutils_old seq 4e4000003 4e4000003'

@github-actions
Copy link

github-actions bot commented Dec 3, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

@ChrisDryden
Copy link
Contributor

Would be great to add your example to the benchmarks

@sylvestre
Copy link
Contributor

Would be great to add your example to the benchmarks

In a separate pr please :)

Copy link
Contributor

@anastygnome anastygnome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for not using

n.checked_ilog10().unwrap_or(0) + 1

Which should be available?

@FidelSch
Copy link
Contributor Author

FidelSch commented Dec 4, 2025

Any reason for not using

n.checked_ilog10().unwrap_or(0) + 1

Seemed unnecessary given it is just a constant. If there is any benefit to this alternative I am happy to refactor.

@codspeed-hq
Copy link

codspeed-hq bot commented Dec 6, 2025

CodSpeed Performance Report

Merging #9557 will improve performances by 58.87%

Comparing FidelSch:seq-optimization (7b79b27) with main (7c62885)

Summary

⚡ 1 improvement
✅ 126 untouched
⏩ 6 skipped1

Benchmarks breakdown

Benchmark BASE HEAD Change
seq_large_integers 2.2 ms 1.4 ms +58.87%

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sylvestre
Copy link
Contributor

any idea why tsort_input_parsing_heavy regressed ?

@github-actions
Copy link

github-actions bot commented Dec 6, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/tty/tty-eof is no longer failing!

@github-actions
Copy link

github-actions bot commented Dec 6, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@anastygnome
Copy link
Contributor

Any reason for not using

n.checked_ilog10().unwrap_or(0) + 1

Seemed unnecessary given it is just a constant. If there is any benefit to this alternative I am happy to refactor.

If I recall correctly the performance is similar to your solution and it's part of the language. Could you try sending a commit to trigger the benchmarks with this solution instead?

@sylvestre
Copy link
Contributor

note, upstream fails this way - what do you think we should do here?

$ LANG=C /usr/bin/seq "4e4000003" "4e4000003"
seq: invalid floating point argument: '4e4000003'
Try '/usr/bin/seq --help' for more information.

@FidelSch
Copy link
Contributor Author

After some digging, the maximum value seq accepts seems to be 11e4931, equivalent to the maximum value representable by an 80-bit long double; which supports the theory that GNU uses this type for their implementation.
By using BigDecimal we are significantly extending the representable range, so it makes sense that for huge numbers a direct comparison to seq is not viable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants