Skip to content

perf: use get_unchecked for TwoWaySearcher#155607

Open
KowalskiThomas wants to merge 1 commit intorust-lang:mainfrom
KowalskiThomas:kowalski/perf-use-get_unchecked-for-pattern
Open

perf: use get_unchecked for TwoWaySearcher#155607
KowalskiThomas wants to merge 1 commit intorust-lang:mainfrom
KowalskiThomas:kowalski/perf-use-get_unchecked-for-pattern

Conversation

@KowalskiThomas
Copy link
Copy Markdown

@KowalskiThomas KowalskiThomas commented Apr 21, 2026

What is this PR?

This is related to #27721.

This PR is a proposal for a performance improvement in std::pattern.

Profiling of https://github.com/quickwit-oss/quickwit in production shows that TwoWaySearcher::next is one of the most CPU-time-consuming functions, so I thought I would give it a look.
I read the contribution guide and this seems to be a fitting proposal.

It seems like TwoWaySearcher::next and TwoWaySearcher::next_back could be made faster by using get_unchecked in the inner loop comparisons instead of regular indexing, which is safe in the conditions where it would be done (indices are within bounds by construction).
I added some SAFETY comments in the code to explain why this is safe, as I believe is customary in those cases (and according to this page as well).

Benchmarks

I ran the existing bencharmks before/after the changes (only on my laptop, I can run them in other places if that's necessary).

./x.py bench library/coretests -- pattern::

We seem to be getting a ~7.5-12% performance improvement at a very low cost, which sounds worthwhile to me.
But this is the first time I'm proposing a change in Rust, so I'm looking forward to feedback on this.

BEFORE CHANGES
    pattern::ends_with_char   3398.91ns/iter +/- 526.28
    pattern::ends_with_str    3545.04ns/iter +/- 1108.76
    pattern::starts_with_char 3348.31ns/iter +/- 352.38
    pattern::starts_with_str  3710.59ns/iter +/- 435.57

AFTER CHANGES
    pattern::ends_with_char   3125.99ns/iter +/- 567.09  (-8.03%)
    pattern::ends_with_str    3106.43ns/iter +/- 258.33  (-12.38%)
    pattern::starts_with_char 3094.55ns/iter +/- 595.42  (-7.59%)
    pattern::starts_with_str  3365.75ns/iter +/- 268.88  (-9.29%)

System info for the benchmarks run

Details
Based on commit 8317fef20409adedaa7c385fa6e954867bf626fc

rustc 1.96.0-dev
binary: rustc
commit-hash: unknown
commit-date: unknown
host: aarch64-apple-darwin
release: 1.96.0-dev
LLVM version: 22.1.2

Apple M4 Max
16
64 GB
ProductName:		macOS
ProductVersion:		26.3
BuildVersion:		25D125
(this was run on AC and without any heavy load from other apps or whatnot)

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 21, 2026
@rust-log-analyzer

This comment has been minimized.

@KowalskiThomas KowalskiThomas force-pushed the kowalski/perf-use-get_unchecked-for-pattern branch from 6d37582 to 40692f5 Compare April 21, 2026 20:17
@KowalskiThomas KowalskiThomas marked this pull request as ready for review April 21, 2026 21:39
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 21, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 21, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented Apr 21, 2026

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @scottmcm, libs
  • @scottmcm, libs expanded to 7 candidates
  • Random selection from Mark-Simulacrum, jhpratt, scottmcm

Copy link
Copy Markdown
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review past the first block. In general it seems promising that you're seeing good perf improvement but I'm also hesitant to approve unsafe code in these fairly tricky string searching routines. I'm not sure how thorough our test coverage is on boundary conditions that would have caused panics in the indexing before... at minimum I don't think we have any fuzzing which would normally help build confidence.

cc @BurntSushi, I'm wondering if the out-of-tree(?) implementations for ripgrep/regex of the algorithms here have these optimizations applied?

View changes since this review

// `self.position + i < haystack.len()`.
// Every path that mutates `self.position` below either returns or re-enters `'search`,
// which re-runs the check before reaching the loop again.
// `i < needle.len()` also guarantees `needle.get_unchecked(i)` is safe.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you benchmark just changing the haystack indexing? It seems like LLVM should be able to make use of the for loops range being bounded by needle.len() to avoid that indexing actually staying in the final program.

The haystack bit (self.position + needle.len() <= haystack.len()) doesn't seem quite sufficient to me without deeper scrutiny of the underlying algorithm, since self.position is getting updated by a non-obviously +1 on each iteration... and if it was a +1 on each iteration, I'd guess LLVM would be able to simplify this itself?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll give that a go!

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 26, 2026
@KowalskiThomas
Copy link
Copy Markdown
Author

@Mark-Simulacrum Thanks for taking a look! I appreciate it.

I'm not sure how thorough our test coverage is on boundary conditions that would have caused panics in the indexing before... at minimum I don't think we have any fuzzing which would normally help build confidence.

Is there anything I could do to increase this confidence (regarding testing on edge cases and/or fuzzing)? If that requires additional work somewhere else and you're open to me contributing there, I'm definitely open to.

As I mentioned in the description, this function is currently a hotspot of ours and we'd really like to see it go faster.
Of course we could consider the rewrite route but that has obvious downsides of its own and they probably are way worse (for us) than doing what is needed to make the reference implementation faster (on top of making things better for others as well).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants