perf: use `get_unchecked` for `TwoWaySearcher` by KowalskiThomas · Pull Request #155607 · rust-lang/rust

KowalskiThomas · 2026-04-21T17:36:12Z

What is this PR?

This is related to #27721.

This PR is a proposal for a performance improvement in std::str::pattern.

Profiling of https://github.com/quickwit-oss/quickwit in production shows that TwoWaySearcher::next is one of the most CPU-time-consuming functions, so I thought I would give it a look.
I read the contribution guide and this seems to be a fitting proposal.

It seems like TwoWaySearcher::next and TwoWaySearcher::next_back could be made faster by using get_unchecked instead of regular indexing in the inner-loop comparisons. This is sound in the places where it is applied, because the indices there are within bounds by construction.
I added some SAFETY comments in the code to explain why this is safe, as I believe is customary in those cases (and according to this page as well).
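
To illustrate the kind of change described above, here is a minimal sketch. It is not the actual TwoWaySearcher code: the helper names suffix_matches_checked and suffix_matches_unchecked are hypothetical, and the up-front window check stands in for the invariant that the real searcher is assumed to establish before it enters its inner loop.

```rust
/// Hypothetical helper (not from the PR): compares `needle[start..]` with the
/// haystack window that begins at `pos`, using ordinary indexing.
fn suffix_matches_checked(haystack: &[u8], needle: &[u8], pos: usize, start: usize) -> bool {
    for i in start..needle.len() {
        // Each index expression carries a bounds check unless the optimizer
        // can prove it redundant.
        if needle[i] != haystack[pos + i] {
            return false;
        }
    }
    true
}

/// Hypothetical helper (not from the PR): same comparison, but the window is
/// validated once and the loop body uses `get_unchecked`.
fn suffix_matches_unchecked(haystack: &[u8], needle: &[u8], pos: usize, start: usize) -> bool {
    // Validate the whole window once; `checked_add` guards against overflow.
    let in_bounds = pos
        .checked_add(needle.len())
        .is_some_and(|end| end <= haystack.len());
    if !in_bounds {
        return false;
    }
    for i in start..needle.len() {
        // SAFETY: `i < needle.len()` by the loop range, and
        // `pos + i < pos + needle.len() <= haystack.len()` by the check above.
        let (n, h) = unsafe { (*needle.get_unchecked(i), *haystack.get_unchecked(pos + i)) };
        if n != h {
            return false;
        }
    }
    true
}
```

Note that the two helpers differ when the window does not fit: the checked version panics on the out-of-bounds index, while the unchecked version returns false from its up-front check.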

Benchmarks

I ran the existing benchmarks before/after the changes (only on my laptop, I can run them in other places if that's necessary).

./x.py bench library/coretests -- pattern::

We seem to be getting a ~7.5-12% performance improvement at a very low cost, which sounds worthwhile to me.
But this is the first time I'm proposing a change in Rust, so I'm looking forward to feedback on this.

BEFORE CHANGES
    pattern::ends_with_char   3398.91ns/iter +/- 526.28
    pattern::ends_with_str    3545.04ns/iter +/- 1108.76
    pattern::starts_with_char 3348.31ns/iter +/- 352.38
    pattern::starts_with_str  3710.59ns/iter +/- 435.57

AFTER CHANGES
    pattern::ends_with_char   3125.99ns/iter +/- 567.09  (-8.03%)
    pattern::ends_with_str    3106.43ns/iter +/- 258.33  (-12.38%)
    pattern::starts_with_char 3094.55ns/iter +/- 595.42  (-7.59%)
    pattern::starts_with_str  3365.75ns/iter +/- 268.88  (-9.29%)

System info for the benchmark run


Based on commit 8317fef20409adedaa7c385fa6e954867bf626fc

rustc 1.96.0-dev
binary: rustc
commit-hash: unknown
commit-date: unknown
host: aarch64-apple-darwin
release: 1.96.0-dev
LLVM version: 22.1.2

Apple M4 Max
16
64 GB
ProductName:		macOS
ProductVersion:		26.3
BuildVersion:		25D125
(this was run on AC and without any heavy load from other apps or whatnot)

rustbot · 2026-04-21T21:39:19Z

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: @scottmcm, libs
@scottmcm, libs expanded to 7 candidates
Random selection from Mark-Simulacrum, jhpratt, scottmcm

Mark-Simulacrum

Didn't review past the first block. In general it seems promising that you're seeing good perf improvement but I'm also hesitant to approve unsafe code in these fairly tricky string searching routines. I'm not sure how thorough our test coverage is on boundary conditions that would have caused panics in the indexing before... at minimum I don't think we have any fuzzing which would normally help build confidence.

cc @BurntSushi, I'm wondering if the out-of-tree(?) implementations for ripgrep/regex of the algorithms here have these optimizations applied?


Mark-Simulacrum · 2026-04-26T16:29:12Z

+                // `self.position + i < haystack.len()`.
+                // Every path that mutates `self.position` below either returns or re-enters `'search`,
+                // which re-runs the check before reaching the loop again.
+                // `i < needle.len()` also guarantees `needle.get_unchecked(i)` is safe.


Can you benchmark changing just the haystack indexing? It seems like LLVM should be able to use the fact that the for loop's range is bounded by needle.len() to keep the bounds check on the needle indexing out of the final program.

The haystack condition (self.position + needle.len() <= haystack.len()) doesn't seem quite sufficient to me without deeper scrutiny of the underlying algorithm, since self.position is updated by an amount that is not obviously +1 on each iteration... and if it were +1 on each iteration, I'd guess LLVM would be able to simplify this itself?

KowalskiThomas · 2026-04-26

Sure, I'll give that a go!
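
One safe way to explore the question above is to take the haystack window as a single checked slice, so that the per-byte comparisons no longer index the haystack directly. This is only a sketch: first_mismatch is a hypothetical helper, not code from the PR, and whether LLVM actually removes the remaining checks would need to be confirmed by benchmarking or by inspecting the generated assembly.

```rust
/// Hypothetical helper (not from the PR): returns the index into `needle` of
/// the first byte at or after `start` that differs from the haystack window
/// beginning at `pos`, or `None` if the whole suffix matches.
///
/// Panics if the window does not fit in `haystack` or if `start` exceeds
/// `needle.len()`, just as ordinary indexing would.
fn first_mismatch(haystack: &[u8], needle: &[u8], pos: usize, start: usize) -> Option<usize> {
    // One range check covers the whole window instead of one check per byte.
    let window = &haystack[pos..pos + needle.len()];
    // `window.len() == needle.len()`, so both suffixes have the same length
    // and the zipped iterators visit every remaining byte.
    window[start..]
        .iter()
        .zip(&needle[start..])
        .position(|(h, n)| h != n)
        .map(|offset| start + offset)
}
```

For example, first_mismatch(b"hello world", b"wordd", 6, 0) reports index 3, the position at which a two-way style search would decide how far to shift.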

KowalskiThomas · 2026-04-26T18:57:41Z

@Mark-Simulacrum Thanks for taking a look! I appreciate it.

> I'm not sure how thorough our test coverage is on boundary conditions that would have caused panics in the indexing before... at minimum I don't think we have any fuzzing which would normally help build confidence.

Is there anything I could do to increase this confidence (testing edge cases and/or fuzzing)? If that requires additional work somewhere else and you're open to me contributing there, I'm definitely open to that.

As I mentioned in the description, this function is currently a hotspot of ours and we'd really like to see it go faster.
Of course, we could consider rewriting it ourselves, but that has obvious downsides of its own, and for us they are probably far worse than doing what is needed to make the reference implementation faster (which would also make things better for others).
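
As one possible way to build confidence on boundary conditions, a differential test can compare the standard library's results with an obviously correct naive search over every short string from a tiny alphabet. The sketch below is an assumption about what such a test could look like, not something that exists in the test suite: naive_matches, all_strings, and check_all are hypothetical names, and it assumes that str::find and str::rfind with a &str pattern exercise the searcher under review.

```rust
/// Reference implementation: tries every window, O(n * m) but easy to audit.
/// Returns the start index of every (possibly overlapping) match.
fn naive_matches(haystack: &[u8], needle: &[u8]) -> Vec<usize> {
    if needle.len() > haystack.len() {
        return Vec::new();
    }
    (0..=haystack.len() - needle.len())
        .filter(|&i| &haystack[i..i + needle.len()] == needle)
        .collect()
}

/// Enumerates every string over `alphabet` with length `0..=max_len`.
fn all_strings(alphabet: &[char], max_len: usize) -> Vec<String> {
    let mut out = vec![String::new()];
    let mut frontier = vec![String::new()];
    for _ in 0..max_len {
        let mut next = Vec::new();
        for s in &frontier {
            for &c in alphabet {
                let mut t = s.clone();
                t.push(c);
                next.push(t);
            }
        }
        out.extend(next.iter().cloned());
        frontier = next;
    }
    out
}

/// Compares `find` and `rfind` with the naive search for every
/// haystack/needle pair; returns the number of pairs checked.
fn check_all(max_haystack: usize, max_needle: usize) -> usize {
    // An ASCII alphabet keeps byte offsets and char boundaries identical.
    let alphabet = ['a', 'b'];
    let haystacks = all_strings(&alphabet, max_haystack);
    let needles = all_strings(&alphabet, max_needle);
    let mut checked = 0;
    for h in &haystacks {
        for n in &needles {
            let expected = naive_matches(h.as_bytes(), n.as_bytes());
            assert_eq!(h.find(n.as_str()), expected.first().copied(), "find({h:?}, {n:?})");
            assert_eq!(h.rfind(n.as_str()), expected.last().copied(), "rfind({h:?}, {n:?})");
            checked += 1;
        }
    }
    checked
}
```

A two-letter alphabet produces many periodic and overlapping needles, which are the cases where shift logic tends to be delicate. Calling check_all(8, 4) covers 511 haystacks against 31 needles; running the same harness under Miri or with debug assertions enabled would additionally be needed to catch out-of-bounds reads that do not change the result.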

rustbot added the S-waiting-on-author and T-libs labels Apr 21, 2026


perf: use get_unchecked for TwoWaySearcher

40692f5

KowalskiThomas force-pushed the kowalski/perf-use-get_unchecked-for-pattern branch from 6d37582 to 40692f5 April 21, 2026 20:17

KowalskiThomas marked this pull request as ready for review April 21, 2026 21:39

rustbot added the S-waiting-on-review label Apr 21, 2026

rustbot assigned Mark-Simulacrum Apr 21, 2026

rustbot removed the S-waiting-on-author label Apr 21, 2026

Mark-Simulacrum reviewed Apr 26, 2026


Mark-Simulacrum added the S-waiting-on-author label and removed the S-waiting-on-review label Apr 26, 2026

KowalskiThomas wants to merge 1 commit into rust-lang:main from KowalskiThomas:kowalski/perf-use-get_unchecked-for-pattern

4 participants
