perf: use get_unchecked for TwoWaySearcher#155607
perf: use get_unchecked for TwoWaySearcher#155607KowalskiThomas wants to merge 1 commit intorust-lang:mainfrom
get_unchecked for TwoWaySearcher#155607Conversation
This comment has been minimized.
This comment has been minimized.
6d37582 to
40692f5
Compare
|
rustbot has assigned @Mark-Simulacrum. Use Why was this reviewer chosen?The reviewer was selected based on:
|
There was a problem hiding this comment.
Didn't review past the first block. In general it seems promising that you're seeing good perf improvement but I'm also hesitant to approve unsafe code in these fairly tricky string searching routines. I'm not sure how thorough our test coverage is on boundary conditions that would have caused panics in the indexing before... at minimum I don't think we have any fuzzing which would normally help build confidence.
cc @BurntSushi, I'm wondering if the out-of-tree(?) implementations for ripgrep/regex of the algorithms here have these optimizations applied?
| // `self.position + i < haystack.len()`. | ||
| // Every path that mutates `self.position` below either returns or re-enters `'search`, | ||
| // which re-runs the check before reaching the loop again. | ||
| // `i < needle.len()` also guarantees `needle.get_unchecked(i)` is safe. |
There was a problem hiding this comment.
Can you benchmark just changing the haystack indexing? It seems like LLVM should be able to make use of the for loops range being bounded by needle.len() to avoid that indexing actually staying in the final program.
The haystack bit (self.position + needle.len() <= haystack.len()) doesn't seem quite sufficient to me without deeper scrutiny of the underlying algorithm, since self.position is getting updated by a non-obviously +1 on each iteration... and if it was a +1 on each iteration, I'd guess LLVM would be able to simplify this itself?
There was a problem hiding this comment.
Sure, I'll give that a go!
|
@Mark-Simulacrum Thanks for taking a look! I appreciate it.
Is there anything I could do to increase this confidence (regarding testing on edge cases and/or fuzzing)? If that requires additional work somewhere else and you're open to me contributing there, I'm definitely open to. As I mentioned in the description, this function is currently a hotspot of ours and we'd really like to see it go faster. |
What is this PR?
This is related to #27721.
This PR is a proposal for a performance improvement in
std::pattern.Profiling of https://github.com/quickwit-oss/quickwit in production shows that
TwoWaySearcher::nextis one of the most CPU-time-consuming functions, so I thought I would give it a look.I read the contribution guide and this seems to be a fitting proposal.
It seems like
TwoWaySearcher::nextandTwoWaySearcher::next_backcould be made faster by usingget_uncheckedin the inner loop comparisons instead of regular indexing, which is safe in the conditions where it would be done (indices are within bounds by construction).I added some
SAFETYcomments in the code to explain why this is safe, as I believe is customary in those cases (and according to this page as well).Benchmarks
I ran the existing bencharmks before/after the changes (only on my laptop, I can run them in other places if that's necessary).
We seem to be getting a ~7.5-12% performance improvement at a very low cost, which sounds worthwhile to me.
But this is the first time I'm proposing a change in Rust, so I'm looking forward to feedback on this.
System info for the benchmarks run
Details