Speed up replaceRegexp(All|One) if the pattern is trivial #66185
Merged
rschu1ze merged 2 commits into ClickHouse:master on Jul 8, 2024
divanik
reviewed
Jul 8, 2024
static void vectorFixedConstantConstant(...) - I wonder if we can apply the same optimisation here?
This comment was marked as resolved.
divanik
approved these changes
Jul 8, 2024
This comment was marked as resolved.
Member
Author
The perf test fails because the test replaceRegexpFallback was only introduced with this PR.
The idea is to transform replaceRegexp(One|All) into replace(One|All) if the pattern is trivial.

This PR is a continuation of #62436 but, compared to the original implementation, ReplaceRegexpImpl keeps using re2 instead of OptimizedRegularExpression to keep the code simple and maintainable overall.

Measurements with haystack

Many years later as he faced the firing squad, Colonel Aureliano Buendia was to remember that distant afternoon when his father took him to discover ice.

on 5 million rows, with the complex pattern \s+ and replacement \\0\n, and with the trivial pattern ' ' (space) and replacement \n:

Courtesy of @taiyang-li!
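
To illustrate the idea, here is a minimal standalone C++ sketch of the dispatch. It is not the code from this PR: the helper names (isTrivialPattern, isTrivialReplacement, replaceAllPlain, tryReplaceAllFast) and the exact list of metacharacters are assumptions made for the example. The check here is deliberately conservative, and the requirement that the replacement contains no backslash substitutions such as \\0 is likewise an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: a pattern counts as "trivial" if it is non-empty and
// contains no character with special meaning in regex syntax. Conservative:
// an escaped literal such as "\\." is also treated as non-trivial.
bool isTrivialPattern(std::string_view pattern)
{
    constexpr std::string_view metacharacters = R"re(\^$.|?*+()[]{})re";
    return !pattern.empty()
        && pattern.find_first_of(metacharacters) == std::string_view::npos;
}

// Hypothetical helper: a replacement is "trivial" if it has no backslash,
// i.e. no substitutions like \0 .. \9 (assumption for this sketch).
bool isTrivialReplacement(std::string_view replacement)
{
    return replacement.find('\\') == std::string_view::npos;
}

// Plain substring replacement of all non-overlapping occurrences.
std::string replaceAllPlain(
    std::string_view haystack, std::string_view needle, std::string_view replacement)
{
    if (needle.empty())
        return std::string(haystack);

    std::string result;
    result.reserve(haystack.size());
    std::size_t pos = 0;
    for (;;)
    {
        const std::size_t match = haystack.find(needle, pos);
        if (match == std::string_view::npos)
            break;
        result.append(haystack.data() + pos, match - pos); // text before the match
        result.append(replacement.data(), replacement.size());
        pos = match + needle.size();
    }
    result.append(haystack.data() + pos, haystack.size() - pos); // remaining tail
    return result;
}

// Returns the replaced string if the fast path applies, otherwise std::nullopt,
// in which case the caller would fall back to the regex engine (re2 in this PR).
std::optional<std::string> tryReplaceAllFast(
    std::string_view haystack, std::string_view pattern, std::string_view replacement)
{
    if (!isTrivialPattern(pattern) || !isTrivialReplacement(replacement))
        return std::nullopt;
    return replaceAllPlain(haystack, pattern, replacement);
}
```

With the two cases measured above, the pattern \s+ with replacement \\0\n would not qualify and would stay on the regex path, while the pattern ' ' with replacement \n would take the plain-replace path.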
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Functions replaceRegexpAll and replaceRegexpOne are now significantly faster if the pattern is trivial, i.e. contains no metacharacters, pattern classes, flags, grouping characters etc. (Thanks to Taiyang Li)