-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Rewrite LIKE expressions to range searches #71421
Copy link
Copy link
Closed
Labels
Description
To avoid regexp pattern matching, optimizers in many databases rewrite LIKE patterns of the form x = 'abc%' (prefix patterns) into x >= 'abc' AND x < 'abd'. Prefix patterns occur e.g. in TPC-H queries Q14, Q16 and Q20. Similar optimizations can be done for ILIKE and NOT (I)LIKE.
Rewriting is already done during primary key analysis (
ClickHouse/src/Storages/MergeTree/KeyCondition.cpp
Lines 356 to 396 in 5f1ffae
| "like", | |
| [] (RPNElement & out, const Field & value) | |
| { | |
| if (value.getType() != Field::Types::String) | |
| return false; | |
| String prefix = extractFixedPrefixFromLikePattern(value.safeGet<const String &>(), /*requires_perfect_prefix*/ false); | |
| if (prefix.empty()) | |
| return false; | |
| String right_bound = firstStringThatIsGreaterThanAllStringsWithPrefix(prefix); | |
| out.function = RPNElement::FUNCTION_IN_RANGE; | |
| out.range = !right_bound.empty() | |
| ? Range(prefix, true, right_bound, false) | |
| : Range::createLeftBounded(prefix, true); | |
| return true; | |
| } | |
| }, | |
| { | |
| "notLike", | |
| [] (RPNElement & out, const Field & value) | |
| { | |
| if (value.getType() != Field::Types::String) | |
| return false; | |
| String prefix = extractFixedPrefixFromLikePattern(value.safeGet<const String &>(), /*requires_perfect_prefix*/ true); | |
| if (prefix.empty()) | |
| return false; | |
| String right_bound = firstStringThatIsGreaterThanAllStringsWithPrefix(prefix); | |
| out.function = RPNElement::FUNCTION_NOT_IN_RANGE; | |
| out.range = !right_bound.empty() | |
| ? Range(prefix, true, right_bound, false) | |
| : Range::createLeftBounded(prefix, true); | |
| return true; | |
| } | |
| }, |
Before implementing this item, it is worth to check potential performance improvements in TPC-H (i.e. manually rewrite Q14, Q16, Q20).
Reactions are currently unavailable