Skip to content

Rewrite LIKE expressions to range searches #71421

@rschu1ze

Description

@rschu1ze

To avoid regexp pattern matching, optimizers in many databases rewrite LIKE patterns of the form x = 'abc%' (prefix patterns) into x >= 'abc' AND x < 'abd'. Prefix patterns occur e.g. in TPC-H queries Q14, Q16 and Q20. Similar optimizations can be done for ILIKE and NOT (I)LIKE.

Rewriting is already done during primary key analysis (

"like",
[] (RPNElement & out, const Field & value)
{
if (value.getType() != Field::Types::String)
return false;
String prefix = extractFixedPrefixFromLikePattern(value.safeGet<const String &>(), /*requires_perfect_prefix*/ false);
if (prefix.empty())
return false;
String right_bound = firstStringThatIsGreaterThanAllStringsWithPrefix(prefix);
out.function = RPNElement::FUNCTION_IN_RANGE;
out.range = !right_bound.empty()
? Range(prefix, true, right_bound, false)
: Range::createLeftBounded(prefix, true);
return true;
}
},
{
"notLike",
[] (RPNElement & out, const Field & value)
{
if (value.getType() != Field::Types::String)
return false;
String prefix = extractFixedPrefixFromLikePattern(value.safeGet<const String &>(), /*requires_perfect_prefix*/ true);
if (prefix.empty())
return false;
String right_bound = firstStringThatIsGreaterThanAllStringsWithPrefix(prefix);
out.function = RPNElement::FUNCTION_NOT_IN_RANGE;
out.range = !right_bound.empty()
? Range(prefix, true, right_bound, false)
: Range::createLeftBounded(prefix, true);
return true;
}
},
) but it is not performed for ordinary scans.

Before implementing this item, it is worth to check potential performance improvements in TPC-H (i.e. manually rewrite Q14, Q16, Q20).

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions