Commit fc06455
Optimize LIKE for more relaxed patterns (apache#8594)
Summary:
In this PR we optimize LIKE operations for patterns which I call them
kRelaxed[Prefix|Suffix] patterns, e.g.
- kRelaxedPrefix: _a_bc%%
- kRelaxedSuffix: %%_a_bc
'Relaxed' here means there is less restrictions than their counterparts.
The algorithm of recognizing these relaxed patterns can be explained by
an example, say we have a pattern '___hello___%%', it is split into 4
sub patterns:
- [0] kSingleCharWildcard: ___
- [1] kLiteralString: hello
- [2] kSingleCharWildcard: ___
- [3] kAnyCharsWildcard: %%
Since the 'kAnyCharsWildcard' only occurs at the end of the pattern, we can
determine it is a kRelaxedPrefix pattern, and then use the first 3 fixed
sub-patterns to do the matching.
The benchmark result:
Before(kGeneric):
```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp relative time/iter iters/s
============================================================================
like_generic##like_generic 1.34s 747.38m
----------------------------------------------------------------------------
----------------------------------------------------------------------------
like_prefix##like_prefix 340.30ms 2.94
like_prefix##like_relaxed_prefix_1 334.77ms 2.99
like_prefix##like_relaxed_prefix_2 350.70ms 2.85
like_prefix##starts_with 5.35ms 187.05
like_substring##like_substring 1.26s 790.87m
like_substring##strpos 20.55ms 48.67
like_suffix##like_suffix 957.06ms 1.04
like_suffix##like_relaxed_suffix_1 935.90ms 1.07
like_suffix##like_relaxed_suffix_2 1.08s 926.79m
like_suffix##ends_with 5.35ms 187.07
```
After(kRelaxedPrefix, kRelaxedSuffix):
```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp relative time/iter iters/s
============================================================================
like_generic##like_generic 1.48s 674.92m
----------------------------------------------------------------------------
----------------------------------------------------------------------------
like_prefix##like_prefix 7.05ms 141.80
like_prefix##like_relaxed_prefix_1 9.06ms 110.36
like_prefix##like_relaxed_prefix_2 8.55ms 116.94
like_prefix##starts_with 5.34ms 187.22
like_substring##like_substring 22.47ms 44.50
like_substring##strpos 20.72ms 48.27
like_suffix##like_suffix 7.05ms 141.82
like_suffix##like_relaxed_suffix_1 9.08ms 110.16
like_suffix##like_relaxed_suffix_2 8.52ms 117.30
like_suffix##ends_with 5.35ms 187.07
```
The speedup for kRelaxedPrefix is about 40x, speedup for kRelaxedSuffix is about 100x.
Pull Request resolved: facebookincubator/velox#8594
Reviewed By: Yuhta
Differential Revision: D53264233
Pulled By: mbasmanova
fbshipit-source-id: 5d5b3b639dbaee83194c840aba0ba7de8f978e771 parent 790d6bd commit fc06455
File tree
4 files changed
+970
-305
lines changed- velox
- benchmarks/basic
- functions/lib
- tests
4 files changed
+970
-305
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
38 | | - | |
39 | 38 | | |
40 | 39 | | |
41 | 40 | | |
42 | 41 | | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
49 | | - | |
50 | | - | |
51 | | - | |
52 | | - | |
53 | | - | |
54 | | - | |
55 | | - | |
56 | | - | |
57 | | - | |
58 | | - | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
64 | 74 | | |
65 | 75 | | |
66 | 76 | | |
| 77 | + | |
67 | 78 | | |
| 79 | + | |
68 | 80 | | |
69 | 81 | | |
70 | 82 | | |
71 | | - | |
72 | | - | |
| 83 | + | |
| 84 | + | |
73 | 85 | | |
74 | 86 | | |
75 | 87 | | |
76 | 88 | | |
77 | | - | |
78 | | - | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
79 | 99 | | |
80 | 100 | | |
81 | 101 | | |
82 | 102 | | |
83 | | - | |
84 | | - | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
85 | 113 | | |
86 | 114 | | |
87 | 115 | | |
88 | 116 | | |
89 | | - | |
90 | | - | |
| 117 | + | |
| 118 | + | |
91 | 119 | | |
92 | 120 | | |
93 | 121 | | |
| |||
0 commit comments