Datum based like kernels (#4595)#4732
Conversation
arrow-string/src/predicate.rs
Outdated
| /// A string based predicate | ||
| pub enum Predicate<'a> { | ||
| Eq(&'a str), | ||
| IEqAscii(&'a str), |
There was a problem hiding this comment.
maybe it is obvious to others that IEqAscii means "case insensitive ascii equal" but it might help to explicitly explain what each enum here is
|
|
||
| /// Evaluate this predicate against the given haystack | ||
| pub fn evaluate(&self, haystack: &str) -> bool { | ||
| match self { |
There was a problem hiding this comment.
this is pretty neat to mostly use the built in Rust code 👍
| dict_function!("ENDSWITH(left, right)", ends_with_dict, ends_with); | ||
| dict_function!("CONTAINS(left, right)", contains_dict, contains); | ||
|
|
||
| /// Perform SQL `left LIKE right` operation on [`StringArray`] / [`LargeStringArray`]. |
There was a problem hiding this comment.
This documentation and example seems valuable -- could you please port it over to Predicate::like?
| /// | ||
| /// Case insensitive version of [`like_utf8`] | ||
| /// | ||
| /// Note: this only implements loose matching as defined by the Unicode standard. For example, |
There was a problem hiding this comment.
this detail, likewise would be nice to transfer over
| None => r.value(0), | ||
| }; | ||
| op_scalar(op, l, l_v, scalar) | ||
| } else { |
There was a problem hiding this comment.
This is quite a nice formulation ❤️
There was a problem hiding this comment.
Thank you, it took much bashing my head against my keyboard 😆
The performance is likely not spectacular, but I'm also not sure how important it is to optimise the performance of non-scalar like kernels
| ); | ||
|
|
||
| #[test] | ||
| fn test_replace_like_wildcards() { |
There was a problem hiding this comment.
I think these tests should be carried over as tests for fn regex_like(pattern: &str, case_insensitive: bool) -> Result<Regex, ArrowError> {
It is the last of the dyn binary kernels. There are a few kernels, e.g. regex and some boolean kernels, that never had dyn nor scalar variants, but this is the last of the dyn scalar kernels AFAIK |
Do you plan to add such |
I personally don't, but we could probably create tickets and see if someone in the community wishes to pick this up |
I think creating tickets would be a good idea (though of course, I love tickets) |
Which issue does this PR close?
Closes #4595
Closes #2837
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?