Port ArrayHas family to functions-array#9496
Conversation
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
|
Contains #9477 |
| } | ||
| } | ||
|
|
||
| #[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
@guojidan I'm thinking of importing udfs with feature flag like this.
There was a problem hiding this comment.
if user disable array_expressions featrue, and implement array function by themselves, this analyzer will not be called 😕
There was a problem hiding this comment.
This rule is disabled, but other analyzer rules are not. As long as array_expressions is disabled, related function should be disabled too. Even they implement their own array functions, those functions are not equal to our udf based array functions, so it makes sense to me that the rewrite rule is disabled.
There was a problem hiding this comment.
So it seems like ideally that users who want to supply their own implementations for array_expressions should also supply their own Analyzer passes.
Following that logic, it seems like the datafusion_array_functions crate should supply its own analyzer pass that had array function specific rewrites.
Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
| #[cfg(feature = "array_expressions")] | |
| // Note This rewrite is only done if the built in DataFusion `array_expressions` feature is enabled. | |
| // Even if users implement their own array functions, those functions are not equal to the DataFusion | |
| // udf based array functions, so this rewrite is not corrrect | |
| #[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
So it seems like ideally that users who want to supply their own implementations for
array_expressionsshould also supply their own Analyzer passes.Following that logic, it seems like the
datafusion_array_functionscrate should supply its own analyzer pass that had array function specific rewrites.Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
I'm not sure if there are any concrete cases, but if they can have an analyzer that not only deals with array_expressions but also other expressions that would be more flexible.
However, the first step is to build an user-defined analyzer rule and find a way to register them to existing rules.
There was a problem hiding this comment.
I filed #9519
(it turns out that you can already define AnalyzerRules). I will make a PR to improve the documentation around this
alamb
left a comment
There was a problem hiding this comment.
Thank you @jayzhan211 -- I think we could merge this PR in and then file tickets for moving the array_functions specific analyzer rules into the datafusion_functions_array crate as well (see comments)
If you agree I will file the tickets
Thanks again
| } | ||
| } | ||
|
|
||
| #[cfg(feature = "array_expressions")] |
There was a problem hiding this comment.
So it seems like ideally that users who want to supply their own implementations for array_expressions should also supply their own Analyzer passes.
Following that logic, it seems like the datafusion_array_functions crate should supply its own analyzer pass that had array function specific rewrites.
Does that make sense @jayzhan211 and @guojidan?
If so, I will file a ticket to track the request / work.
Note that since I don't think users can actually add their own Analyzer passes today, we can't implement this idea yet. Thus I think we can still merge this PR.
I also think it would help to explain this rationale in comments. For example
| #[cfg(feature = "array_expressions")] | |
| // Note This rewrite is only done if the built in DataFusion `array_expressions` feature is enabled. | |
| // Even if users implement their own array functions, those functions are not equal to the DataFusion | |
| // udf based array functions, so this rewrite is not corrrect | |
| #[cfg(feature = "array_expressions")] |
Co-authored-by: Andrew Lamb <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
Part of #9285
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?