ARROW-11822: [Rust][Datafusion] Support case sensitive comparisons for functions and aggregates#9827
Closed
alamb wants to merge 1 commit intoapache:masterfrom
Closed
ARROW-11822: [Rust][Datafusion] Support case sensitive comparisons for functions and aggregates#9827alamb wants to merge 1 commit intoapache:masterfrom
alamb wants to merge 1 commit intoapache:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9827 +/- ##
==========================================
+ Coverage 82.42% 82.43% +0.01%
==========================================
Files 252 252
Lines 58977 59043 +66
==========================================
+ Hits 48609 48674 +65
- Misses 10368 10369 +1
Continue to review full report at Codecov.
|
1440a98 to
e0c73d2
Compare
Contributor
Author
|
fyi @wqc200 @Dandandan @seddonm1 and @returnString |
jorgecarleitao
approved these changes
Apr 5, 2021
Member
jorgecarleitao
left a comment
There was a problem hiding this comment.
LGTM. this beauty in just less than 20 LOC. 💯
returnString
approved these changes
Apr 5, 2021
pachadotdev
pushed a commit
to pachadotdev/arrow
that referenced
this pull request
Apr 6, 2021
…r functions and aggregates Broken out from apache#9600 by @wqc200. Note this does not contain the part of apache#9600 that controls the output display of functions. # Rationale Aggregate functions are checked using case insensitive comparison (e.g. `select MAX(x)` and `select max(x)` both work - the code is [here](https://github.com/apache/arrow/blob/356c300c5ee1e2b23a83652514af11e3a731d596/rust/datafusion/src/physical_plan/aggregates.rs#L75) However, scalar functions, user defined aggregates, and user defined functions, are checked using case sensitive comparisons (e.g. `select sqrt(x)` works while `select SQRT` does not. Postgres always uses case insensitive comparison: ``` alamb=# select sqrt(x) from foo; sqrt ------ (0 rows) alamb=# select SQRT(x) from foo; sqrt ------ (0 rows) ``` # Changes Always use case insensitive comparisons for unquoted identifier comparison, both for consistency within DataFusion as well as consistency with Postgres (and the SQL standard) Adds tests that demonstrate the behavior # Notes This PR changes how user defined functions are resolved in SQL queries. If a user registers two functions with names `"my_sqrt"` and `"MY_SQRT"` previously they could both be called individually. After this PR `my_sqrt` will be called unless the user specifically put `"SQRT"` (in quotes) in their query. Closes apache#9827 from alamb/case_insensitive_functions Authored-by: Andrew Lamb <[email protected]> Signed-off-by: Andrew Lamb <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Broken out from #9600 by @wqc200. Note this does not contain the part of #9600 that controls the output display of functions.
Rationale
Aggregate functions are checked using case insensitive comparison (e.g.
select MAX(x)andselect max(x)both work - the code is hereHowever, scalar functions, user defined aggregates, and user defined functions, are checked using case sensitive comparisons (e.g.
select sqrt(x)works whileselect SQRTdoes not. Postgres always uses case insensitive comparison:Changes
Always use case insensitive comparisons for unquoted identifier comparison, both for consistency within DataFusion as well as consistency with Postgres (and the SQL standard)
Adds tests that demonstrate the behavior
Notes
This PR changes how user defined functions are resolved in SQL queries. If a user registers two functions with names
"my_sqrt"and"MY_SQRT"previously they could both be calledindividually. After this PR
my_sqrtwill be called unless the userspecifically put
"SQRT"(in quotes) in their query.