-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Add ArrayAgg aggregation function #11822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5c7b1e2 to
e50b66b
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #11822 +/- ##
============================================
- Coverage 63.02% 62.88% -0.14%
- Complexity 1140 1141 +1
============================================
Files 2351 2367 +16
Lines 127324 127888 +564
Branches 19611 19732 +121
============================================
+ Hits 80244 80424 +180
- Misses 41373 41733 +360
- Partials 5707 5731 +24
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably consider how to parse this. b/c the proper way to express distinct array agg is
SELECT key, ARRAY_AGG(DISTINCT col)
FROM tbl
GROUP BY key
given that our functions are not standard it is fine we do it now but good to note this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, I was also thinking about this, however it's not parsed in v1 query engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @Jackie-Jiang this is related to #11518
bd2560b to
afbd1c1
Compare
pinot-core/src/main/java/org/apache/pinot/core/common/ObjectSerDeUtils.java
Outdated
Show resolved
Hide resolved
...ore/src/main/java/org/apache/pinot/core/operator/blocks/results/AggregationResultsBlock.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/pinot/core/query/aggregation/function/array/ArrayAggFunction.java
Outdated
Show resolved
Hide resolved
.../java/org/apache/pinot/core/query/aggregation/function/array/ArrayAggBaseDoubleFunction.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/pinot/core/query/aggregation/function/array/ArrayAggFunction.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionUtils.java
Outdated
Show resolved
Hide resolved
refactor more to make code cleaner
| for (int i = 0; i < length; i++) { | ||
| if (!nullBitmap.contains(i)) { | ||
| setGroupByResult(groupByResultHolder, groupKeyArray[i], values[i]); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be postgres compatible. Documentation says:
array_agg ( anynonarray ) → anyarray
Collects all the input values, including nulls, into an array.
And in fact that is how it behaves
Example:
postgres=# insert into A(a) values (1), (2), (3), (null), (5);
INSERT 0 5
postgres=# select array_agg(a) from A;
array_agg
----------------
{1,2,3,NULL,5}
(1 row)
Add aggregation function:
ArrayAgg.ArrayAgg(column, 'dataType' [, 'isDistinct'])BOOLEAN,INT,LONG,FLOAT(only in V1),DOUBLE,STRING,TIMESTAMP.E.g.
ArrayAgg(intCol, 'INT')returnsARRAY<INT>For implementation, separated the implementations per data type:
ArrayAggFunction,ArrayAggBase<type>Function,ArrayAggDistinct<type>Function,ArrayAgg<type>FunctionNote: Float type column is treated as Double in v2 engine, so
FLOATtype is not supported.Release Note