Skip to content

Conversation

@walterddr
Copy link
Contributor

@walterddr walterddr commented Jul 21, 2023

Description

Support syntax AGG_FUN(...) FILTER (WHERE ...), such as: SELECT sum(val) FILTER (WHERE col > 10) FROM tbl

Follow ups

Major

  1. IS NULL and IS NOT NULL as agg filter is not supported
  2. SQL basic call-binding and Aggregate call-binding has mismatched info regarding filterArg and thus some time create unsupported nullable/non-nullable type during inferReturnType method.
    • this looks like a calcite bug, will do more research
  3. mixed/conflict filter is not supported, for example SELECT COUNT(*) FILTER (WHERE a = 1 OR a = 2) FROM tbl WHERE a = 2 will result in a compilation error
    • occurred on v1 engine as well (specifically, FilterAggregateOperator), can be individually fixed
  4. Pinot behavior is to not return any rows if all of the aggregate filter function has no matches --> this conflicts with standard behavior, which should return a null.
    • require changes in nullability handling in v1 aggregate function behavior

Minor

  1. ServerPlanRequestVisitor: agg with filter and group-by causes conversion issue on v1 if the group-by field is not in the select list -- work around: selecting out all the group-by fields
  2. IN clause not supported: FILTER WHERE clause with IN hard-wired to translate into subquery in calcite for some reason -- work around: use OR and equal

@codecov-commenter
Copy link

codecov-commenter commented Jul 21, 2023

Codecov Report

Merging #11144 (b18b6fd) into master (4a12ebf) will decrease coverage by 0.01%.
Report is 6 commits behind head on master.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #11144      +/-   ##
==========================================
- Coverage    0.11%    0.11%   -0.01%     
==========================================
  Files        2223     2225       +2     
  Lines      119273   119588     +315     
  Branches    18055    18109      +54     
==========================================
  Hits          137      137              
- Misses     119116   119431     +315     
  Partials       20       20              
Flag Coverage Δ
integration1temurin11 0.00% <0.00%> (ø)
integration1temurin17 0.00% <0.00%> (ø)
integration1temurin20 0.00% <0.00%> (ø)
integration2temurin11 0.00% <0.00%> (ø)
integration2temurin17 0.00% <0.00%> (?)
integration2temurin20 0.00% <0.00%> (ø)
unittests1temurin11 0.00% <0.00%> (ø)
unittests1temurin17 0.00% <0.00%> (ø)
unittests1temurin20 0.00% <0.00%> (ø)
unittests2temurin11 0.11% <0.00%> (-0.01%) ⬇️
unittests2temurin17 ?
unittests2temurin20 0.11% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
.../apache/pinot/common/datablock/DataBlockUtils.java 0.00% <0.00%> (ø)
...pinot/query/parser/CalciteRexExpressionParser.java 0.00% <0.00%> (ø)
...he/pinot/query/planner/plannode/AggregateNode.java 0.00% <0.00%> (ø)
...inot/query/runtime/operator/AggregateOperator.java 0.00% <0.00%> (ø)
...untime/operator/MultistageAggregationExecutor.java 0.00% <0.00%> (ø)
...ry/runtime/operator/MultistageGroupByExecutor.java 0.00% <0.00%> (ø)
.../query/runtime/operator/block/DataBlockValSet.java 0.00% <0.00%> (ø)
...untime/operator/block/FilteredDataBlockValSet.java 0.00% <0.00%> (ø)
.../pinot/query/runtime/plan/PhysicalPlanVisitor.java 0.00% <0.00%> (ø)
.../runtime/plan/server/ServerPlanRequestVisitor.java 0.00% <0.00%> (ø)

... and 18 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@walterddr walterddr force-pushed the pr_add_agg_filter branch from 0f55841 to 4c0165d Compare July 21, 2023 15:15
@walterddr walterddr marked this pull request as ready for review July 21, 2023 15:20
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont know if this is the best API in DataBlockUtils to extract this. alternative is to extract the filter booleans and the IntValues separately, but i cannot make this change individually without changing how agg functions work.

Copy link
Contributor

@xiangfu0 xiangfu0 Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to create a FilteredTransferableBlock wrapping up TransferableBlock for AggregateOperator and others to use?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling shouldn't use these many if-else checks

@Jackie-Jiang Jackie-Jiang added feature multi-stage Related to the multi-stage query engine labels Jul 22, 2023
@walterddr walterddr force-pushed the pr_add_agg_filter branch from 4c0165d to b18b6fd Compare July 28, 2023 00:57
@walterddr walterddr merged commit 0598955 into apache:master Jul 28, 2023
walterddr added a commit that referenced this pull request Jul 29, 2023
enable theta sketch test for v2 after #11144 and #11153 

Co-authored-by: Rong Rong <[email protected]>
s0nskar pushed a commit to s0nskar/pinot that referenced this pull request Aug 10, 2023
* [init][agg] support agg filter where clause
* limitation still applies for nullable vs non-nullable results and agg filter merging with select filter, will be address in follow ups.

---------

Co-authored-by: Rong Rong <[email protected]>
s0nskar pushed a commit to s0nskar/pinot that referenced this pull request Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants