Skip to content

Conversation

@walterddr
Copy link
Contributor

@walterddr walterddr commented Nov 2, 2023

this fixes: #11927.

after this PR, when running

SET maxRowsInJoin=1;
SET joinOverflowMode='BREAK';
WITH tmp AS (
  SELECT   /*+ aggOptions(num_groups_limit='2') */
	attr.userUUID,
	attr.deviceOS,
	min(attr.totalTrips)     min_total_trips,
	1                   agg_type 
  FROM userAttributes attr
  GROUP BY 1, 2
)
SELECT * FROM userGroups JOIN tmp ON userGroups.userUUID = tmp.userUUID

returns

  "stageStats": {
    "1": {
      "numBlocks": 28,
      "numRows": 1,
      "stageExecutionTimeMs": 136,
      "stageExecutionUnit": 7,
      "stageExecWallTimeMs": 35,
      "numJoinLimitReached": true        ------- show exactly which stage causes join limit reached
    },
    "2": {
      "numBlocks": 18,
      "numRows": 4988,
      "stageExecutionTimeMs": 40,
      "stageExecutionUnit": 4,
      "stageExecWallTimeMs": 15,
      "tableNames": [
        "userGroups"
      ]
    },
    "3": {
      "numBlocks": 4,
      "numRows": 8,
      "stageExecutionTimeMs": 102,
      "stageExecutionUnit": 8,
      "stageExecWallTimeMs": 29,
      "numGroupsLimitReached": true    ------- show exactly which stage causes group limit reached
    },
    "4": {
      "numBlocks": 6,
      "numRows": 20000,
      "stageExecutionTimeMs": 46,
      "stageExecutionUnit": 4,
      "stageExecWallTimeMs": 15,
      "tableNames": [
        "userAttributes"
      ]
    }
  },

@walterddr walterddr marked this pull request as ready for review November 2, 2023 16:10
OPERATOR_EXEC_START_TIME_MS(32, "operatorExecStartTimeMs", MetadataValueType.LONG),
OPERATOR_EXEC_END_TIME_MS(33, "operatorExecEndTimeMs", MetadataValueType.LONG);
OPERATOR_EXEC_END_TIME_MS(33, "operatorExecEndTimeMs", MetadataValueType.LONG),
NUM_JOIN_LIMIT_REACHED(34, "numJoinsLimitReached", MetadataValueType.STRING);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saurabhd336 this might affect the partial result flag setting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more specific. numJoinsLimit can cause confusion. Modify other places accordingly

Suggested change
NUM_JOIN_LIMIT_REACHED(34, "numJoinsLimitReached", MetadataValueType.STRING);
MAX_ROWS_IN_JOIN_REACHED(34, "maxRowsInJoinReached", MetadataValueType.STRING);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea. should we be even more specific: MAX_ROWS_IN_JOIN_HASH_TABLE_REACHED ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. Maybe MAX_ROWS_IN_JOIN_TABLE_REACHED since we might optimize it to not use hash table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wilco

Copy link
Contributor Author

@walterddr walterddr Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it to maxRowsInJoinLimitReached b/c

  • the query option is named as maxRowsInJoin
  • other join algorithm, such as sort-merge join, there's no "table"

@walterddr walterddr marked this pull request as draft November 2, 2023 16:11
@walterddr walterddr added bugfix multi-stage Related to the multi-stage query engine labels Nov 2, 2023
@walterddr walterddr force-pushed the bugfix_warning_propagate branch from c998973 to 2f0be18 Compare November 2, 2023 18:54
@walterddr walterddr marked this pull request as ready for review November 2, 2023 19:27
@codecov-commenter
Copy link

codecov-commenter commented Nov 2, 2023

Codecov Report

Merging #11936 (df29db7) into master (8d49b11) will decrease coverage by 0.01%.
Report is 13 commits behind head on master.
The diff coverage is 92.30%.

@@             Coverage Diff              @@
##             master   #11936      +/-   ##
============================================
- Coverage     61.40%   61.40%   -0.01%     
+ Complexity     1145     1141       -4     
============================================
  Files          2378     2385       +7     
  Lines        128851   129081     +230     
  Branches      19926    19954      +28     
============================================
+ Hits          79127    79259     +132     
- Misses        44002    44092      +90     
- Partials       5722     5730       +8     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.37% <92.30%> (+26.63%) ⬆️
java-21 34.67% <88.46%> (-26.63%) ⬇️
skip-bytebuffers-false 61.40% <92.30%> (+<0.01%) ⬆️
skip-bytebuffers-true 0.00% <0.00%> (-61.26%) ⬇️
temurin 61.40% <92.30%> (-0.01%) ⬇️
unittests 61.39% <92.30%> (-0.01%) ⬇️
unittests1 46.64% <100.00%> (+0.02%) ⬆️
unittests2 27.60% <3.84%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...roker/requesthandler/BaseBrokerRequestHandler.java 45.77% <100.00%> (-0.23%) ⬇️
...a/org/apache/pinot/common/datatable/DataTable.java 92.18% <100.00%> (+0.12%) ⬆️
...t/common/response/broker/BrokerResponseNative.java 94.14% <100.00%> (+0.12%) ⬆️
...common/response/broker/BrokerResponseNativeV2.java 33.33% <ø> (ø)
...ot/common/response/broker/BrokerResponseStats.java 80.64% <ø> (ø)
...ot/core/query/reduce/ExecutionStatsAggregator.java 88.08% <100.00%> (+0.31%) ⬆️
...inot/query/runtime/operator/AggregateOperator.java 84.80% <100.00%> (+0.49%) ⬆️
...pinot/query/runtime/operator/HashJoinOperator.java 94.01% <100.00%> (-0.11%) ⬇️
...requesthandler/MultiStageBrokerRequestHandler.java 17.68% <0.00%> (-1.02%) ⬇️

... and 50 files with indirect coverage changes

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add partial response for multi-stage broker request handler

}

@JsonProperty("maxRowsInJoinReached")
public boolean ismaxRowsInJoinReached() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public boolean ismaxRowsInJoinReached() {
public boolean isMaxRowsInJoinReached() {

// Just fill up the buffer.
int remainingRows = _maxRowsInHashTable - _currentRowsInHashTable;
container = container.subList(0, remainingRows);
OperatorStats operatorStats = _opChainStats.getOperatorStats(_context, _operatorId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we set it here, we can change _resourceLimitExceededException into local variable and only create it in THROW mode

@walterddr walterddr merged commit d177866 into apache:master Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[multistage][bug] warning exceptions not populate to broker

3 participants