-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Ingestion Aggregation Feature #8611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #8611 +/- ##
=============================================
- Coverage 69.57% 14.24% -55.34%
+ Complexity 4572 168 -4404
=============================================
Files 1729 1684 -45
Lines 90179 88351 -1828
Branches 13413 13223 -190
=============================================
- Hits 62744 12585 -50159
- Misses 23083 74829 +51746
+ Partials 4352 937 -3415
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic is in general good. We can make it more robust
pinot-common/src/main/java/org/apache/pinot/common/utils/grpc/GrpcRequestBuilder.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/AggregationConfig.java
Outdated
Show resolved
Hide resolved
pinot-common/src/test/java/org/apache/pinot/common/utils/config/TableConfigSerDeTest.java
Outdated
Show resolved
Hide resolved
...st/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImplTestUtils.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ingestion/IngestionConfig.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
ce69c7b to
0d24624
Compare
noon-stripe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed comments
pinot-common/src/main/java/org/apache/pinot/common/utils/grpc/GrpcRequestBuilder.java
Outdated
Show resolved
Hide resolved
pinot-common/src/test/java/org/apache/pinot/common/utils/config/TableConfigSerDeTest.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
...st/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImplTestUtils.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/AggregationConfig.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ingestion/IngestionConfig.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ingestion/IngestionConfig.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
0d24624 to
34431d4
Compare
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we are just maintaining the map between column to its source column and the value aggregator in the IngestionAggregator. Suggest removing the IngestionAggregator and add the source column and value aggregator into the IndexContainer. This way we can avoid a lot of unnecessary map lookups
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/IngestionUtils.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
25390fa to
eefc4ef
Compare
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good. Well done!
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
...al/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/MutableSegmentImpl.java
Outdated
Show resolved
Hide resolved
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this check after the null value check, or it will throw NPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which check? I believe these are in the correct order, right?
FieldSpec fieldSpec = schema.getFieldSpecFor(columnName);
Preconditions.checkState(fieldSpec != null, "The destination column '" + columnName
+ "' of the aggregation function must be present in the schema");
Preconditions.checkState(fieldSpec.getFieldType() == FieldSpec.FieldType.METRIC,
"The destination column '" + columnName + "' of the aggregation function must be a metric column");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this block before the field spec check:
String aggregationFunction = aggregationConfig.getAggregationFunction();
if (columnName == null || aggregationFunction == null) {
throw new IllegalStateException(
"columnName/aggregationFunction cannot be null in AggregationConfig " + aggregationConfig);
}
Without checking the columnName is not null, schema.getFieldSpecFor(columnName) could throw NPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider using SUM on metric itself as the default if not configured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good suggestion, but I actually prefer to have the aggregation config be a requirement. Otherwise, you're altering the data without informing the user.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java
Outdated
Show resolved
Hide resolved
b1dd998 to
0d7bfd1
Compare
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this block before the field spec check:
String aggregationFunction = aggregationConfig.getAggregationFunction();
if (columnName == null || aggregationFunction == null) {
throw new IllegalStateException(
"columnName/aggregationFunction cannot be null in AggregationConfig " + aggregationConfig);
}
Without checking the columnName is not null, schema.getFieldSpecFor(columnName) could throw NPE
Description
This PR adds Ingestion Aggregation. The design doc can be found #8360.
This feature aggregates values at ingestion time, which reduces the number of rows stores (thus speeding up queries), using the same strategy as the 'aggregateMetrics' feature. This expands the feature to include, COUNT, MIN, and MAX, with the ability to expand further.
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
Yes (Please label as backward-incompat, and complete the section below on Release Notes)
Does this PR fix a zero-downtime upgrade introduced earlier?
Yes (Please label this as backward-incompat, and complete the section below on Release Notes)
Does this PR otherwise need attention when creating release notes? Things to consider:
New configuration options
Deprecation of configurations
Signature changes to public methods/interfaces
New plugins added or old plugins removed
Yes (Please label this PR as release-notes and complete the section on Release Notes)
Release Notes
Ingestion Pre-Aggregation is now supported for MIN, MAX, and COUNT, in addition to SUM. To enable the feature, add an aggregationConfig to the ingestionConfigs of a realtime table config. The format of the config is
"aggregationConfigs": [
{
"columnName": "destColumn",
"aggregationFunction": "MIN(srcColumn)"
}
],
The destColumn must be in the schema and the srcColumn must not be in the schema. Additionally, all destColumns must be noDictionaryColumns.
Documentation
After this PR I will add the documentation.