-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Added multi column partitioning for offline table #8255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added multi column partitioning for offline table #8255
Conversation
6e34564 to
15e04b3
Compare
Codecov Report
@@ Coverage Diff @@
## master #8255 +/- ##
============================================
+ Coverage 70.78% 70.79% +0.01%
- Complexity 4243 4317 +74
============================================
Files 1631 1690 +59
Lines 85281 88583 +3302
Branches 12845 13429 +584
============================================
+ Hits 60364 62714 +2350
- Misses 20746 21478 +732
- Partials 4171 4391 +220
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
36ff76b to
c1b5a0f
Compare
c1b5a0f to
e6c6adc
Compare
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a queries test (search for *QueriesTest for examples) or integration test to ensure it works end-to-end?
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Outdated
Show resolved
Hide resolved
I have added a test which prune segments based on query - and Sure let me see if I can add an integration test. |
|
Hi @Jackie-Jiang Could you please check this PR if it can be merge now? |
|
@kmozaid This is a great feature, and I want to ensure it can work end-to-end. With the |
|
Thanks @Jackie-Jiang , Sure, Let me add integration test. |
Hi @Jackie-Jiang I have updated existing merge rollup minion task and realtime-to-offline minion task integration test which involves partitioning.Now these tests covered multi column partitioning flow. Please check if it is still required to write a separate integration tests. |
...rc/test/java/org/apache/pinot/integration/tests/MergeRollupMinionClusterIntegrationTest.java
Outdated
Show resolved
Hide resolved
...rc/test/java/org/apache/pinot/integration/tests/MergeRollupMinionClusterIntegrationTest.java
Show resolved
Hide resolved
...rg/apache/pinot/integration/tests/RealtimeToOfflineSegmentsMinionClusterIntegrationTest.java
Show resolved
Hide resolved
...rg/apache/pinot/integration/tests/RealtimeToOfflineSegmentsMinionClusterIntegrationTest.java
Show resolved
Hide resolved
...-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/MergeTaskUtils.java
Show resolved
Hide resolved
...l/src/main/java/org/apache/pinot/segment/local/indexsegment/mutable/IntermediateSegment.java
Show resolved
Hide resolved
|
There are three other places where multicolumn partitioning config is not allowed.
This feature is NOT support by REALTIME tables so first 2 checks are still valid. The last one is specific to Hadoop pre processing job, I think we can skip for this feature as we don't really support |
|
Test Cases Covered -
|
|
Hi @Jackie-Jiang , Could you please provide review feedback? please also trigger the workflow again. |
...oker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
Outdated
Show resolved
Hide resolved
...rg/apache/pinot/integration/tests/RealtimeToOfflineSegmentsMinionClusterIntegrationTest.java
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
Show resolved
Hide resolved
29af1e3 to
d10f61b
Compare
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor comments. Thanks for adding the tests!
| _partitions = partitions; | ||
| } | ||
| } | ||
| public interface PartitionSegmentPruner extends SegmentPruner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this file since this extra interface does not provide extra value (it should show as a rename from PartitionSegmentPruner to SinglePartitionColumnSegmentPruner)
...broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/SegmentPrunerFactory.java
Show resolved
Hide resolved
...n-test-base/src/test/java/org/apache/pinot/integration/tests/BaseClusterIntegrationTest.java
Outdated
Show resolved
Hide resolved
...rc/test/java/org/apache/pinot/integration/tests/MergeRollupMinionClusterIntegrationTest.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskGenerator.java
Outdated
Show resolved
Hide resolved
713a12e to
5e2afae
Compare
5e2afae to
2c4e35e
Compare
Description
This PR adds support for multi column partitioning for minion tasks and broker segment pruner.
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
Does this PR fix a zero-downtime upgrade introduced earlier?
Does this PR otherwise need attention when creating release notes? Things to consider:
release-notesand complete the section on Release Notes)Release Notes
Segment can be partitioned on multiple columns when they are moved/created by minion tasks and when data is ingested via batch ingestion. Broker can also prune segments on multiple columns partition metadata.