Skip to content

Conversation

@klsince
Copy link
Contributor

@klsince klsince commented Oct 18, 2022

This PR is part of the work to support multi-datadir for Pinot server as tracked by #8843.

Following up #9598 and #9306, this one has extended TableRebalancer to find out which segments need to change their target tier, and reuse SegmentReloadMessage to notify servers to move segment to target tier locally. Segments that need to relocate to new servers are still handled by the existing logic of table rebalancer (i.e. by updating the segment placement in Helix ideal state). The new logic here only kicks in when all segments are already placed on their ideal servers.

Release Note

controller.segmentRelocator.enableLocalTierMigration - to turn this on/off; disabled by default.

@klsince klsince force-pushed the notify_with_segment_reload_msg branch from 577dc49 to 3172632 Compare October 18, 2022 21:46
@codecov-commenter
Copy link

codecov-commenter commented Oct 18, 2022

Codecov Report

Merging #9624 (7609a9c) into master (77fa36d) will decrease coverage by 2.19%.
The diff coverage is 16.66%.

@@             Coverage Diff              @@
##             master    #9624      +/-   ##
============================================
- Coverage     28.09%   25.89%   -2.20%     
+ Complexity       53       44       -9     
============================================
  Files          1935     1934       -1     
  Lines        103815   103917     +102     
  Branches      15758    15770      +12     
============================================
- Hits          29162    26913    -2249     
- Misses        71780    74297    +2517     
+ Partials       2873     2707     -166     
Flag Coverage Δ
integration1 25.89% <16.66%> (-0.01%) ⬇️
integration2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../local/loader/TierBasedSegmentDirectoryLoader.java 0.00% <0.00%> (ø)
...roller/helix/core/relocation/SegmentRelocator.java 26.42% <10.47%> (-46.55%) ⬇️
...server/starter/helix/HelixInstanceDataManager.java 66.25% <20.00%> (-9.09%) ⬇️
...ntroller/helix/core/PinotHelixResourceManager.java 39.98% <33.33%> (-3.04%) ⬇️
...er/starter/helix/SegmentMessageHandlerFactory.java 59.55% <40.00%> (-6.34%) ⬇️
...he/pinot/common/messages/SegmentReloadMessage.java 83.33% <50.00%> (-16.67%) ⬇️
...apache/pinot/controller/BaseControllerStarter.java 76.06% <100.00%> (-1.52%) ⬇️
...va/org/apache/pinot/controller/ControllerConf.java 49.23% <100.00%> (-2.29%) ⬇️
.../pinot/core/data/manager/BaseTableDataManager.java 60.00% <100.00%> (-1.27%) ⬇️
.../apache/pinot/server/access/RequesterIdentity.java 0.00% <0.00%> (-100.00%) ⬇️
... and 218 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Jackie-Jiang Jackie-Jiang added feature release-notes Referenced by PRs that need attention when compiling the next release notes labels Oct 18, 2022
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level question: why is the logic implemented in the table rebalancer? IMO it should be part of the tier assigner

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a separate config for this? Isn't this the same as tier assigner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SegmentTierAssigner just calculates target tiers for segments, but not trigger the real migration.

The SegmentRelocator is extended to kick off the tier migration automatically. But it can be turned off via this knob, if someone wants to trigger this manually to fully control when the migration starts.

@klsince
Copy link
Contributor Author

klsince commented Oct 18, 2022

High level question: why is the logic implemented in the table rebalancer? IMO it should be part of the tier assigner

The design decision was made to extend SegmentRelocator (not Assigner either to separate responsibilities). I extended TableRebalancer (used by SegmentRelocator) to check on a condition i.e.

if (currentAssignment.equals(targetAssignment)) {
      LOGGER.info("Table: {} is already balanced", tableNameWithType);
...

But looks like I can check on RebalanceResult.Status.DONE anyway, to move tier migration logic into SegmentRelocator to be clear.

@Jackie-Jiang
Copy link
Contributor

But why do we want to use a separate periodic task to trigger the reload? We should just trigger the reload in SegmentTierAssigner which modify the tier for the segment. Once the SegmentTierAssigner changes the target tier for a segment, the new loaded segment will already be moved, and we want the already loaded segment to have the same behavior to be consistent.

@klsince
Copy link
Contributor Author

klsince commented Oct 19, 2022

A new target tier can lead to two reactions:

  1. segment needs to go to a new server (when the target tier is only defined for a specific pool of servers as identified via serverTag in the tierConfig);
  2. segment is already on the right server, but still needs to go to a new tier, backed by a different datadir.

Action 1 is handled by TableRebalancer; Action 2 is handled by SegmentReloadMessage. Extending SegmentRelocator, it's easy to control the sequence of those two actions, to do Action 2 after Action 1 is completed. So that, we don't send SegmentReloadMessage to a server that's not the ideal server for the segment yet, and it reloads the segment unnecessarily.

@Jackie-Jiang
Copy link
Contributor

It is okay to do it within the SegmentRelocator, but we should also move the logic in SegmentTierAssigner into it. We don't want to use 2 periodic tasks to perform the same operation. Basically after assigning the tier, we should trigger the reload immediately, instead of relying on another periodic task to trigger it.

@klsince
Copy link
Contributor Author

klsince commented Oct 20, 2022

... Basically after assigning the tier, we should trigger the reload immediately, instead of relying on another periodic task to trigger it.

I think there may be use cases to separate the two: 1) keep SegmentTierAssigner running to calculate target tier continuously; 2) but not run SegmentRelocator task, instead do table rebalance and tier migration manually as planned maintenance based on prod env, like when cluster does not get queries.

@klsince klsince force-pushed the notify_with_segment_reload_msg branch from a9e5e58 to 6b8771c Compare October 21, 2022 15:41
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

@klsince klsince force-pushed the notify_with_segment_reload_msg branch from 6b8771c to 23feef3 Compare October 25, 2022 05:01
@klsince klsince force-pushed the notify_with_segment_reload_msg branch from 23feef3 to 64a292c Compare October 25, 2022 06:05
@klsince klsince force-pushed the notify_with_segment_reload_msg branch from 64a292c to 7609a9c Compare October 25, 2022 17:09
@Jackie-Jiang Jackie-Jiang merged commit 5539eb6 into apache:master Oct 25, 2022
@klsince klsince deleted the notify_with_segment_reload_msg branch October 25, 2022 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature release-notes Referenced by PRs that need attention when compiling the next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants