Skip to content

Conversation

@HxpSerein
Copy link
Collaborator

Description

Add region migration dest datanode selection algorithm for load balancing during removing datanode.

Please see https://apache-iotdb-project.feishu.cn/docx/G5aPdZJ4QozqmkxO9dsc7sEsnTe for detail design.

@liyuheng55555 liyuheng55555 self-requested a review April 15, 2025 06:17
@HxpSerein HxpSerein requested a review from Copilot April 15, 2025 06:34
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a region migration data node selection algorithm for load balancing during the removal of a datanode. Key changes include:

  • Adding a new unit test to validate the replica selection logic.
  • Replacing the old region migration plan retrieval with a new selectedRegionMigrationPlans method.
  • Implementing (or stubbing) removeNodeReplicaSelect methods in several region group allocator classes — with a full DFS‐based implementation in the GreedyCopySetRegionGroupAllocator.
  • Refactoring helper methods in RemoveDataNodeHandler to support the new migration plan selection.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

Show a summary per file
File Description
iotdb-core/confignode/src/test/java/org/apache/iotdb/confignode/manager/load/balancer/region/GreedyCopySetRemoveNodeReplicaSelectTest.java Added tests covering replica migration logic; review the randomness used in free space setup.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/impl/node/RemoveDataNodesProcedure.java Updated procedure to use selectedRegionMigrationPlans and to select the destination node from the migration plan.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/env/RemoveDataNodeHandler.java Introduced new methods for building migration plans and updating replica sets.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/balancer/region/PartiteGraphPlacementRegionGroupAllocator.java Stubbed removeNodeReplicaSelect method as a TODO for this allocation policy.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/balancer/region/GreedyCopySetRegionGroupAllocator.java Extended DFS-based replica selection logic for datanode removal and updated allocation DFS methods.
Others Interface and GreedyRegionGroupAllocator updates to include the new method signature.
Comments suppressed due to low confidence (4)

iotdb-core/confignode/src/test/java/org/apache/iotdb/confignode/manager/load/balancer/region/GreedyCopySetRemoveNodeReplicaSelectTest.java:72

  • Using Math.random() to set up free space may lead to non-deterministic test outcomes; consider using a fixed seed or predefined values to ensure reproducibility.
FREE_SPACE_MAP.put(i, Math.random());

iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/balancer/region/GreedyCopySetRegionGroupAllocator.java:139

  • Directly selecting the first element from the values of databaseAllocatedRegionGroupMap may lead to unexpected behavior if multiple databases are present; consider clarifying or validating the intended behavior.
List<TRegionReplicaSet> databaseAllocatedRegionGroups = new ArrayList<>(databaseAllocatedRegionGroupMap.values()).get(0);

iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/env/RemoveDataNodeHandler.java:353

  • Removing and then re-adding a replicaSet may alter the original ordering of allocatedReplicaSets; please ensure that any order-dependencies in subsequent logic are either updated or explicitly documented.
allocatedReplicaSets.remove(replicaSet);

iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/balancer/region/GreedyCopySetRegionGroupAllocator.java:310

  • [nitpick] The DFS approach used in removeNodeReplicaSelect may exhibit performance issues as the number of regions grows; consider reviewing the scalability of the DFS and potentially incorporating additional pruning heuristics.
if (optimalAssignments.size() >= GCR_MAX_OPTIMAL_PLAN_NUM) {

Copy link
Collaborator

@liyuheng55555 liyuheng55555 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Serein,

I haven’t finished reading the entire PR yet, for now there are a few points I’d like to discuss with you :)

Copy link
Collaborator

@liyuheng55555 liyuheng55555 Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the execution of the removeNodeReplicaSelect function not strongly coupled with GreedyCopySetRegionGroupAllocator?

From my observation, this coupling only exists in the three class members: regionCounter, databaseRegionCounter, and combinationCounter.

In this case, perhaps placing removeNodeReplicaSelect in a new class would make things clearer, as the original class and algorithm are already complex enough.

} else if (isEqual) {
optimalAssignments.add(Arrays.copyOf(currentAssignment, n));
// Prune search if we already have enough candidate solutions
if (optimalAssignments.size() >= GCR_MAX_OPTIMAL_PLAN_NUM) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the DFS will end when the number of search results reaches GCR_MAX_OPTIMAL_PLAN_NUM.

Does the algorithm always guarantee finding the optimal solutions? If the algorithm first finds GCR_MAX_OPTIMAL_PLAN_NUM suboptimal solutions, will the DFS end?

Copy link
Collaborator Author

@HxpSerein HxpSerein Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. DFS will try its best to find the best solution. It is not guaranteed to find it, but in most cases, it works well.

Map<String, List<TRegionReplicaSet>> databaseAllocatedRegionGroupMap,
Map<TConsensusGroupId, TRegionReplicaSet> remainReplicasMap) {
// TODO: Implement this method
return Collections.emptyMap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw UnsupportedOperationException will be better

Map<String, List<TRegionReplicaSet>> databaseAllocatedRegionGroupMap,
Map<TConsensusGroupId, TRegionReplicaSet> remainReplicasMap) {
// TODO: Implement this method
return Collections.emptyMap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throw UnsupportedOperationException will be better

Copy link
Contributor

@CRZbulabula CRZbulabula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!!!!!!

@CRZbulabula CRZbulabula merged commit 7650b47 into apache:master Apr 27, 2025
47 of 50 checks passed
@HxpSerein HxpSerein deleted the gcr_balance branch May 7, 2025 07:01
JackieTien97 pushed a commit that referenced this pull request May 14, 2025
HxpSerein added a commit to HxpSerein/iotdb that referenced this pull request Nov 7, 2025
HxpSerein added a commit that referenced this pull request Nov 7, 2025
* [remove datanode] GCR load balancing implement for removing datanode (#15282)

(cherry picked from commit 7650b47)

* [remove datanode] Fix IoTDBRemoveDataNodeNormalIT #15429

(cherry picked from commit 9537806)

* [remove datanode] Accelerate GCR load balancing implement (#15535)

(cherry picked from commit 51bad1e)

* [remove datanode] Fix ArrayIndexOutOfBoundsException in computeInitialDbLoad (#15718)

(cherry picked from commit 346ee72)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants