Allow peer to bootstrap with used URI if empty#8301
Conversation
Prompt: I have a Qdrant cluster in distributed mode with some nodes registered in consensus. If I bootstrap a new node with an URL that is already used it is rejected and an error is returned. I would like to change this behavior and allow this to happen. This would effectively replace a peer because internally we'd drop the existing peer first, and then we'd add the new peer so we can reuse the same peer URL. We must still reject bootstrapping with the same URL if the peer that used the URL before us still has any shards on it.
Prompt: Add two tests to assert the new behavior. The first test should: - create a cluster - create a collection - bootstrap a new peer - kill and delete the local data for this peer without removing it from consensus - bootstrap a new peer but reuse the URI of the peer we just killed to rejoin - bootstrapping is expected to succeed The second test should: - create a cluster - create a collection - bootstrap a new peer but reuse the URI of the last node - bootstrapping is expected to fail because the node being replaced has data on it
94ca810 to
10151e8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/consensus_tests/test_cluster_rejoin.py (1)
415-518: Cover both URI-source branches in these new regressions.Unlike the existing same-URI regression earlier in this file, these two tests are not parameterized over
uris_in_env, so they exercise only oneadd_peer_to_knownpath. Running the new replace/reject cases in both modes would give much better coverage for the new behavior.🧪 Minimal coverage update
+@pytest.mark.parametrize("uris_in_env", [False, True]) -def test_replace_peer_without_shards_same_uri(tmp_path: pathlib.Path): +def test_replace_peer_without_shards_same_uri(tmp_path: pathlib.Path, uris_in_env): @@ - peer_api_uris, peer_dirs, bootstrap_uri = start_cluster(tmp_path, N_PEERS, port_seed=10000) + peer_api_uris, _peer_dirs, bootstrap_uri = start_cluster( + tmp_path, + N_PEERS, + port_seed=10000, + uris_in_env=uris_in_env, + ) @@ - extra_peer_api_uri = start_peer(extra_peer_dir, "peer_extra.log", bootstrap_uri, port=extra_peer_port) + extra_peer_api_uri = start_peer( + extra_peer_dir, + "peer_extra.log", + bootstrap_uri, + port=extra_peer_port, + uris_in_env=uris_in_env, + ) @@ - new_extra_peer_api_uri = start_peer(new_extra_peer_dir, "peer_extra_replaced.log", bootstrap_uri, port=extra_peer_port) + new_extra_peer_api_uri = start_peer( + new_extra_peer_dir, + "peer_extra_replaced.log", + bootstrap_uri, + port=extra_peer_port, + uris_in_env=uris_in_env, + ) +@pytest.mark.parametrize("uris_in_env", [False, True]) -def test_reject_replace_peer_with_shards_same_uri(tmp_path: pathlib.Path): +def test_reject_replace_peer_with_shards_same_uri(tmp_path: pathlib.Path, uris_in_env): @@ - peer_api_uris, peer_dirs, bootstrap_uri = start_cluster(tmp_path, N_PEERS, port_seed=10000) + peer_api_uris, peer_dirs, bootstrap_uri = start_cluster( + tmp_path, + N_PEERS, + port_seed=10000, + uris_in_env=uris_in_env, + ) @@ - new_url = start_peer(peer_dirs[-1], "peer_replaced_rejected.log", bootstrap_uri, port=broken_peer_port) + start_peer( + peer_dirs[-1], + "peer_replaced_rejected.log", + bootstrap_uri, + port=broken_peer_port, + uris_in_env=uris_in_env, + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/consensus_tests/test_cluster_rejoin.py` around lines 415 - 518, Add coverage for both URI-source branches by running each test case with uris_in_env set to both True and False; wrap the existing logic in test_replace_peer_without_shards_same_uri and test_reject_replace_peer_with_shards_same_uri in a small loop or parameterization that toggles the environment mode and reboots the cluster for each mode, so both add_peer_to_known code paths are exercised when calling start_cluster/start_peer; ensure you restore/clear the environment between iterations and reference the test functions (test_replace_peer_without_shards_same_uri, test_reject_replace_peer_with_shards_same_uri) and helper functions (start_cluster, start_peer, add_peer_to_known) when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/consensus_tests/test_cluster_rejoin.py`:
- Around line 415-518: Add coverage for both URI-source branches by running each
test case with uris_in_env set to both True and False; wrap the existing logic
in test_replace_peer_without_shards_same_uri and
test_reject_replace_peer_with_shards_same_uri in a small loop or
parameterization that toggles the environment mode and reboots the cluster for
each mode, so both add_peer_to_known code paths are exercised when calling
start_cluster/start_peer; ensure you restore/clear the environment between
iterations and reference the test functions
(test_replace_peer_without_shards_same_uri,
test_reject_replace_peer_with_shards_same_uri) and helper functions
(start_cluster, start_peer, add_peer_to_known) when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ec8c8b22-44d9-4c79-9ad5-5cc787fbe8ee
📒 Files selected for processing (3)
lib/storage/src/content_manager/consensus_manager.rssrc/tonic/api/raft_api.rstests/consensus_tests/test_cluster_rejoin.py
| let existing_peer_id = self | ||
| .consensus_state | ||
| .peer_address_by_id() | ||
| .into_iter() | ||
| .find(|(id, peer_uri)| *peer_uri == uri && *id != peer.id) | ||
| .map(|(id, _)| id); | ||
|
|
||
| if let Some(old_peer_id) = existing_peer_id { | ||
| let consensus_state = self.consensus_state.clone(); | ||
| let has_shards = tokio::task::spawn_blocking(move || { | ||
| // Must use spawn_blocking for peer_has_shards | ||
| consensus_state.peer_has_shards(old_peer_id) | ||
| }) | ||
| .await | ||
| .map_err(|err| Status::internal(format!("Failed to check shards: {err}")))?; | ||
|
|
||
| if has_shards { | ||
| return Err(Status::failed_precondition(format!( | ||
| "peer URI {uri} already used by peer {old_peer_id} which still has shards, \ | ||
| remove its shards first or use a different URI", | ||
| ))); | ||
| } | ||
|
|
||
| log::info!( | ||
| "Peer {} is replacing peer {old_peer_id} which has no shards ({uri})", | ||
| peer.id, | ||
| ); | ||
|
|
||
| self.consensus_state | ||
| .propose_consensus_op_with_await(ConsensusOperations::RemovePeer(old_peer_id), None) | ||
| .await | ||
| .map_err(|err| { | ||
| Status::internal(format!("Failed to remove old peer {old_peer_id}: {err}")) | ||
| })?; | ||
| } |
There was a problem hiding this comment.
Move the replace-if-empty guard into a single consensus-side mutation.
This precondition is checked on the API side and then enforced later via separate RemovePeer and subsequent AddPeer proposals. Because identical in-flight ops are coalesced by propose_consensus_op_with_await, two concurrent joiners for the same URI can both ride the same RemovePeer(old_peer_id) and then each send their own AddPeer; shard ownership can also change after peer_has_shards returns but before the removal is committed. Please linearize the URI + shard-empty check on the Raft side (single replace op or apply-time recheck), otherwise correctness depends on timing.
* [ai] Allow peer to bootstrap with existing URI if peer has no shards Prompt: I have a Qdrant cluster in distributed mode with some nodes registered in consensus. If I bootstrap a new node with an URL that is already used it is rejected and an error is returned. I would like to change this behavior and allow this to happen. This would effectively replace a peer because internally we'd drop the existing peer first, and then we'd add the new peer so we can reuse the same peer URL. We must still reject bootstrapping with the same URL if the peer that used the URL before us still has any shards on it. * [ai] Add test to assert new behavior, can rejoin if empty Prompt: Add two tests to assert the new behavior. The first test should: - create a cluster - create a collection - bootstrap a new peer - kill and delete the local data for this peer without removing it from consensus - bootstrap a new peer but reuse the URI of the peer we just killed to rejoin - bootstrapping is expected to succeed The second test should: - create a cluster - create a collection - bootstrap a new peer but reuse the URI of the last node - bootstrapping is expected to fail because the node being replaced has data on it * [ai] Attempt to fix new tests * Reformat * [ai] Fix deadlock when rejoining with same peer URI * [ai] Add test to ensure existing peer stops consensus on replace
* [ai] Allow peer to bootstrap with existing URI if peer has no shards Prompt: I have a Qdrant cluster in distributed mode with some nodes registered in consensus. If I bootstrap a new node with an URL that is already used it is rejected and an error is returned. I would like to change this behavior and allow this to happen. This would effectively replace a peer because internally we'd drop the existing peer first, and then we'd add the new peer so we can reuse the same peer URL. We must still reject bootstrapping with the same URL if the peer that used the URL before us still has any shards on it. * [ai] Add test to assert new behavior, can rejoin if empty Prompt: Add two tests to assert the new behavior. The first test should: - create a cluster - create a collection - bootstrap a new peer - kill and delete the local data for this peer without removing it from consensus - bootstrap a new peer but reuse the URI of the peer we just killed to rejoin - bootstrapping is expected to succeed The second test should: - create a cluster - create a collection - bootstrap a new peer but reuse the URI of the last node - bootstrapping is expected to fail because the node being replaced has data on it * [ai] Attempt to fix new tests * Reformat * [ai] Fix deadlock when rejoining with same peer URI * [ai] Add test to ensure existing peer stops consensus on replace
Implemented by AI.
Allow a new peer to bootstrap with a used URI as long as no shards belong to that URI. This allows peers to rejoin the cluster if it somehow failed to de-register itself from consensus. This normally isn't easily possible if storage was already wiped.
Note this only allows a peer to rejoin if it currently holds no shards. If it does have shards, bootstrapping is aborted. It's important not to lose any shards.
Specifically rejoining does:
Some tests are included to assert behavior.
All Submissions:
devbranch. Did you create your branch fromdev?New Feature Submissions:
cargo +nightly fmt --allcommand prior to submission?cargo clippy --workspace --all-featurescommand?Changes to Core Features: