Make resharding operations (shard holder) idempotent by timvisee · Pull Request #8789 · qdrant/qdrant

timvisee · 2026-04-24T14:51:05Z

Fixes #8709
Extends #8522

In all resharding operations, make each individual step idempotent. If a resharding operation has been partially applied, we fall through applied steps and still apply the remaining steps.

This fixes resharding on consensus WAL replay. We've seen resharding consensus state inconsistencies in chaos testing. I have a strong signal these are happening because of broken consensus WAL replay.

This also prevents creating too many shards, losing a shard, or generally breaking the shard holder when (re)applying resharding consensus operations. Now this works with exact shard IDs (idempotent) rather than adding or subtracting (not idempotent) a shard.

I'd like to extensively test this on chaos testing before we include this in a new release. Resharding is very complicated, and it's hard to fully comprehend all changes and understand all the possible side effects.

I recommend to review per commit.

Tasks

Make sure custom sharding is also idempotent in shard holder

All Submissions:

Contributions should target the dev branch. Did you create your branch from dev?
Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

Does your submission pass tests?
Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
Have you checked your code using cargo clippy --workspace --all-features command?

Changes to Core Features:

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully ran tests with your changes locally?

Why: consensus entries may be re-applied after a crash (partial state on disk) or during raft recovery. The unchecked state-transition helpers used `debug_assert!` to require a specific starting state, so a replay would panic in debug or silently overwrite in release. How to apply: use write_optional so the state file is only touched when the in-memory state needs to change. This also avoids unnecessary fsyncs when a replay is a no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Why: on replay, `create_replica_set` would call `create_shard_dir`, which removes and recreates the existing shard directory -- wiping any shard contents that had already been migrated or written since the first apply. How to apply: check `contains_shard(shard_id)` before creating the replica set and pass `None` to `start_resharding_unchecked` when a replica set with the target shard id is already present. Also relaxes `check_start_resharding` to return `Ok` (instead of a swallowed `bad_request`) when a matching resharding state is persisted, so the caller can fall through each idempotent step. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Why: the start/finish/abort paths used ++/-- on the persisted shard number with a `debug_assert` pinning the expected starting value. On replay (e.g. after a crash between the shard holder mutation and the config save) this either panics or produces a wrong count. Since resharding always targets the last shard (auto sharding assigns contiguous ids from zero), the target count is a pure function of the shard id: `shard_id + 1` for start up, `shard_id` for finish down and abort up. Set it directly and skip the save when it already matches, so replay converges to the same value without touching the file. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Why: the early return on `resharding_state.is_none()` assumed that a missing state means the operation was fully applied. But a crash can leave the state cleared while the shard drop, key mapping removal, or shard count update are still pending. Short-circuiting then skips the reconciling work those replays need to do. How to apply: drop the early return so every step runs; each step is already individually idempotent (check_*, drop_and_remove_shard, remove_shard_from_key_mapping, the set-based shard count update). The abort_resharding down-invalidation path now reads nodes from the router regardless of its variant, so a replay over an already-rolled- back ring doesn't trip the removed debug_assert!s. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Why: `write_optional` returns `Some` whenever the shard key is present, even if the shard id we're removing is already gone. That still triggers a JSON rewrite and a cache notification on every replay of a finished finish/abort. How to apply: check that the shard id is actually in the set before returning `Some`. If the set is missing or already doesn't contain the id, return `None` so the on-disk file and in-memory data are left untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Covers the paths that must not error or panic on replay: - `check_start_resharding` when matching state is already present - `start_resharding_unchecked` preserves matching state verbatim - `finish_resharding_unchecked` when state is already cleared Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Replace a manual `match` on `Option` with `as_ref().map(...)` to satisfy clippy::manual_map. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

tellet-q

lgtm

timvisee · 2026-04-29T14:16:41Z

This now includes the test from #8839

ffuugoo

Seems Ok.

Co-authored-by: Roman Titov <[email protected]>

coderabbitai

🧹 Nitpick comments (1)

lib/collection/src/shards/shard_holder/resharding.rs (1)

997-1011: ⚡ Quick win

Make this replay test seed the ring into resharding mode too.

A real replay after a successful start_resharding has both persisted ReshardState and an already-resharding router. This test seeds only the state, so it would still pass even if a second ring.start_resharding(...) were not replay-safe.

Suggested test adjustment

         // Pre-seed matching state to simulate a replay after a successful
         // start had already persisted the resharding state.
+        holder
+            .rings
+            .get_mut(&key.shard_key)
+            .unwrap()
+            .start_resharding(key.shard_id, key.direction);
         holder
             .resharding_state
             .write(|state| {
                 *state = Some(expected_state.clone());
             })

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/shards/shard_holder/resharding.rs` around lines 997 -
1011, The test currently seeds only holder.resharding_state with expected_state
but doesn't mark the router/ring as already in resharding mode, so replay safety
of ring.start_resharding isn't exercised; after writing the resharding_state
(the block that sets *state = Some(expected_state.clone())), also put the ring
into resharding mode to mirror a real replay — e.g. invoke the ring's resharding
entrypoint (call the same method used in normal start, such as
holder.ring.start_resharding(...) or the ring's equivalent API) with arguments
that match the persisted state so the ring is already marked as resharding
before calling holder.start_resharding_unchecked(key.clone(), None). This
ensures the test validates idempotence when both persisted ReshardState and an
already-resharding router exist.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@lib/collection/src/shards/shard_holder/resharding.rs`:
- Around line 997-1011: The test currently seeds only holder.resharding_state
with expected_state but doesn't mark the router/ring as already in resharding
mode, so replay safety of ring.start_resharding isn't exercised; after writing
the resharding_state (the block that sets *state =
Some(expected_state.clone())), also put the ring into resharding mode to mirror
a real replay — e.g. invoke the ring's resharding entrypoint (call the same
method used in normal start, such as holder.ring.start_resharding(...) or the
ring's equivalent API) with arguments that match the persisted state so the ring
is already marked as resharding before calling
holder.start_resharding_unchecked(key.clone(), None). This ensures the test
validates idempotence when both persisted ReshardState and an already-resharding
router exist.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ebe56e04-22dc-4c90-9f22-168aebb8cea7

📥 Commits

Reviewing files that changed from the base of the PR and between a52e6e5 and ecae489.

📒 Files selected for processing (1)

lib/collection/src/shards/shard_holder/resharding.rs

coderabbitai

🧹 Nitpick comments (1)

lib/collection/src/shards/shard_holder/resharding.rs (1)

167-170: Make match expressions exhaustive by replacing catch-all _ arms with explicit patterns.

Lines 167–170 and 480–483 use catch-all arms when matching Option<T>. Expand these to explicitly handle Some(_) and None branches for clarity and consistency with repository guidelines on exhaustive pattern matching.

Proposed changes

         self.resharding_state.write_optional(|state| {
             let new_state =
                 ReshardState::new(uuid, direction, peer_id, shard_id, shard_key.clone());
             match state {
                 Some(existing) if *existing == new_state => None,
-                _ => Some(Some(new_state)),
+                Some(_) => Some(Some(new_state)),
+                None => Some(Some(new_state)),
             }
         })?;

         self.resharding_state.write_optional(|state| match state {
             Some(state) if state.matches(&resharding_key) => Some(None),
-            _ => None,
+            Some(_) => None,
+            None => None,
         })?;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/shards/shard_holder/resharding.rs` around lines 167 - 170,
The match on state in resharding.rs should avoid a catch-all `_` arm; update the
expression in the function containing the fragment (the match that currently
does `match state { Some(existing) if *existing == new_state => None, _ =>
Some(Some(new_state)), }`) to explicitly handle `Some(_)` and `None` (e.g.,
`Some(existing) if *existing == new_state => None, Some(_) =>
Some(Some(new_state)), None => Some(Some(new_state))`) and make the analogous
change for the other match at the second occurrence around the `lines 480–483`
so both matches are exhaustive and use explicit `Some(_)` and `None` patterns
instead of `_`.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@lib/collection/src/shards/shard_holder/resharding.rs`:
- Around line 167-170: The match on state in resharding.rs should avoid a
catch-all `_` arm; update the expression in the function containing the fragment
(the match that currently does `match state { Some(existing) if *existing ==
new_state => None, _ => Some(Some(new_state)), }`) to explicitly handle
`Some(_)` and `None` (e.g., `Some(existing) if *existing == new_state => None,
Some(_) => Some(Some(new_state)), None => Some(Some(new_state))`) and make the
analogous change for the other match at the second occurrence around the `lines
480–483` so both matches are exhaustive and use explicit `Some(_)` and `None`
patterns instead of `_`.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 52ddf8d2-a83e-43e4-bc92-2d71d62bcd88

📥 Commits

Reviewing files that changed from the base of the PR and between ecae489 and 8f105de.

📒 Files selected for processing (1)

lib/collection/src/shards/shard_holder/resharding.rs

* Make resharding state transitions idempotent on replay Why: consensus entries may be re-applied after a crash (partial state on disk) or during raft recovery. The unchecked state-transition helpers used `debug_assert!` to require a specific starting state, so a replay would panic in debug or silently overwrite in release. How to apply: use write_optional so the state file is only touched when the in-memory state needs to change. This also avoids unnecessary fsyncs when a replay is a no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Skip replica set creation on resharding start replay Why: on replay, `create_replica_set` would call `create_shard_dir`, which removes and recreates the existing shard directory -- wiping any shard contents that had already been migrated or written since the first apply. How to apply: check `contains_shard(shard_id)` before creating the replica set and pass `None` to `start_resharding_unchecked` when a replica set with the target shard id is already present. Also relaxes `check_start_resharding` to return `Ok` (instead of a swallowed `bad_request`) when a matching resharding state is persisted, so the caller can fall through each idempotent step. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Derive resharding shard count from shard id for idempotency Why: the start/finish/abort paths used ++/-- on the persisted shard number with a `debug_assert` pinning the expected starting value. On replay (e.g. after a crash between the shard holder mutation and the config save) this either panics or produces a wrong count. Since resharding always targets the last shard (auto sharding assigns contiguous ids from zero), the target count is a pure function of the shard id: `shard_id + 1` for start up, `shard_id` for finish down and abort up. Set it directly and skip the save when it already matches, so replay converges to the same value without touching the file. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Fall through resharding finish/abort steps on replay Why: the early return on `resharding_state.is_none()` assumed that a missing state means the operation was fully applied. But a crash can leave the state cleared while the shard drop, key mapping removal, or shard count update are still pending. Short-circuiting then skips the reconciling work those replays need to do. How to apply: drop the early return so every step runs; each step is already individually idempotent (check_*, drop_and_remove_shard, remove_shard_from_key_mapping, the set-based shard count update). The abort_resharding down-invalidation path now reads nodes from the router regardless of its variant, so a replay over an already-rolled- back ring doesn't trip the removed debug_assert!s. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Skip shard key mapping write when nothing to remove Why: `write_optional` returns `Some` whenever the shard key is present, even if the shard id we're removing is already gone. That still triggers a JSON rewrite and a cache notification on every replay of a finished finish/abort. How to apply: check that the shard id is actually in the set before returning `Some`. If the set is missing or already doesn't contain the id, return `None` so the on-disk file and in-memory data are left untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Add tests for resharding replay idempotency Covers the paths that must not error or panic on replay: - `check_start_resharding` when matching state is already present - `start_resharding_unchecked` preserves matching state verbatim - `finish_resharding_unchecked` when state is already cleared Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * Simplify finish_resharding_unchecked closure Replace a manual `match` on `Option` with `as_ref().map(...)` to satisfy clippy::manual_map. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * test: add a test for resharding replay (#8839) * [ai] add a debug_assert on shard_number (#8846) * Update lib/collection/src/shards/shard_holder/resharding.rs Co-authored-by: Roman Titov <[email protected]> * Reformat --------- Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]> Co-authored-by: tellet-q <[email protected]> Co-authored-by: Roman Titov <[email protected]>

timvisee and others added 7 commits April 24, 2026 16:50

Simplify finish_resharding_unchecked closure

83bddc5

Replace a manual `match` on `Option` with `as_ref().map(...)` to satisfy clippy::manual_map. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

timvisee added the release:1.18.0 label Apr 24, 2026

github-actions Bot mentioned this pull request Apr 24, 2026

Flaky test hnsw_discover_test::filtered_hnsw_discover_precision #8704

Open

timvisee assigned ffuugoo Apr 24, 2026

timvisee requested review from agourlay, coszio, ffuugoo and generall April 24, 2026 15:13

timvisee self-assigned this Apr 24, 2026

timvisee requested a review from tellet-q April 24, 2026 15:16

tellet-q approved these changes Apr 27, 2026

View reviewed changes

timvisee marked this pull request as ready for review April 29, 2026 13:23

qdrant deleted a comment from coderabbitai Bot Apr 29, 2026

test: add a test for resharding replay (#8839)

5bfca31

This comment was marked as resolved.

Sign in to view

qdrant deleted a comment from coderabbitai Bot Apr 29, 2026

[ai] add a debug_assert on shard_number (#8846)

a52e6e5

This comment was marked as resolved.

Sign in to view

ffuugoo approved these changes Apr 30, 2026

View reviewed changes

Comment thread lib/collection/src/shards/shard_holder/resharding.rs Outdated

timvisee and others added 2 commits April 30, 2026 13:47

Update lib/collection/src/shards/shard_holder/resharding.rs

ecae489

Co-authored-by: Roman Titov <[email protected]>

Reformat

8f105de

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

timvisee merged commit 4702c5d into dev Apr 30, 2026
15 checks passed

timvisee deleted the resharding-shard-holder-idempotent branch April 30, 2026 12:20

This was referenced Apr 30, 2026

Flaky consensus_tests/test_resharding_extras.py::test_reshard_down_preconditions_custom_sharding #8854

Closed

Make shard holder and count changes atomic in resharding #8709

Closed

timvisee mentioned this pull request May 7, 2026

Fix inconsistent resharding state, SetShardState/AbortTransfer idempotency #8917

Merged

9 tasks

timvisee mentioned this pull request May 8, 2026

Bump version to 1.18.0 #8959

Merged

coderabbitai Bot mentioned this pull request May 28, 2026

Fix abort transfer resharding idempotency #9215

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make resharding operations (shard holder) idempotent#8789

Make resharding operations (shard holder) idempotent#8789
timvisee merged 11 commits into
devfrom
resharding-shard-holder-idempotent

timvisee commented Apr 24, 2026 •

edited

Loading

Uh oh!

tellet-q left a comment

Uh oh!

timvisee commented Apr 29, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

This comment was marked as resolved.

Uh oh!

ffuugoo left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

timvisee commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tasks

All Submissions:

New Feature Submissions:

Changes to Core Features:

Uh oh!

tellet-q left a comment

Choose a reason for hiding this comment

Uh oh!

timvisee commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

This comment was marked as resolved.

Uh oh!

ffuugoo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

timvisee commented Apr 24, 2026 •

edited

Loading

timvisee commented Apr 29, 2026 •

edited

Loading