Skip to content

Plan: in-place morph between Immutable and Mmap payload index variants#9114

Open
qdrant-cloud-bot wants to merge 1 commit into
devfrom
plan/payload-index-on-disk-morph
Open

Plan: in-place morph between Immutable and Mmap payload index variants#9114
qdrant-cloud-bot wants to merge 1 commit into
devfrom
plan/payload-index-on-disk-morph

Conversation

@qdrant-cloud-bot

Copy link
Copy Markdown
Contributor

Summary

Proposal-only PR. Adds a planning document under docs/plans/ describing how to skip the full payload-index rebuild when only the on_disk flag changes.

  • Current behavior: any change to PayloadFieldSchema (including just on_disk) hits drop_index_if_incompatiblewipe() → full rebuild from payload storage.
  • Observation: for mmap-backed segments the on-disk format is identical between on_disk: false (Immutable wrapper, derived in-RAM structures) and on_disk: true (Mmap/Storage wrapper, no derived structures). The existing low_memory_mode path already swaps between them at load time without rebuilding.
  • Proposal: add an Immutable ⇄ Mmap in-place morph and a SchemaTransition classifier so set_indexed takes the fast path when only on_disk differs.

See docs/plans/payload-index-on-disk-morph.md for the full plan: morph semantics, per-family implementation outline, classifier shape, wiring, phased rollout (numeric → map → geo → text), risks, and test plan.

Notes

  • docs/plans/ is gitignored — the file was force-added. Will be removed before merging the eventual implementation PR(s).
  • No code changes, no public API or persisted format changes proposed in this PR.

Test plan

  • N/A — documentation-only PR; intended for design review.

Made with Cursor

Proposal for avoiding a full payload index rebuild when only the
`on_disk` flag changes. The on-disk format is identical between the two
variants — only the in-memory wrapper differs — so an in-place swap is
safe and orders of magnitude cheaper than the current drop + rebuild.

This file lives under `docs/plans/` which is gitignored; it is added
here for review only and will be removed before merging the
implementation PR.

Co-authored-by: Cursor <[email protected]>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
docs/plans/payload-index-on-disk-morph.md (3)

54-60: 💤 Low value

Clarify the cost-benefit tradeoff for Mmap → Immutable morphing.

While the document correctly states this is cheaper than a full rebuild, the "one scan of the existing mmap files to populate derived structures" (line 57) could still be expensive for large indices. The claim of "Orders of magnitude cheaper" (line 60) depends on:

  • Size of the index vs size of payload storage
  • Complexity of derived structure construction vs full rebuild

Consider adding:

  • Rough quantitative guidance on when this optimization provides meaningful benefit (e.g., index size relative to payload storage)
  • Expected wall-clock time ranges for typical index sizes
  • Threshold guidance for when the scan cost might still be prohibitive

This will help implementers decide whether to add heuristics that skip the morph and fall through to rebuild for certain size ranges.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/payload-index-on-disk-morph.md` around lines 54 - 60, Clarify the
cost-benefit tradeoff for the "Mmap → Immutable" morph by expanding the text
around the Mmap → Immutable section to include quantitative guidance: state
expected wall-clock ranges (e.g., seconds/minutes/hours) for small/medium/large
index sizes, describe cost factors (index file size vs payload storage size and
derived-structure complexity), and add a recommended threshold heuristic (e.g.,
index size or scan-time estimate beyond which to prefer full rebuild) and
mention any warm-up options like madvise(WILLNEED)/populate() and the
ImmutableX::open_mmap(inner) scan step so implementers can use these numbers to
decide whether to morph or rebuild.

223-223: ⚡ Quick win

Clarify crash-recovery analysis for mid-morph crashes.

Line 223 states: "morph never modifies index files, only in-memory wrappers, and save_config() is the last step. On restart, the stored config is the pre-morph config and the files match it — fully consistent."

However, the flow in lines 166-168 shows:

if self.try_morph_on_disk(field, new_on_disk)? {
    self.update_config_only(field, &payload_schema)?;
    return Ok(());
}

If try_morph_on_disk succeeds (in-memory structures are now morphed) but the system crashes before or during update_config_onlysave_config(), the on-restart state is:

  • Index files: unchanged (correct)
  • Persisted config: old schema (pre-morph)
  • In-memory structures: don't exist (restart loads from disk)

On restart, the segment loads with the old config and builds in-memory structures according to that config, which is correct. So the analysis is sound, but the explanation could be clearer:

Suggested rewording:
"Morph never modifies index files, only in-memory wrappers. save_config() is the commit point. If we crash after morph succeeds but before save_config() persists the new schema, on restart the old schema is loaded and in-memory structures are rebuilt according to the old schema from the unchanged files — fully consistent. The morph will be retried on next WAL replay of the CreateIndex operation."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/payload-index-on-disk-morph.md` at line 223, Reword the
crash-recovery paragraph to explicitly state that try_morph_on_disk only changes
in-memory wrappers, that update_config_only (which calls save_config()) is the
durable commit point, and that if a crash occurs after try_morph_on_disk
succeeds but before update_config_only/save_config persists the new schema, on
restart the system will load the old persisted config and rebuild in-memory
structures from the unchanged index files; also add that the CreateIndex WAL
entry will cause the morph to be retried during replay. Include references to
try_morph_on_disk, update_config_only, save_config, and the CreateIndex WAL
replay to make the flow clear.

84-98: 💤 Low value

Clarify pseudo-code for morph_on_disk implementation.

The sketch has a few gaps that reduce clarity for design review:

  1. Line 86: "construct a tiny empty variant or use take()" is underspecified. Rust enums don't have default empty variants. Options include:

    • Using Option<Self> and Option::take()
    • Defining a minimal variant like Self::Placeholder (but this pollutes the enum)
    • Using a temporary valid variant (but which one?)
  2. Lines 88-94: The match arms cover Immutable ↔ Mmap transitions and the already_matching case, but the Mutable (Gridstore) case mentioned in the comment (lines 92-94) isn't shown explicitly. Should be:

    (Self::Mutable(m), _) => {
        *self = Self::Mutable(m);
        return Ok(false);
    }
  3. Line 101: Failure handling says "restore the original variant" but the current value was consumed by std::mem::replace. The parenthetical suggests wrapping the operation in a Result-returning closure, but this isn't reflected in the sketch.

Consider refining the pseudo-code to show a complete, compilable pattern (even if simplified) so reviewers can validate the approach end-to-end.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/payload-index-on-disk-morph.md` around lines 84 - 98, The
pseudo-code for NumericIndexInner::morph_on_disk is incomplete: replace the
placeholder approach with a safe take pattern (e.g., wrap self in Option<Self>
and use Option::take or std::mem::take) so you can move out the current variant
without inventing a dummy variant; add an explicit arm for the mutable case
(match (Self::Mutable(m), _) => { *self = Self::Mutable(m); return Ok(false); })
to restore and report not-applicable; and ensure error-prone conversions (like
ImmutableNumericIndex::open_mmap and imm.into_inner_mmap()) are performed inside
a Result-returning block so on Err you restore the original variant into *self
before returning the error, thereby avoiding lost state after std::mem::replace.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/plans/payload-index-on-disk-morph.md`:
- Line 182: Implement a clear partial-failure rollback in try_morph_on_disk: as
you iterate self.field_indexes[field] and call FieldIndex::morph_on_disk, record
successful indices (e.g., morphed_indices), and if any call returns Ok(false) or
Err(e) perform rollback by iterating morphed_indices in reverse and calling
morph_on_disk with the opposite direction (!new_on_disk) for each; if any
reverse morph fails, propagate that error (or at minimum log and return Err) so
the caller knows rollback didn't fully succeed; ensure you reference
try_morph_on_disk, FieldIndex, morph_on_disk and morphed_indices in the
implementation and return Ok(false) when a non-applicable variant is encountered
after successful rollbacks.
- Around line 148-184: Add a "Concurrency & Atomicity" subsection describing the
locking and failure semantics around morphing: state the current synchronization
points (e.g., whether StructPayloadIndex and SegmentEntry already hold any locks
when set_indexed() runs), document that try_morph_on_disk() iterates
field_indexes[] and calls morph_on_disk() and that update_config_only() calls
save_config() after recomputing types from get_full_index_type(), then specify
the desired atomicity (treat save_config() as the commit point or roll back
in-memory morph on save failure) and required locking (e.g., acquire a write
lock for morph operations on a field while readers use a read lock, serialize
concurrent set_indexed() calls per field/segment) and the expected behavior for
concurrent readers during Mmap→Immutable scans (block, snapshot, or allow stale
reads) so reviewers know whether existing locks suffice or new synchronization
is needed.

---

Nitpick comments:
In `@docs/plans/payload-index-on-disk-morph.md`:
- Around line 54-60: Clarify the cost-benefit tradeoff for the "Mmap →
Immutable" morph by expanding the text around the Mmap → Immutable section to
include quantitative guidance: state expected wall-clock ranges (e.g.,
seconds/minutes/hours) for small/medium/large index sizes, describe cost factors
(index file size vs payload storage size and derived-structure complexity), and
add a recommended threshold heuristic (e.g., index size or scan-time estimate
beyond which to prefer full rebuild) and mention any warm-up options like
madvise(WILLNEED)/populate() and the ImmutableX::open_mmap(inner) scan step so
implementers can use these numbers to decide whether to morph or rebuild.
- Line 223: Reword the crash-recovery paragraph to explicitly state that
try_morph_on_disk only changes in-memory wrappers, that update_config_only
(which calls save_config()) is the durable commit point, and that if a crash
occurs after try_morph_on_disk succeeds but before
update_config_only/save_config persists the new schema, on restart the system
will load the old persisted config and rebuild in-memory structures from the
unchanged index files; also add that the CreateIndex WAL entry will cause the
morph to be retried during replay. Include references to try_morph_on_disk,
update_config_only, save_config, and the CreateIndex WAL replay to make the flow
clear.
- Around line 84-98: The pseudo-code for NumericIndexInner::morph_on_disk is
incomplete: replace the placeholder approach with a safe take pattern (e.g.,
wrap self in Option<Self> and use Option::take or std::mem::take) so you can
move out the current variant without inventing a dummy variant; add an explicit
arm for the mutable case (match (Self::Mutable(m), _) => { *self =
Self::Mutable(m); return Ok(false); }) to restore and report not-applicable; and
ensure error-prone conversions (like ImmutableNumericIndex::open_mmap and
imm.into_inner_mmap()) are performed inside a Result-returning block so on Err
you restore the original variant into *self before returning the error, thereby
avoiding lost state after std::mem::replace.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 822d2ab4-8118-4e23-8871-e9e22489df27

📥 Commits

Reviewing files that changed from the base of the PR and between 97c612c and cf42604.

📒 Files selected for processing (1)
  • docs/plans/payload-index-on-disk-morph.md

Comment on lines +148 to +184

`lib/segment/src/index/struct_payload_index/payload_index.rs:61`:

```rust
fn set_indexed(&mut self, field, payload_schema, hw_counter) -> OperationResult<()> {
let payload_schema = payload_schema.into();

if let Some(prev) = self.config.indices.get(field).cloned() {
match classify(&prev.schema, &payload_schema) {
SchemaTransition::Identical => return Ok(()),
SchemaTransition::OnlyOnDiskFlipped { new_on_disk } => {
// Gridstore segments don't honor on_disk at storage layer.
// Update config so the next non-appendable segment built from
// this one (via optimizer) picks it up; nothing to morph now.
if matches!(self.storage_type, StorageType::GridstoreAppendable) {
self.update_config_only(field, &payload_schema)?;
return Ok(());
}
if self.try_morph_on_disk(field, new_on_disk)? {
self.update_config_only(field, &payload_schema)?;
return Ok(());
}
// morph failed — fall through to drop+rebuild
}
SchemaTransition::Incompatible => {}
}
}

// Existing path: drop_index_if_incompatible -> build_index -> apply_index
self.drop_index_if_incompatible(field, &payload_schema)?;
// ... unchanged
}
```

`try_morph_on_disk(field, new_on_disk)` walks `self.field_indexes[field]` and calls `morph_on_disk` on each entry. If any returns `Ok(false)` because the variant doesn't support morphing, restore the morphed ones and fall through to the legacy path.

`update_config_only` updates `self.config.indices[field]` to the new schema and re-computes `types` from the (now morphed) `FieldIndex`es' `get_full_index_type()` so the persisted `StorageType::Mmap { is_on_disk }` flag is up to date, then calls `save_config()`.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add discussion of concurrency and atomicity guarantees.

The plan doesn't address critical concurrency and atomicity concerns:

Atomicity:

  • Line 167: update_config_only is called after morph succeeds, which calls save_config() (line 184).
  • If save_config() fails after in-memory morph succeeds, the in-memory state is morphed but the persisted config still has the old schema.
  • On restart, the segment loads with the old config but the in-memory structures don't match.
  • Mitigation: Should save_config() be the commit point? If it fails, should the morph be rolled back?

Concurrency:

  • What if multiple threads call set_indexed concurrently on the same or different fields?
  • What if queries are reading the index during the morph operation (especially Mmap → Immutable which scans)?
  • Are there existing locks at the StructPayloadIndex level that provide mutual exclusion?
  • Should the morph be behind a write lock while queries use read locks?

Add a "Concurrency & Atomicity" subsection to the risks section (or a new section) addressing:

  • Existing locking/synchronization guarantees at the StructPayloadIndex and SegmentEntry levels
  • Whether morph operations are naturally serialized by existing locks or need new synchronization
  • The atomicity contract between in-memory morph and save_config()
  • Impact on concurrent readers during morph (especially the scan-heavy Mmap → Immutable)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/payload-index-on-disk-morph.md` around lines 148 - 184, Add a
"Concurrency & Atomicity" subsection describing the locking and failure
semantics around morphing: state the current synchronization points (e.g.,
whether StructPayloadIndex and SegmentEntry already hold any locks when
set_indexed() runs), document that try_morph_on_disk() iterates field_indexes[]
and calls morph_on_disk() and that update_config_only() calls save_config()
after recomputing types from get_full_index_type(), then specify the desired
atomicity (treat save_config() as the commit point or roll back in-memory morph
on save failure) and required locking (e.g., acquire a write lock for morph
operations on a field while readers use a read lock, serialize concurrent
set_indexed() calls per field/segment) and the expected behavior for concurrent
readers during Mmap→Immutable scans (block, snapshot, or allow stale reads) so
reviewers know whether existing locks suffice or new synchronization is needed.

}
```

`try_morph_on_disk(field, new_on_disk)` walks `self.field_indexes[field]` and calls `morph_on_disk` on each entry. If any returns `Ok(false)` because the variant doesn't support morphing, restore the morphed ones and fall through to the legacy path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Specify the partial-failure rollback mechanism for try_morph_on_disk.

Line 182 describes try_morph_on_disk as walking multiple FieldIndex entries and morphing each, with a note that "if any returns Ok(false)... restore the morphed ones" (implied by line 182's description and line 222's risk mitigation). However, the rollback mechanism isn't specified:

  • How do you track which entries were successfully morphed?
  • Do you morph them back by calling morph_on_disk with the opposite direction? If so, what if the reverse morph fails?
  • Should the rollback happen in reverse order of the forward morph?

Consider adding a subsection or note clarifying:

fn try_morph_on_disk(&mut self, field: &str, new_on_disk: bool) -> OperationResult<bool> {
    let mut morphed_indices = Vec::new();
    for (idx, field_index) in self.field_indexes[field].iter_mut().enumerate() {
        match field_index.morph_on_disk(new_on_disk) {
            Ok(true) => morphed_indices.push(idx),
            Ok(false) => {
                // Not applicable; rollback and return false
                for idx in morphed_indices.iter().rev() {
                    self.field_indexes[field][*idx].morph_on_disk(!new_on_disk)?;
                }
                return Ok(false);
            }
            Err(e) => {
                // Morph failed; rollback and propagate error
                for idx in morphed_indices.iter().rev() {
                    self.field_indexes[field][*idx].morph_on_disk(!new_on_disk)?;
                }
                return Err(e);
            }
        }
    }
    Ok(true)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/plans/payload-index-on-disk-morph.md` at line 182, Implement a clear
partial-failure rollback in try_morph_on_disk: as you iterate
self.field_indexes[field] and call FieldIndex::morph_on_disk, record successful
indices (e.g., morphed_indices), and if any call returns Ok(false) or Err(e)
perform rollback by iterating morphed_indices in reverse and calling
morph_on_disk with the opposite direction (!new_on_disk) for each; if any
reverse morph fails, propagate that error (or at minimum log and return Err) so
the caller knows rollback didn't fully succeed; ensure you reference
try_morph_on_disk, FieldIndex, morph_on_disk and morphed_indices in the
implementation and return Ok(false) when a non-applicable variant is encountered
after successful rollbacks.

@qdrant qdrant deleted a comment from coderabbitai Bot May 21, 2026
@dancixx dancixx self-assigned this May 22, 2026
@dancixx dancixx marked this pull request as draft May 22, 2026 14:57
@dancixx dancixx marked this pull request as ready for review May 22, 2026 14:57
@dancixx dancixx mentioned this pull request May 22, 2026
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants