Actively migrate away from RocksDB based ID tracker by timvisee · Pull Request #6579 · qdrant/qdrant

timvisee · 2025-05-22T08:53:50Z

Actively migrate away from ID trackers that still use RocksDB. On segment load, such a tracker is replaced with our new mutable ID tracker.

The RocksDB ID tracker has been disabled for a long time. However, very old deployments with a static collection may still use it. Until now, migrating it required an optimization run on all segments. With this change, we actively migrate it on segment load.

This is part of a bigger effort to fully remove RocksDB from our code base.

The migration is behind a runtime feature flag (migrate_rocksdb_id_tracker). We'll likely enable it by default in one of the upcoming releases.

Tasks

Test

All Submissions:

Contributions should target the dev branch. Did you create your branch from dev?
Have you followed the guidelines in our Contributing document?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

Does your submission pass tests?
Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
Have you checked your code using cargo clippy --all --all-features command?

timvisee · 2025-05-22T08:54:49Z

+            let id_tracker = create_rocksdb_id_tracker(db_builder.require()?)?;
+
+            // Actively migrate RocksDB based ID tracker into mutable ID tracker
+            if feature_flags().migrate_rocksdb_id_tracker {
+                let id_tracker = migrate_rocksdb_id_tracker_to_mutable(id_tracker, segment_path)?;
+                return Ok(sp(IdTrackerEnum::MutableIdTracker(id_tracker)));
+            }
+
+            return Ok(sp(IdTrackerEnum::RocksDbIdTracker(id_tracker)));


Note: this is the only place left where we load the RocksDB based ID tracker. It is loaded if it still exists on disk.

Since this is the only call site, we can migrate the ID tracker here.

agourlay · 2025-05-23T08:10:44Z

+
+    // Copy all mappings into it
+    for (external_id, internal_id) in old_id_tracker.iter_from(None) {
+        let version = old_id_tracker.internal_version(internal_id).unwrap_or(0);


Why are we creating a synthetic version here?
If the version is missing, I believe it means the point is not well formed.

Good question.

We can do two things here:

1. ignore the point not having a version

2. keep the point with the lowest possible version, so that a higher point version will have priority

I picked the second approach. If there's a newer version of such a point in another segment, it is used instead.

Mappings are always flushed before versions. If the point was just created and we abort flushing halfway through, it may not have a version yet. Our WAL replay will then update this very point in-place and give it the correct version.

If we ignored (deleted) the point without a version, our WAL replay would still put it back for us, but in a different place. So that would work as well.

If the point was just deleted we should not be able to get into this situation, because the mapping is dropped/flushed first.

Please correct me if you think my logic is wrong here. This is always quite convoluted. Thoughts?
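The fallback discussed in this thread can be illustrated with a minimal sketch. It assumes simplified stand-in types (`OldTracker`, plain integer IDs and versions) and hypothetical helper names (`migrate_versions`, `winning_version`); none of these are the actual Qdrant types or functions. It only shows that a mapping without a version is copied with version 0, and that a higher version of the same point in another segment takes priority.

```rust
use std::collections::HashMap;

// Simplified stand-ins, assumed for illustration only.
type ExternalId = u64;
type InternalId = u32;
type Version = u64;

/// Hypothetical model of the old tracker: mappings plus optional versions.
struct OldTracker {
    mappings: Vec<(ExternalId, InternalId)>,
    versions: HashMap<InternalId, Version>,
}

/// Copy every mapping. A missing version falls back to 0, the lowest possible.
fn migrate_versions(old: &OldTracker) -> HashMap<ExternalId, Version> {
    old.mappings
        .iter()
        .map(|&(external_id, internal_id)| {
            let version = old.versions.get(&internal_id).copied().unwrap_or(0);
            (external_id, version)
        })
        .collect()
}

/// Across segments, the copy of a point with the highest version wins.
fn winning_version(
    segments: &[HashMap<ExternalId, Version>],
    id: ExternalId,
) -> Option<Version> {
    segments.iter().filter_map(|s| s.get(&id).copied()).max()
}
```

With this model, a point that was flushed without a version never shadows a copy that carries a real version elsewhere.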

agourlay · 2025-05-23T08:44:00Z

+
+            // Actively migrate RocksDB based ID tracker into mutable ID tracker
+            if common::flags::feature_flags().migrate_rocksdb_id_tracker {
+                let id_tracker = migrate_rocksdb_id_tracker_to_mutable(id_tracker, segment_path)?;


What happens if we fail here?

I assume the segment won't be loaded at all instead of keeping the current RocksDB ID tracker.
Is that what we want?
Can this cause a crash loop on restart?

Correct. The collection would fail to load.

The goal here is to be sure we don't leave anything RocksDB behind.

I'm not sure that simply chugging along on failure is better. If we only log the problem, I'm not sure anyone would ever see it. That would become a major problem if we drop the RocksDB dependency in the next release.

We could pick a phased approach instead:

now: if migration fails, still load old tracker and print a warning

in the next minor release: force migrate and crash if it fails

in the minor release after: fully drop RocksDB dependency

It would prolong the whole process of removing RocksDB though, which I'm not a fan of.
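For reference, the first phase of that alternative (fall back to the old tracker and warn) could look roughly like the sketch below. This is not what the PR implements. The `Tracker` enum and `load_with_fallback` are hypothetical stand-ins, and the migration step is passed in as a closure so the example stays self-contained.

```rust
/// Hypothetical stand-in for the loaded tracker variants.
#[derive(Debug, PartialEq)]
enum Tracker {
    Mutable(Vec<u64>),
    Legacy(Vec<u64>),
}

/// Try to migrate. On failure, warn and keep the legacy tracker instead of
/// aborting the load.
fn load_with_fallback(
    legacy_ids: Vec<u64>,
    migrate: impl FnOnce(&[u64]) -> Result<Vec<u64>, String>,
) -> Tracker {
    match migrate(&legacy_ids) {
        Ok(ids) => Tracker::Mutable(ids),
        Err(err) => {
            // Assumed logging; a real implementation would use its own logger.
            eprintln!("warning: ID tracker migration failed, keeping old tracker: {err}");
            Tracker::Legacy(legacy_ids)
        }
    }
}
```

The trade-off raised in the thread still applies: a warning like this is easy to miss, which is why the PR prefers failing the load.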

Alternatively, we can force optimization here. If migration fails, we keep the RocksDB index and force the optimizer to pick up all not-yet-migrated segments, which would also move them away from the RocksDB ID tracker. This is less reliable though, and it feels like a hack.

Thoughts?

Also possible: we can keep the feature flag and enable it by default. It would still crash on failure like you describe, but it would allow people to explicitly disable the flag to bypass the forced migration, if that is ever needed.

I am OK with migration failure in this case, assuming we can always start the previous version and it won't corrupt the storage in the middle of the process.

To improve it a bit, I've added additional cleanup in 9e2313a.

If migrating to the mutable ID tracker fails, it now cleans up the files it created as part of the migration.
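A rough sketch of that cleanup pattern is shown below. The file names in `NEW_FILES` and the `migrate_with_cleanup` helper are assumptions for illustration and do not reflect the actual on-disk layout or the code in 9e2313a. The idea is only that a failed migration removes whatever new files it already wrote, so the old tracker on disk stays the single source of truth.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical file names, not the real mutable ID tracker layout.
const NEW_FILES: [&str; 2] = ["mappings.dat", "versions.dat"];

/// Run the migration step. If it fails, remove any new files it left behind
/// and return the original error.
fn migrate_with_cleanup(
    dir: &Path,
    write_new_tracker: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<()> {
    let result = write_new_tracker(dir);
    if result.is_err() {
        for name in NEW_FILES {
            let path = dir.join(name);
            match fs::remove_file(&path) {
                Ok(()) => {}
                // The file was never created, nothing to clean up
                Err(e) if e.kind() == io::ErrorKind::NotFound => {}
                // Cleanup is best effort; report but keep the original error
                Err(e) => eprintln!("failed to clean up {}: {e}", path.display()),
            }
        }
    }
    result
}
```

Because the old files are left untouched, starting a previous version after a failed migration would still see the original tracker.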

* Actively migrate RocksDB ID tracker to new format on segment load
* Extract ID tracker migration logic to function
* Feature flag ID tracker migration
* Simplify ID tracker migration by moving it deeper into load function
* Move migrate function to the bottom
* Add test to assert RocksDB to mutable ID tracker migration
* Assert new mutable ID tracker is empty
* Review remarks
* On RocksDB to mutable ID tracker migration failure, clean up files
* Demote empty mutable ID tracker to debug assertion
* Copy all point versions, including deleted, set known mappings
* read links and versions separatelly

---------

Co-authored-by: generall <[email protected]>

timvisee added this to the Remove RocksDB milestone May 22, 2025

timvisee commented May 22, 2025

timvisee added 4 commits May 22, 2025 13:49

Actively migrate RocksDB ID tracker to new format on segment load

9e571c7

Extract ID tracker migration logic to function

70c6220

Feature flag ID tracker migration

b82d489

Simplify ID tracker migration by moving it deeper into load function

a8a75de

timvisee force-pushed the id-tracker-actively-migrate-rocksdb-to-mutable branch from db7b9f0 to a8a75de Compare May 22, 2025 11:52

Move migrate function to the bottom

7a55434

github-actions Bot mentioned this pull request May 22, 2025

Flaky test index::tests::hw_counter_test::test_hw_counter_for_plain_sparse_search #6231

Closed

timvisee added 2 commits May 22, 2025 15:07

Add test to assert RocksDB to mutable ID tracker migration

d2de32d

Assert new mutable ID tracker is empty

643643e

timvisee marked this pull request as ready for review May 22, 2025 13:10

timvisee requested review from JojiiOfficial, agourlay, Copilot and generall May 22, 2025 13:10

JojiiOfficial approved these changes May 22, 2025

Review remarks

b7cba30

agourlay reviewed May 23, 2025

On RocksDB to mutable ID tracker migration failure, clean up files

9e2313a

timvisee force-pushed the id-tracker-actively-migrate-rocksdb-to-mutable branch from a1802b0 to 9e2313a Compare May 26, 2025 11:15

generall reviewed May 26, 2025

Comment thread lib/segment/src/segment_constructor/segment_constructor_base.rs Outdated

Comment thread lib/segment/src/segment_constructor/segment_constructor_base.rs Outdated

Demote empty mutable ID tracker to debug assertion

709afb7

Copy all point versions, including deleted, set known mappings

39461fa

timvisee requested a review from generall May 26, 2025 13:22

read links and versions separatelly

7610478

generall approved these changes May 26, 2025

agourlay approved these changes May 27, 2025

timvisee merged commit a39b54b into dev May 27, 2025
24 of 25 checks passed

timvisee deleted the id-tracker-actively-migrate-rocksdb-to-mutable branch May 27, 2025 09:36

This was referenced Apr 22, 2025

Tracking issue: mutable ID tracker without RocksDB #6157

Closed

Migrate away from RocksDB based dense vector storage #6603

Merged

This was referenced Jun 19, 2025

Fix vector storage migration error #6718

Merged

Actively migrate payload storage from RocksDB to Gridstore #6734

Merged

coderabbitai Bot mentioned this pull request Jul 15, 2025

Enable migration from RocksDB to mutable ID tracker by default #6872

Merged

6 tasks

coderabbitai Bot mentioned this pull request Oct 20, 2025

Deduplicate iter_internal and iter_ids in ID trackers #7428

Merged

3 tasks

coderabbitai Bot mentioned this pull request Feb 18, 2026

Use IdTrackerEnum type instead of dyn IdTracker #8168

Merged

9 tasks

coderabbitai Bot mentioned this pull request Mar 19, 2026

Explicit point_mappings guard in IdTracker #8261

Merged

9 tasks

coderabbitai Bot mentioned this pull request Mar 31, 2026

Drop RocksDB #8529

Merged
