deep memory reporting by generall · Pull Request #8606 · qdrant/qdrant

generall · 2026-04-06T10:49:38Z

This PR exposes per-collection per-component memory usage:

GET /collections/<name>/memory

Each component reports following numbers about memory usage:

    /// Total bytes stored on disk (file sizes).
    pub disk_bytes: u64,
    /// Non-evictable heap RAM (in-memory data structures not backed by mmap).
    pub ram_bytes: u64,
    /// Evictable RAM: resident file pages from mmap (OS page cache).
    pub cached_bytes: u64,
    /// Bytes that should ideally be cached for best performance.
    /// Sum of file sizes for `Cached` intent files (mmap-accessed data).
    pub expected_cache_bytes: u64,

Disk cache is estimated using mincore. It opens all files of the component as a mmap and performs syscall to evaluate cache size.

It is planned for the future to re-use already opened mmaps from the pool, so we won't need to spend time opening and closing mmap if it is already opened.

ram_bytes is estimated with custom function for each component. It is the least accurate measurement in this API, as we can't take into account many factors:

data structures implemented in external packages (e.g. roaring bitmaps)
allocator overhead
minor components

This PR provides a manual test, which estimates error by comparing jemalloc reading before and after deleting components. Current results are:

====================================================================================================
Component                                   Estimated   Actual Freed        Error    Error %
----------------------------------------------------------------------------------------------------
payload_index/bool_field                    224.16 KB     -489.78 KB    713.94 KB     318.5%
payload_index/float_field                    23.08 MB       25.67 MB      2.59 MB      11.2%
payload_index/int_field                      64.97 MB       66.97 MB      2.01 MB       3.1%
payload_index/text_field                     12.59 MB       12.96 MB    383.46 KB       3.0%
payload_index/geo_field                     216.34 MB      236.04 MB     19.70 MB       9.1%
payload_index/keyword_field                  22.00 MB       23.95 MB      1.94 MB       8.8%
payload_index/datetime_field                 23.07 MB       25.59 MB      2.52 MB      10.9%
payload_index/uuid_field                     78.76 MB       85.23 MB      6.47 MB       8.2%
vector/dense                                      0 B        1.49 MB      1.49 MB       inf%
vector/dense-binary                          10.18 MB        9.14 MB      1.04 MB      10.2%
vector/dense-product                         42.21 MB       36.73 MB      5.48 MB      13.0%
vector/dense-scalar                          84.97 MB       74.30 MB     10.67 MB      12.6%
sparse_vector/sparse                         71.35 MB       84.64 MB     13.29 MB      18.6%
====================================================================================================

Total estimated freed: 649.73 MB
Total actual freed:   681.56 MB
Total error:          31.84 MB (4.9%)

Overall, I would expect around 10-15% ram usage underestimation from this API.

This API also fetches memory usage from all nodes in the cluster, so we avoid blinking problem in distributed deployments with round-robin load balancer.

coderabbitai

Actionable comments posted: 13

🧹 Nitpick comments (8)

lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs (1)

144-146: De-duplicate Document heap estimators to avoid drift.

Line [144] currently repeats the same computation as Line [121]. Consider delegating one to the other so future changes stay in sync.

♻️ Minimal refactor

 pub fn heap_size(&self) -> usize {
-    self.0.capacity() * std::mem::size_of::<TokenId>()
+    self.heap_bytes()
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`
around lines 144 - 146, The heap_size calculation in pub fn heap_size(&self) ->
usize duplicates the same capacity * size_of::<TokenId>() logic found in the
Document heap estimator; refactor by delegating one estimator to the other to
avoid drift—replace the duplicated arithmetic with a call to the existing
Document heap estimator (e.g., call the other struct/impl's heap_size method) so
both functions share a single implementation and future changes stay in sync;
keep references to TokenId and the original heap_size method names when locating
the code to update.

lib/segment/src/vector_storage/volatile_chunked_vectors.rs (1)

54-65: Include outer chunks Vec buffer in heap_size_bytes() for a more complete estimate.

Current estimate counts inner chunk payload buffers but not the heap backing array for self.chunks.

♻️ Suggested refinement

 pub fn heap_size_bytes(&self) -> usize {
@@
-    chunks
+    let outer_vec_bytes = chunks.capacity() * mem::size_of::<Vec<T>>();
+    outer_vec_bytes
+        + chunks
         .iter()
         .map(|chunk| chunk.capacity() * mem::size_of::<T>())
         .sum()
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/segment/src/vector_storage/volatile_chunked_vectors.rs` around lines 54 -
65, heap_size_bytes currently sums only inner chunk payloads; include the heap
backing buffer for the outer Vec by adding chunks.capacity() *
mem::size_of::<Vec<T>>() to the total. In the heap_size_bytes method (function
name heap_size_bytes) reference the struct field chunks and compute the outer
buffer size via chunks.capacity() * mem::size_of::<Vec<T>>() and add it to the
existing sum of inner chunk.capacity() * mem::size_of::<T>() so the returned
value accounts for both the outer Vec's heap array and each inner chunk's
payload.

lib/quantization/src/encoded_vectors_pq.rs (1)

643-653: Consider including metadata container allocations in this estimate.

Right now this counts inner centroid buffers but not the outer Vec<Vec<f32>> allocation (and vector_division allocation), which systematically underestimates heap usage.

♻️ Suggested refinement

 fn heap_size_bytes(&self) -> usize {
     let storage_heap = self.encoded_vectors.heap_size_bytes();
-    // PQ centroids: Vec<Vec<f32>> — each inner Vec holds dim floats
-    let centroids_heap: usize = self
+    let centroids_inner_heap: usize = self
         .metadata
         .centroids
         .iter()
         .map(|c| c.capacity() * std::mem::size_of::<f32>())
         .sum();
-    storage_heap + centroids_heap
+    let centroids_outer_heap =
+        self.metadata.centroids.capacity() * std::mem::size_of::<Vec<f32>>();
+    let vector_division_heap =
+        self.metadata.vector_division.capacity() * std::mem::size_of::<std::ops::Range<usize>>();
+
+    storage_heap + centroids_inner_heap + centroids_outer_heap + vector_division_heap
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/quantization/src/encoded_vectors_pq.rs` around lines 643 - 653, The
heap_size_bytes implementation only sums inner centroid buffers and storage but
misses heap allocations for the outer Vec<Vec<f32>> and any vector_division
container in metadata; update heap_size_bytes to also add
metadata.centroids.capacity() * std::mem::size_of::<Vec<f32>>() to account for
the outer Vec's heap, and add the heap cost of metadata.vector_division (or the
actual field name) as metadata.vector_division.capacity() *
std::mem::size_of::<T>() using the correct element type (e.g., usize or u32), so
the function (in encoded_vectors_pq::heap_size_bytes) returns storage_heap +
centroids_heap + outer_centroids_heap + vector_division_heap.

lib/collection/src/shards/local_shard/mod.rs (1)

1096-1105: Drop the segment-holder lock before running the probes.

This keeps segments.read() held across every per-segment memory_report() call, and those probes can be slow. Please clone the LockedSegments first, release the holder lock, then walk the cloned list so this endpoint does not block shard mutations for its full runtime.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/shards/local_shard/mod.rs` around lines 1096 - 1105, The
current code holds the segments read lock across every per-segment probe
(segments.read() then iterating and calling
collect_memory_reports/locked_segment.memory_report()), causing long blocking of
shard mutations; fix by cloning the LockedSegment references (e.g., collect
locked_segment.clone() or clone Arc/handles from the iterator into a
Vec<LockedSegment>) immediately after calling segments.read(), drop the read
guard, and then iterate the cloned Vec to call collect_memory_reports /
memory_report, so the holder lock is released before running the slow probes.

tests/manual/test_memory_reporting.py (1)

148-151: Remove unnecessary f-prefix from string without placeholders.

✏️ Proposed fix

-    print(f"\nCollection has:")
+    print("\nCollection has:")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/manual/test_memory_reporting.py` around lines 148 - 151, The three
print statements in tests/manual/test_memory_reporting.py use f-strings without
placeholders (print(f"\nCollection has:"), print(f"  Indexed payload fields:
{indexed_fields}"), print(f"  Dense vector fields: {dense_vector_names}"),
print(f"  Sparse vector fields: {sparse_vector_names}")). Remove the unnecessary
f-prefix from the literal-only strings (e.g., change print(f"\nCollection has:")
to print("\nCollection has:")); keep the f-strings for lines that actually
interpolate variables (or convert all three to non-f-strings and use
concatenation/formatting consistently if desired). Ensure the changes target the
print calls referencing indexed_fields, dense_vector_names, and
sparse_vector_names.

lib/collection/src/common/memory_reporter.rs (3)

102-102: Optional: Consider using HashMap for O(1) lookups in merge.

The current implementation uses linear search (find) for each vector name during merge. With many shards or named vectors, this is O(n²). A HashMap<String, usize> mapping names to indices would provide O(1) lookups.

Given that named vectors per collection are typically few and this is a reporting endpoint (not a hot path), this is low priority.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/common/memory_reporter.rs` at line 102, The merge loop
uses result.vectors.iter_mut().find(|v| v.name == *name) causing O(n²) behavior;
change Merge logic in memory_reporter (the function where result.vectors is
built) to first build a HashMap<String, usize> mapping vector name to its index,
then use that map for O(1) lookups and update entries by index in result.vectors
instead of calling find each time; ensure the map is kept in sync when inserting
new named vectors into result.vectors.

221-229: Consider using a match for FileStorageIntent and logging metadata failures.

Two observations:

The if equality check won't surface new FileStorageIntent variants at compile time—a match would.
Silently ignoring files that fail metadata() differs from the Unix version which propagates errors. Consider logging a warning for debuggability.

Suggested change using match

     for entry in files {
         let ComponentFileEntry { path, intent } = &entry;
-        if let Ok(meta) = std::fs::metadata(path) {
-            disk_bytes += meta.len();
-            if *intent == FileStorageIntent::Cached {
-                expected_cache_bytes += meta.len();
+        match std::fs::metadata(path) {
+            Ok(meta) => {
+                disk_bytes += meta.len();
+                match intent {
+                    FileStorageIntent::Cached => {
+                        expected_cache_bytes += meta.len();
+                    }
+                    FileStorageIntent::Pinned | FileStorageIntent::OnDisk => {}
+                }
             }
+            Err(err) => {
+                log::debug!("Failed to get metadata for {}: {err}", path.display());
+            }
         }
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/common/memory_reporter.rs` around lines 221 - 229, Replace
the `if *intent == FileStorageIntent::Cached` check with a `match` on `intent`
so new `FileStorageIntent` variants are handled at compile time (locate the loop
iterating `files` and pattern `let ComponentFileEntry { path, intent }`). Also
change the `if let Ok(meta) = std::fs::metadata(path)` to explicitly handle the
`Err` case by logging a warning (include the `path` and error) instead of
silently ignoring failures; continue to update `disk_bytes` and
`expected_cache_bytes` based on `meta.len()` inside the successful branch.

105-109: Prefer explicit match arms over catch-all _.

Per coding guidelines, explicit exhaustive matches are preferred to ensure future changes don't silently fall through.

Suggested change

                     match (&mut existing.quantized, quantized) {
                         (Some(a), Some(b)) => a.merge(b),
                         (None, Some(b)) => existing.quantized = Some(b.clone()),
-                        _ => {}
+                        (Some(_), None) | (None, None) => {}
                     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/collection/src/common/memory_reporter.rs` around lines 105 - 109, The
match on (&mut existing.quantized, quantized) uses a catch-all `_ => {}`;
replace it with explicit arms for the remaining cases to satisfy exhaustive
matching: add (Some(_), None) => { /* no-op or comment */ } and (None, None) =>
{ /* no-op or comment */ } (or return/unreachable if appropriate) so the match
covers (Some(a), Some(b)), (None, Some(b)), (Some(_), None) and (None, None) for
existing.quantized and quantized in memory_reporter.rs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/collection/src/collection/mod.rs`:
- Around line 605-615: The current loop swallows shard errors and returns
CollectionMemoryReport::merge_all(reports) even if some shards failed; change
the logic in the async loop that awaits futures::StreamExt::next(&mut futures)
so that on the first Err(err) you propagate that error (return Err(err.into() /
appropriate error type) or convert it to the function's error type) instead of
just logging, ensuring callers don't receive silently partial totals;
alternatively, if partial results are desired, change the return type to include
a partial flag or Result<..., PartialCollectionMemoryReport> and set that flag
when any Err is observed—update the code paths that call
CollectionMemoryReport::merge_all and the function signature accordingly.

In `@lib/collection/src/shards/replica_set/mod.rs`:
- Around line 1461-1467: The current aggregation filters out remotes using
self.peer_is_updatable which under-reports memory because read-only replicas
still consume RAM; change the mapping so that all entries in remotes (from
remotes.read().await) are included when collecting futures and calling
remote.memory_report(), i.e., remove the .filter(|remote|
self.peer_is_updatable(remote.peer_id)) so try_join_all collects memory_report()
from every remote.
- Around line 1445-1450: The match on local.as_ref() currently uses a catch-all
`_` that returns CollectionMemoryReport::default(), which zeroes out
proxied-but-still-local shards during transfers; change the match to be
exhaustive by adding explicit arms for Shard::Proxy, Shard::ForwardProxy, and
Shard::QueueProxy and have each of those delegate to the underlying local
shard's memory_report (similar to the Shard::Local arm) instead of returning
default; keep the Shard::Local(local_shard) => local_shard.memory_report().await
arm and remove the `_` fallback so new enum variants must be handled explicitly.

In `@lib/segment/src/id_tracker/compressed/external_to_internal.rs`:
- Around line 238-241: The capacity computation underestimates bytes for BitVecs
because num_ids_removed.capacity() and uuids_removed.capacity() are in bits and
are currently divided with truncating integer division; replace those divisions
with a ceiling division (use div_ceil) so bits->bytes rounding matches other
code: change occurrences of num_ids_removed.capacity() / u8::BITS as usize and
uuids_removed.capacity() / u8::BITS as usize to
div_ceil(num_ids_removed.capacity(), u8::BITS as usize) and
div_ceil(uuids_removed.capacity(), u8::BITS as usize) respectively (and add the
appropriate import/trait for div_ceil if not already in scope).

In `@lib/segment/src/id_tracker/compressed/internal_to_external.rs`:
- Around line 114-117: The ram_usage_bytes function miscalculates bytes for
is_uuid because is_uuid.capacity() returns bits and dividing by 8 truncates;
update ram_usage_bytes (in internal_to_external.rs) to convert bit capacity to
bytes using ceiling division (e.g. (is_uuid.capacity() + 7) / 8 or a div_ceil(8)
helper) and ensure proper casting for u8::BITS so the final return is
data.capacity() * size_of::<u128>() + (is_uuid.capacity() + 7) / 8 (all usize).

In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs`:
- Around line 126-137: The cached_ram_usage_bytes is only computed at
initialization (assigned from compute_ram_usage_bytes) but never updated when
runtime mutations occur (e.g., remove_point increases
deleted_value_to_points_container and other structures), causing stale memory
reporting; update cached_ram_usage_bytes after any mutating operations that
change in-memory size (at minimum inside remove_point and any methods that
modify deleted_value_to_points_container, value_to_points_container,
point_to_values, or indexed_points) by invoking compute_ram_usage_bytes() and
assigning the result to cached_ram_usage_bytes so the reported RAM usage stays
accurate.
- Line 416: The calculation for deleted_bytes underestimates when capacity isn't
a multiple of 8; change the floor division used on
deleted_value_to_points_container.capacity() by u8::BITS to a ceiling division
so partial bytes are counted (e.g., compute deleted_bytes as
(deleted_value_to_points_container.capacity() + (u8::BITS as usize - 1)) /
u8::BITS as usize). Update the assignment that sets deleted_bytes to use this
ceil formula so all remaining bits map to a byte.

In `@lib/segment/src/index/field_index/map_index/mutable_map_index.rs`:
- Around line 323-344: The ram_usage_bytes() implementation undercounts heap
allocations for heap-backed <N as MapIndexKey>::Owned keys/values; update
ram_usage_bytes() (inside the ram_usage_bytes function which destructures map
and point_to_values) to iterate the map entries and add the dynamic heap size of
each Owned key (and any heap memory held by the corresponding bitmap/value if
applicable) in addition to the current sizeof-based estimates and
bitmap.serialized_size(), and likewise iterate point_to_values' Vecs to add the
heap allocation for each Owned element (not just Vec capacity * element size).
Use or add a helper/trait method (e.g., owned.heap_size_bytes() or similar) on
<N as MapIndexKey>::Owned to compute per-instance heap usage and call that for
every key/value when computing map_bytes and ptv_bytes so heap allocations are
included in the final sum.

In `@lib/segment/src/vector_storage/chunked_vectors.rs`:
- Around line 401-410: heap_size_bytes currently returns 0; replace that stub
with a real aggregation of heap usage by summing the heap sizes of the struct
fields: call heap_size_bytes() (or appropriate size helpers) on config, status,
and directory, and iterate over self.chunks (e.g., self.chunks.values()) to sum
each chunk's heap usage (or use size_of_val for simple container types); return
the total sum so heap_size_bytes reflects real RAM usage.

In `@lib/segment/src/vector_storage/dense/dense_vector_storage.rs`:
- Around line 229-230: The is_on_disk() method relies on the self.populated flag
which can become stale because update_from() reopens vectors with populate =
false without updating self.populated; update update_from() (the routine that
reopens/refreshes vectors) to explicitly reconcile and set self.populated to the
correct value based on the populate argument or the actual reopened storage
state (e.g., mark populated = true when vectors are populated, false when opened
unpopulated, or query the underlying storage status), so that is_on_disk()
always reflects the true storage intent.

In `@tests/manual/test_memory_reporting.py`:
- Around line 45-63: The three helper functions get_allocated_bytes,
get_memory_report, and get_collection_info call requests.get without timeouts
which can hang tests; modify each to pass a reasonable timeout (e.g., timeout=5
or a shared TIMEOUT constant) into requests.get and propagate that to all calls
(including params usage) so requests.raise_for_status still runs on timeout
errors; ensure you define or reuse a TIMEOUT symbol near BASE_URL if needed and
update all three functions to include the timeout argument.
- Around line 77-93: The DELETE helper functions delete_payload_index and
delete_vector_field currently call requests.delete without timeouts which can
hang tests; update both functions to pass a reasonable timeout (e.g.,
timeout=10) into the requests.delete call and keep existing params and
resp.raise_for_status() behavior so the request fails fast if the server doesn't
respond.
- Around line 109-115: The initial health-check and cleanup HTTP calls lack
timeouts and the status print uses an unnecessary f-string; update the
requests.get call that assigns resp (resp = requests.get(...)) and the
requests.delete call (requests.delete(...)) to include a short timeout (e.g.,
timeout=5) to avoid hanging tests, and remove the redundant f-prefix from the
print statement ("Qdrant is running.") so it is a plain string; locate these in
tests/manual/test_memory_reporting.py around the lines setting resp and calling
requests.delete.

---

Nitpick comments:
In `@lib/collection/src/common/memory_reporter.rs`:
- Line 102: The merge loop uses result.vectors.iter_mut().find(|v| v.name ==
*name) causing O(n²) behavior; change Merge logic in memory_reporter (the
function where result.vectors is built) to first build a HashMap<String, usize>
mapping vector name to its index, then use that map for O(1) lookups and update
entries by index in result.vectors instead of calling find each time; ensure the
map is kept in sync when inserting new named vectors into result.vectors.
- Around line 221-229: Replace the `if *intent == FileStorageIntent::Cached`
check with a `match` on `intent` so new `FileStorageIntent` variants are handled
at compile time (locate the loop iterating `files` and pattern `let
ComponentFileEntry { path, intent }`). Also change the `if let Ok(meta) =
std::fs::metadata(path)` to explicitly handle the `Err` case by logging a
warning (include the `path` and error) instead of silently ignoring failures;
continue to update `disk_bytes` and `expected_cache_bytes` based on `meta.len()`
inside the successful branch.
- Around line 105-109: The match on (&mut existing.quantized, quantized) uses a
catch-all `_ => {}`; replace it with explicit arms for the remaining cases to
satisfy exhaustive matching: add (Some(_), None) => { /* no-op or comment */ }
and (None, None) => { /* no-op or comment */ } (or return/unreachable if
appropriate) so the match covers (Some(a), Some(b)), (None, Some(b)), (Some(_),
None) and (None, None) for existing.quantized and quantized in
memory_reporter.rs.

In `@lib/collection/src/shards/local_shard/mod.rs`:
- Around line 1096-1105: The current code holds the segments read lock across
every per-segment probe (segments.read() then iterating and calling
collect_memory_reports/locked_segment.memory_report()), causing long blocking of
shard mutations; fix by cloning the LockedSegment references (e.g., collect
locked_segment.clone() or clone Arc/handles from the iterator into a
Vec<LockedSegment>) immediately after calling segments.read(), drop the read
guard, and then iterate the cloned Vec to call collect_memory_reports /
memory_report, so the holder lock is released before running the slow probes.

In `@lib/quantization/src/encoded_vectors_pq.rs`:
- Around line 643-653: The heap_size_bytes implementation only sums inner
centroid buffers and storage but misses heap allocations for the outer
Vec<Vec<f32>> and any vector_division container in metadata; update
heap_size_bytes to also add metadata.centroids.capacity() *
std::mem::size_of::<Vec<f32>>() to account for the outer Vec's heap, and add the
heap cost of metadata.vector_division (or the actual field name) as
metadata.vector_division.capacity() * std::mem::size_of::<T>() using the correct
element type (e.g., usize or u32), so the function (in
encoded_vectors_pq::heap_size_bytes) returns storage_heap + centroids_heap +
outer_centroids_heap + vector_division_heap.

In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`:
- Around line 144-146: The heap_size calculation in pub fn heap_size(&self) ->
usize duplicates the same capacity * size_of::<TokenId>() logic found in the
Document heap estimator; refactor by delegating one estimator to the other to
avoid drift—replace the duplicated arithmetic with a call to the existing
Document heap estimator (e.g., call the other struct/impl's heap_size method) so
both functions share a single implementation and future changes stay in sync;
keep references to TokenId and the original heap_size method names when locating
the code to update.

In `@lib/segment/src/vector_storage/volatile_chunked_vectors.rs`:
- Around line 54-65: heap_size_bytes currently sums only inner chunk payloads;
include the heap backing buffer for the outer Vec by adding chunks.capacity() *
mem::size_of::<Vec<T>>() to the total. In the heap_size_bytes method (function
name heap_size_bytes) reference the struct field chunks and compute the outer
buffer size via chunks.capacity() * mem::size_of::<Vec<T>>() and add it to the
existing sum of inner chunk.capacity() * mem::size_of::<T>() so the returned
value accounts for both the outer Vec's heap array and each inner chunk's
payload.

In `@tests/manual/test_memory_reporting.py`:
- Around line 148-151: The three print statements in
tests/manual/test_memory_reporting.py use f-strings without placeholders
(print(f"\nCollection has:"), print(f"  Indexed payload fields:
{indexed_fields}"), print(f"  Dense vector fields: {dense_vector_names}"),
print(f"  Sparse vector fields: {sparse_vector_names}")). Remove the unnecessary
f-prefix from the literal-only strings (e.g., change print(f"\nCollection has:")
to print("\nCollection has:")); keep the f-strings for lines that actually
interpolate variables (or convert all three to non-f-strings and use
concatenation/formatting consistently if desired). Ensure the changes target the
print calls referencing indexed_fields, dense_vector_names, and
sparse_vector_names.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 95259bf6-c4ce-4928-b8bd-20686e0f30a0

📥 Commits

Reviewing files that changed from the base of the PR and between 8226264 and 53ad8ac.

📒 Files selected for processing (76)

lib/api/build.rs
lib/api/src/grpc/proto/collections_internal_service.proto
lib/api/src/grpc/qdrant.rs
lib/collection/src/collection/mod.rs
lib/collection/src/common/memory_reporter.rs
lib/collection/src/common/mod.rs
lib/collection/src/shards/local_shard/mod.rs
lib/collection/src/shards/remote_shard.rs
lib/collection/src/shards/replica_set/mod.rs
lib/common/common/src/mmap/advice.rs
lib/common/common/src/universal_io/mmap.rs
lib/posting_list/src/posting_list.rs
lib/quantization/src/encoded_storage.rs
lib/quantization/src/encoded_vectors.rs
lib/quantization/src/encoded_vectors_binary.rs
lib/quantization/src/encoded_vectors_pq.rs
lib/quantization/src/encoded_vectors_u8.rs
lib/segment/src/common/memory_usage.rs
lib/segment/src/common/mod.rs
lib/segment/src/id_tracker/compressed/compressed_point_mappings.rs
lib/segment/src/id_tracker/compressed/external_to_internal.rs
lib/segment/src/id_tracker/compressed/internal_to_external.rs
lib/segment/src/id_tracker/compressed/versions_store.rs
lib/segment/src/id_tracker/immutable_id_tracker.rs
lib/segment/src/id_tracker/in_memory_id_tracker.rs
lib/segment/src/id_tracker/memory_reporter.rs
lib/segment/src/id_tracker/mod.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs
lib/segment/src/id_tracker/point_mappings.rs
lib/segment/src/index/field_index/bool_index/mod.rs
lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
lib/segment/src/index/field_index/field_index_base.rs
lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs
lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/text_index.rs
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
lib/segment/src/index/field_index/geo_index/mod.rs
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
lib/segment/src/index/field_index/histogram.rs
lib/segment/src/index/field_index/immutable_point_to_values.rs
lib/segment/src/index/field_index/map_index/immutable_map_index.rs
lib/segment/src/index/field_index/map_index/mod.rs
lib/segment/src/index/field_index/map_index/mutable_map_index.rs
lib/segment/src/index/field_index/memory_reporter.rs
lib/segment/src/index/field_index/mod.rs
lib/segment/src/index/field_index/null_index/mutable_null_index.rs
lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
lib/segment/src/index/field_index/numeric_index/mod.rs
lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
lib/segment/src/index/memory_reporter.rs
lib/segment/src/index/mod.rs
lib/segment/src/payload_storage/memory_reporter.rs
lib/segment/src/payload_storage/mod.rs
lib/segment/src/segment/memory.rs
lib/segment/src/segment/mod.rs
lib/segment/src/types.rs
lib/segment/src/vector_storage/chunked_vectors.rs
lib/segment/src/vector_storage/common.rs
lib/segment/src/vector_storage/dense/dense_vector_storage.rs
lib/segment/src/vector_storage/memory_reporter.rs
lib/segment/src/vector_storage/mod.rs
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
lib/segment/src/vector_storage/volatile_chunked_vectors.rs
src/actix/api/collections_api.rs
src/tonic/api/collections_internal_api.rs
tests/manual/test_memory_reporting.py

Add `resident_bytes()`, `disk_bytes()`, and `probe_memory_stats()` methods to `MmapFile` for measuring page cache residency via `mincore(2)`. This is the foundation for per-collection memory usage reporting. Also extract `page_size()` as a public function in `mmap::advice`, replacing the internal `PAGE_SIZE_MASK` with a direct page size cache. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Add ram_usage_bytes() to all ID tracker types and their data structures: - PointMappings, CompressedPointMappings, CompressedVersions, CompressedInternalToExternal, CompressedExternalToInternal - MutableIdTracker, ImmutableIdTracker, InMemoryIdTracker All ID trackers load their data into RAM (none use mmap for working data). Files are reported as OnDisk (persistence only), actual RAM footprint is reported via extra_ram_bytes. Uses struct destructuring to ensure new fields trigger compile errors if not accounted for. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…torageIntent

…memory consumtion computation + caching

…ote shard

…+ shring volatile storage after load

timvisee

I've made some minor improvements to memory usage calculation, primarily to also include heap allocated structures themselves.

tests/manual/test_memory_reporting.py still shows similar performance.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

tests/manual/test_memory_reporting.py (1)

121-121: ⚠️ Potential issue | 🟡 Minor

Remove unnecessary f-prefix.

This f-string has no placeholders. Use a regular string instead.

-    print(f"\nUploading snapshot to Qdrant...")
+    print("\nUploading snapshot to Qdrant...")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/manual/test_memory_reporting.py` at line 121, The print call uses an
unnecessary f-string with no placeholders (print(f"\nUploading snapshot to
Qdrant...")), so replace it with a regular string literal by removing the
leading 'f' (i.e., change the print call in
tests/manual/test_memory_reporting.py that prints "\nUploading snapshot to
Qdrant..." to use print("\nUploading snapshot to Qdrant...")).

lib/segment/src/index/field_index/map_index/immutable_map_index.rs (1)

233-260: ⚠️ Potential issue | 🟠 Major

Refresh cached RAM after remove_point mutations.

cached_ram_usage_bytes is initialized once (Line 136) but not updated after Line 233–260 mutates in-memory containers, so reported RAM can become stale.

💡 Suggested fix

     pub fn remove_point(&mut self, idx: PointOffsetType) -> OperationResult<()> {
@@
         }
         self.point_to_values.remove_point(idx);
+        self.cached_ram_usage_bytes = self.compute_ram_usage_bytes();
         Ok(())
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs` around
lines 233 - 260, The remove_point method mutates in-memory containers
(value_to_points, value_to_points_container, deleted_value_to_points_container,
point_to_values, values_count, indexed_points) but never updates
cached_ram_usage_bytes, causing stale RAM reports; after performing the removals
and before returning from remove_point, recompute and assign the fresh RAM usage
to self.cached_ram_usage_bytes (by invoking the existing RAM-calculation helper
if one exists, or by adding a short inline recalculation) so
cached_ram_usage_bytes reflects the updated memory state.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/segment/src/index/field_index/immutable_point_to_values.rs`:
- Around line 47-55: The ram_usage_bytes implementation currently only sums
capacities * sizeof for PointValueEntry<N> and N, which misses heap-allocated
memory for types like String; update ram_usage_bytes (and/or add an overload) to
accept or use a heap-size estimator (e.g., a trait method like heap_size_bytes
or a passed-in closure) and include per-element heap usage by iterating
values_container and adding estimator(&value) for each element in addition to
the capacity * sizeof calculations; ensure PointValueEntry<N> and
values_container are the referenced symbols and document that heap-owning index
types should call this new estimator-enabled version (or implement the HeapSize
trait) so total RAM is reported correctly.

In `@tests/manual/test_memory_reporting.py`:
- Line 148: Remove the unused f-prefix from the print call so it is a regular
string literal; locate the print statement printing "\nCollection has:"
(currently written as print(f"\nCollection has:")) and change it to
print("\nCollection has:") to avoid an unnecessary f-string.

---

Duplicate comments:
In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs`:
- Around line 233-260: The remove_point method mutates in-memory containers
(value_to_points, value_to_points_container, deleted_value_to_points_container,
point_to_values, values_count, indexed_points) but never updates
cached_ram_usage_bytes, causing stale RAM reports; after performing the removals
and before returning from remove_point, recompute and assign the fresh RAM usage
to self.cached_ram_usage_bytes (by invoking the existing RAM-calculation helper
if one exists, or by adding a short inline recalculation) so
cached_ram_usage_bytes reflects the updated memory state.

In `@tests/manual/test_memory_reporting.py`:
- Line 121: The print call uses an unnecessary f-string with no placeholders
(print(f"\nUploading snapshot to Qdrant...")), so replace it with a regular
string literal by removing the leading 'f' (i.e., change the print call in
tests/manual/test_memory_reporting.py that prints "\nUploading snapshot to
Qdrant..." to use print("\nUploading snapshot to Qdrant...")).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1b0067d1-3c38-46e4-9d3c-16c7052aaf8d

📥 Commits

Reviewing files that changed from the base of the PR and between e85cc94 and 5d3893e.

📒 Files selected for processing (79)

lib/api/build.rs
lib/api/src/grpc/proto/collections_internal_service.proto
lib/api/src/grpc/qdrant.rs
lib/collection/src/collection/mod.rs
lib/collection/src/common/memory_reporter.rs
lib/collection/src/common/mod.rs
lib/collection/src/shards/forward_proxy_shard.rs
lib/collection/src/shards/local_shard/mod.rs
lib/collection/src/shards/proxy_shard.rs
lib/collection/src/shards/queue_proxy_shard.rs
lib/collection/src/shards/remote_shard.rs
lib/collection/src/shards/replica_set/mod.rs
lib/common/common/src/mmap/advice.rs
lib/common/common/src/universal_io/mmap.rs
lib/posting_list/src/posting_list.rs
lib/quantization/src/encoded_storage.rs
lib/quantization/src/encoded_vectors.rs
lib/quantization/src/encoded_vectors_binary.rs
lib/quantization/src/encoded_vectors_pq.rs
lib/quantization/src/encoded_vectors_u8.rs
lib/segment/src/common/memory_usage.rs
lib/segment/src/common/mod.rs
lib/segment/src/id_tracker/compressed/compressed_point_mappings.rs
lib/segment/src/id_tracker/compressed/external_to_internal.rs
lib/segment/src/id_tracker/compressed/internal_to_external.rs
lib/segment/src/id_tracker/compressed/versions_store.rs
lib/segment/src/id_tracker/immutable_id_tracker.rs
lib/segment/src/id_tracker/in_memory_id_tracker.rs
lib/segment/src/id_tracker/memory_reporter.rs
lib/segment/src/id_tracker/mod.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs
lib/segment/src/id_tracker/point_mappings.rs
lib/segment/src/index/field_index/bool_index/mod.rs
lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
lib/segment/src/index/field_index/field_index_base.rs
lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs
lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/text_index.rs
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
lib/segment/src/index/field_index/geo_index/mod.rs
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
lib/segment/src/index/field_index/histogram.rs
lib/segment/src/index/field_index/immutable_point_to_values.rs
lib/segment/src/index/field_index/map_index/immutable_map_index.rs
lib/segment/src/index/field_index/map_index/mod.rs
lib/segment/src/index/field_index/map_index/mutable_map_index.rs
lib/segment/src/index/field_index/memory_reporter.rs
lib/segment/src/index/field_index/mod.rs
lib/segment/src/index/field_index/null_index/mutable_null_index.rs
lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
lib/segment/src/index/field_index/numeric_index/mod.rs
lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
lib/segment/src/index/memory_reporter.rs
lib/segment/src/index/mod.rs
lib/segment/src/payload_storage/memory_reporter.rs
lib/segment/src/payload_storage/mod.rs
lib/segment/src/segment/memory.rs
lib/segment/src/segment/mod.rs
lib/segment/src/types.rs
lib/segment/src/vector_storage/chunked_vectors.rs
lib/segment/src/vector_storage/common.rs
lib/segment/src/vector_storage/dense/dense_vector_storage.rs
lib/segment/src/vector_storage/memory_reporter.rs
lib/segment/src/vector_storage/mod.rs
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
lib/segment/src/vector_storage/volatile_chunked_vectors.rs
src/actix/api/collections_api.rs
src/tonic/api/collections_internal_api.rs
tests/manual/test_memory_reporting.py

✅ Files skipped from review due to trivial changes (10)

lib/segment/src/index/mod.rs
lib/segment/src/vector_storage/mod.rs
lib/segment/src/common/mod.rs
lib/segment/src/id_tracker/mod.rs
lib/segment/src/payload_storage/mod.rs
lib/collection/src/common/mod.rs
lib/collection/src/shards/queue_proxy_shard.rs
lib/segment/src/index/field_index/mod.rs
lib/segment/src/common/memory_usage.rs
lib/common/common/src/mmap/advice.rs

🚧 Files skipped from review as they are similar to previous changes (55)

lib/api/build.rs
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
lib/segment/src/segment/mod.rs
lib/quantization/src/encoded_vectors.rs
lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
lib/segment/src/index/field_index/bool_index/mod.rs
lib/segment/src/index/field_index/null_index/mutable_null_index.rs
lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs
lib/segment/src/index/field_index/full_text_index/text_index.rs
lib/segment/src/index/field_index/numeric_index/mod.rs
lib/segment/src/index/field_index/histogram.rs
lib/segment/src/vector_storage/dense/dense_vector_storage.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
lib/segment/src/id_tracker/in_memory_id_tracker.rs
lib/segment/src/id_tracker/compressed/internal_to_external.rs
lib/segment/src/id_tracker/compressed/versions_store.rs
lib/posting_list/src/posting_list.rs
lib/quantization/src/encoded_vectors_u8.rs
lib/segment/src/index/field_index/memory_reporter.rs
lib/segment/src/vector_storage/chunked_vectors.rs
lib/quantization/src/encoded_storage.rs
lib/segment/src/id_tracker/memory_reporter.rs
lib/segment/src/payload_storage/memory_reporter.rs
lib/segment/src/id_tracker/point_mappings.rs
lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
lib/quantization/src/encoded_vectors_binary.rs
lib/segment/src/id_tracker/immutable_id_tracker.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
src/actix/api/collections_api.rs
lib/segment/src/index/field_index/map_index/mutable_map_index.rs
lib/segment/src/id_tracker/compressed/external_to_internal.rs
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
lib/collection/src/shards/proxy_shard.rs
lib/segment/src/vector_storage/volatile_chunked_vectors.rs
lib/collection/src/shards/local_shard/mod.rs
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
lib/segment/src/vector_storage/common.rs
lib/api/src/grpc/proto/collections_internal_service.proto
lib/segment/src/types.rs
lib/collection/src/shards/replica_set/mod.rs
lib/common/common/src/universal_io/mmap.rs
lib/segment/src/index/field_index/map_index/mod.rs
lib/segment/src/index/memory_reporter.rs
lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
lib/segment/src/index/field_index/geo_index/mod.rs
lib/collection/src/collection/mod.rs
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs

coderabbitai · 2026-04-14T09:29:28Z

+    /// Approximate RAM usage in bytes.
+    pub fn ram_usage_bytes(&self) -> usize {
+        let Self {
+            point_entries,
+            values_container,
+        } = self;
+        point_entries.capacity() * std::mem::size_of::<PointValueEntry<N>>()
+            + values_container.capacity() * std::mem::size_of::<N>()
+    }


⚠️ Potential issue | 🟠 Major

ram_usage_bytes undercounts heap-backed value types.

Line 53 and Line 54 only account for inline struct size. If N owns heap memory (notably String), this can significantly underreport ram_bytes in the new endpoint.

💡 Suggested direction

+ /// Approximate RAM usage in bytes, including optional per-value heap usage. + pub fn ram_usage_bytes_with(&self, heap_bytes: impl Fn(&N) -> usize) -> usize { + let base = self.point_entries.capacity() * std::mem::size_of::<PointValueEntry<N>>() + + self.values_container.capacity() * std::mem::size_of::<N>(); + + let point_entries_heap = self + .point_entries + .iter() + .map(|entry| match entry { + PointValueEntry::Single(v) => heap_bytes(v), + PointValueEntry::Slice { .. } => 0, + }) + .sum::<usize>(); + + let values_container_heap = self.values_container.iter().map(heap_bytes).sum::<usize>(); + + base + point_entries_heap + values_container_heap + }

Then call this from heap-owning index types (e.g. map index) with the proper heap estimator.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/index/field_index/immutable_point_to_values.rs` around lines 47 - 55, The ram_usage_bytes implementation currently only sums capacities * sizeof for PointValueEntry<N> and N, which misses heap-allocated memory for types like String; update ram_usage_bytes (and/or add an overload) to accept or use a heap-size estimator (e.g., a trait method like heap_size_bytes or a passed-in closure) and include per-element heap usage by iterating values_container and adding estimator(&value) for each element in addition to the capacity * sizeof calculations; ensure PointValueEntry<N> and values_container are the referenced symbols and document that heap-owning index types should call this new estimator-enabled version (or implement the HeapSize trait) so total RAM is reported correctly.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai · 2026-04-14T09:58:31Z

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive memory reporting capabilities to Qdrant. It adds a new GetShardMemoryReport gRPC RPC method and a corresponding REST endpoint for retrieving collection memory usage. The implementation includes memory tracking across vector storage (dense, sparse, quantized), payload storage, ID trackers, field indexes, and other segment components. New memory_report(), heap_size_bytes(), and ram_usage_bytes() methods are implemented across numerous storage and index types to estimate in-memory footprints. A new data structure hierarchy (ComponentMemoryUsage, FileStorageIntent, SegmentMemoryReport) organizes memory metrics by component and intent. A manual Python test validates memory reporting accuracy against runtime jemalloc telemetry.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

coszio
dancixx

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title 'deep memory reporting' clearly and directly summarizes the main change - adding comprehensive memory reporting capabilities to the Qdrant API.
Description check	✅ Passed	The pull request description is comprehensive and directly related to the changeset, explaining the new memory reporting endpoint, the metrics it provides (disk_bytes, ram_bytes, cached_bytes, expected_cache_bytes), methodology (mincore for cache estimation), limitations, and includes test results demonstrating accuracy.
Docstring Coverage	✅ Passed	Docstring coverage is 84.08% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch deep-memory-reporting

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

tests/manual/test_memory_reporting.py (3)

45-63: ⚠️ Potential issue | 🟡 Minor

Add timeouts to prevent indefinite hangs.

The helper functions get_allocated_bytes(), get_memory_report(), and get_collection_info() make HTTP requests without timeouts. If the server becomes unresponsive, the test will hang indefinitely.

Proposed fix

 def get_allocated_bytes() -> int:
     """Get jemalloc allocated_bytes from telemetry."""
-    resp = requests.get(f"{BASE_URL}/telemetry", params={"details_level": 2})
+    resp = requests.get(f"{BASE_URL}/telemetry", params={"details_level": 2}, timeout=30)
     resp.raise_for_status()
     return resp.json()["result"]["memory"]["allocated_bytes"]


 def get_memory_report(collection_name: str) -> dict:
     """Get the collection memory report."""
-    resp = requests.get(f"{BASE_URL}/collections/{collection_name}/memory")
+    resp = requests.get(f"{BASE_URL}/collections/{collection_name}/memory", timeout=30)
     resp.raise_for_status()
     return resp.json()["result"]


 def get_collection_info(collection_name: str) -> dict:
     """Get collection info."""
-    resp = requests.get(f"{BASE_URL}/collections/{collection_name}")
+    resp = requests.get(f"{BASE_URL}/collections/{collection_name}", timeout=30)
     resp.raise_for_status()
     return resp.json()["result"]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/manual/test_memory_reporting.py` around lines 45 - 63, The three
helpers get_allocated_bytes(), get_memory_report(), and get_collection_info()
call requests.get without a timeout which can cause tests to hang; update each
requests.get call to pass a sensible timeout (e.g., timeout=10 or a shared
constant like REQUEST_TIMEOUT_SECONDS) so the test fails fast on unresponsive
servers, preserving resp.raise_for_status() and the existing JSON parsing logic
in each function.

109-121: ⚠️ Potential issue | 🟡 Minor

Add timeouts and remove unnecessary f-prefix.

Lines 110 and 115 need timeouts, and line 121 has an f-string without placeholders.

Proposed fix

     # Verify server is up
-    resp = requests.get(f"{BASE_URL}/collections")
+    resp = requests.get(f"{BASE_URL}/collections", timeout=10)
     resp.raise_for_status()
     print("Qdrant is running.")

     # Clean up any leftover collection from a previous run
-    requests.delete(f"{BASE_URL}/collections/{COLLECTION_NAME}", params={"wait": "true"})
+    requests.delete(f"{BASE_URL}/collections/{COLLECTION_NAME}", params={"wait": "true"}, timeout=60)

     # Get snapshot from local cache (download once if needed)
     snapshot_path = get_cached_snapshot()

     # Upload snapshot to Qdrant
-    print(f"\nUploading snapshot to Qdrant...")
+    print("\nUploading snapshot to Qdrant...")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/manual/test_memory_reporting.py` around lines 109 - 121, The requests
calls lack timeouts and the print uses an unnecessary f-string; update the
requests.get and requests.delete calls to include a timeout parameter (e.g.,
timeout=10) to avoid hanging (modify the call sites in this snippet where
requests.get(...) and requests.delete(..., params=...) are used), and change
print(f"\nUploading snapshot to Qdrant...") to print("\nUploading snapshot to
Qdrant...") to remove the unused f-prefix; keep the existing params and behavior
of get_cached_snapshot() and resp.raise_for_status() unchanged.

77-93: ⚠️ Potential issue | 🟡 Minor

Add timeouts to delete operations.

The delete helper functions should also have timeouts to prevent hanging.

Proposed fix

 def delete_payload_index(collection_name: str, field_name: str):
     """Delete a payload index."""
     resp = requests.delete(
         f"{BASE_URL}/collections/{collection_name}/index/{field_name}",
         params={"wait": "true"},
+        timeout=120,
     )
     resp.raise_for_status()


 def delete_vector_field(collection_name: str, vector_name: str):
     """Delete a named vector field entirely."""
     resp = requests.delete(
         f"{BASE_URL}/collections/{collection_name}/vectors/{vector_name}",
         params={"wait": "true"},
+        timeout=120,
     )
     resp.raise_for_status()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/manual/test_memory_reporting.py` around lines 77 - 93, The two helper
functions delete_payload_index and delete_vector_field lack request timeouts and
can hang; update both to pass a timeout (e.g., timeout=10) to requests.delete
(same place where params is passed) so the HTTP call times out predictably, then
keep resp.raise_for_status() unchanged to surface errors.

🧹 Nitpick comments (2)

lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs (1)

144-146: Avoid duplicating heap-size logic in Document.

heap_size() currently re-implements the exact same formula as heap_bytes(). This can drift later; make one delegate to the other.

♻️ Proposed refactor

     pub fn heap_size(&self) -> usize {
-        self.0.capacity() * std::mem::size_of::<TokenId>()
+        self.heap_bytes()
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`
around lines 144 - 146, The heap_size() implementation duplicates the same
formula as heap_bytes(), so change heap_size() to delegate to the existing
heap_bytes() calculation instead of re-implementing it: replace the body of
heap_size() to simply return the result of heap_bytes() (ensuring types match,
i.e., usize) so only heap_bytes() contains the actual logic; update references
to TokenId only in heap_bytes() if needed and keep function names heap_size and
heap_bytes as the single source of truth.

lib/segment/src/payload_storage/memory_reporter.rs (1)

9-12: Avoid silent 0 fallback on RAM estimate failure.

At Line 11, swallowing errors with unwrap_or(0) can hide estimation failures and produce misleading reports. Consider logging the failure before fallback.

💡 Suggested adjustment

 impl MemoryReporter for PayloadStorageEnum {
     fn memory_usage(&self) -> ComponentMemoryUsage {
         match self {
             #[cfg(feature = "testing")]
             PayloadStorageEnum::InMemoryPayloadStorage(s) => {
-                // Purely in-memory, no files. Approximate RAM from storage size.
-                ComponentMemoryUsage::ram_only(s.get_storage_size_bytes().unwrap_or(0) as u64)
+                // Purely in-memory, no files. Approximate RAM from storage size.
+                let bytes = match s.get_storage_size_bytes() {
+                    Ok(v) => v as u64,
+                    Err(err) => {
+                        log::warn!("Failed to estimate in-memory payload storage size: {err}");
+                        0
+                    }
+                };
+                ComponentMemoryUsage::ram_only(bytes)
             }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/segment/src/payload_storage/memory_reporter.rs` around lines 9 - 12,
Replace the silent unwrap_or(0) in the
PayloadStorageEnum::InMemoryPayloadStorage branch: call
s.get_storage_size_bytes(), match its Result, and on Err log the error with
details (using the crate's logger/tracing) before falling back to 0; then pass
the numeric size into ComponentMemoryUsage::ram_only so failures are recorded
instead of silently producing 0.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/segment/src/vector_storage/quantized/quantized_vectors.rs`:
- Around line 2011-2028: The memory reporter for QuantizedVectors is hardcoding
FileStorageIntent::OnDisk; change it to use self.storage_impl.is_on_disk() to
choose intent (use FileStorageIntent::Cached when is_on_disk() is false,
FileStorageIntent::OnDisk when true) inside QuantizedVectors::memory_usage so
the call to ComponentMemoryUsage::from_files_and_ram and
ComponentMemoryUsage::from_files receives the correct intent; locate this logic
in the memory_usage method of the QuantizedVectors impl and replace the fixed
intent assignment with the conditional based on storage_impl.is_on_disk().

---

Duplicate comments:
In `@tests/manual/test_memory_reporting.py`:
- Around line 45-63: The three helpers get_allocated_bytes(),
get_memory_report(), and get_collection_info() call requests.get without a
timeout which can cause tests to hang; update each requests.get call to pass a
sensible timeout (e.g., timeout=10 or a shared constant like
REQUEST_TIMEOUT_SECONDS) so the test fails fast on unresponsive servers,
preserving resp.raise_for_status() and the existing JSON parsing logic in each
function.
- Around line 109-121: The requests calls lack timeouts and the print uses an
unnecessary f-string; update the requests.get and requests.delete calls to
include a timeout parameter (e.g., timeout=10) to avoid hanging (modify the call
sites in this snippet where requests.get(...) and requests.delete(...,
params=...) are used), and change print(f"\nUploading snapshot to Qdrant...") to
print("\nUploading snapshot to Qdrant...") to remove the unused f-prefix; keep
the existing params and behavior of get_cached_snapshot() and
resp.raise_for_status() unchanged.
- Around line 77-93: The two helper functions delete_payload_index and
delete_vector_field lack request timeouts and can hang; update both to pass a
timeout (e.g., timeout=10) to requests.delete (same place where params is
passed) so the HTTP call times out predictably, then keep
resp.raise_for_status() unchanged to surface errors.

---

Nitpick comments:
In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`:
- Around line 144-146: The heap_size() implementation duplicates the same
formula as heap_bytes(), so change heap_size() to delegate to the existing
heap_bytes() calculation instead of re-implementing it: replace the body of
heap_size() to simply return the result of heap_bytes() (ensuring types match,
i.e., usize) so only heap_bytes() contains the actual logic; update references
to TokenId only in heap_bytes() if needed and keep function names heap_size and
heap_bytes as the single source of truth.

In `@lib/segment/src/payload_storage/memory_reporter.rs`:
- Around line 9-12: Replace the silent unwrap_or(0) in the
PayloadStorageEnum::InMemoryPayloadStorage branch: call
s.get_storage_size_bytes(), match its Result, and on Err log the error with
details (using the crate's logger/tracing) before falling back to 0; then pass
the numeric size into ComponentMemoryUsage::ram_only so failures are recorded
instead of silently producing 0.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: acfc6c5a-e421-4ef2-87a0-38f1f688dac5

📥 Commits

Reviewing files that changed from the base of the PR and between e85cc94 and b0ab81b.

📒 Files selected for processing (79)

lib/api/build.rs
lib/api/src/grpc/proto/collections_internal_service.proto
lib/api/src/grpc/qdrant.rs
lib/collection/src/collection/mod.rs
lib/collection/src/common/memory_reporter.rs
lib/collection/src/common/mod.rs
lib/collection/src/shards/forward_proxy_shard.rs
lib/collection/src/shards/local_shard/mod.rs
lib/collection/src/shards/proxy_shard.rs
lib/collection/src/shards/queue_proxy_shard.rs
lib/collection/src/shards/remote_shard.rs
lib/collection/src/shards/replica_set/mod.rs
lib/common/common/src/mmap/advice.rs
lib/common/common/src/universal_io/mmap.rs
lib/posting_list/src/posting_list.rs
lib/quantization/src/encoded_storage.rs
lib/quantization/src/encoded_vectors.rs
lib/quantization/src/encoded_vectors_binary.rs
lib/quantization/src/encoded_vectors_pq.rs
lib/quantization/src/encoded_vectors_u8.rs
lib/segment/src/common/memory_usage.rs
lib/segment/src/common/mod.rs
lib/segment/src/id_tracker/compressed/compressed_point_mappings.rs
lib/segment/src/id_tracker/compressed/external_to_internal.rs
lib/segment/src/id_tracker/compressed/internal_to_external.rs
lib/segment/src/id_tracker/compressed/versions_store.rs
lib/segment/src/id_tracker/immutable_id_tracker.rs
lib/segment/src/id_tracker/in_memory_id_tracker.rs
lib/segment/src/id_tracker/memory_reporter.rs
lib/segment/src/id_tracker/mod.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs
lib/segment/src/id_tracker/point_mappings.rs
lib/segment/src/index/field_index/bool_index/mod.rs
lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
lib/segment/src/index/field_index/field_index_base.rs
lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs
lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/text_index.rs
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
lib/segment/src/index/field_index/geo_index/mod.rs
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
lib/segment/src/index/field_index/histogram.rs
lib/segment/src/index/field_index/immutable_point_to_values.rs
lib/segment/src/index/field_index/map_index/immutable_map_index.rs
lib/segment/src/index/field_index/map_index/mod.rs
lib/segment/src/index/field_index/map_index/mutable_map_index.rs
lib/segment/src/index/field_index/memory_reporter.rs
lib/segment/src/index/field_index/mod.rs
lib/segment/src/index/field_index/null_index/mutable_null_index.rs
lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
lib/segment/src/index/field_index/numeric_index/mod.rs
lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
lib/segment/src/index/memory_reporter.rs
lib/segment/src/index/mod.rs
lib/segment/src/payload_storage/memory_reporter.rs
lib/segment/src/payload_storage/mod.rs
lib/segment/src/segment/memory.rs
lib/segment/src/segment/mod.rs
lib/segment/src/types.rs
lib/segment/src/vector_storage/chunked_vectors.rs
lib/segment/src/vector_storage/common.rs
lib/segment/src/vector_storage/dense/dense_vector_storage.rs
lib/segment/src/vector_storage/memory_reporter.rs
lib/segment/src/vector_storage/mod.rs
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
lib/segment/src/vector_storage/volatile_chunked_vectors.rs
src/actix/api/collections_api.rs
src/tonic/api/collections_internal_api.rs
tests/manual/test_memory_reporting.py

✅ Files skipped from review due to trivial changes (15)

lib/api/build.rs
lib/collection/src/common/mod.rs
lib/segment/src/index/field_index/mod.rs
lib/segment/src/id_tracker/compressed/internal_to_external.rs
lib/segment/src/id_tracker/mod.rs
lib/segment/src/vector_storage/mod.rs
lib/posting_list/src/posting_list.rs
lib/segment/src/index/field_index/histogram.rs
lib/segment/src/payload_storage/mod.rs
lib/segment/src/index/field_index/numeric_index/mod.rs
lib/segment/src/index/field_index/immutable_point_to_values.rs
lib/segment/src/vector_storage/chunked_vectors.rs
lib/segment/src/index/mod.rs
lib/segment/src/common/memory_usage.rs
lib/common/common/src/universal_io/mmap.rs

🚧 Files skipped from review as they are similar to previous changes (38)

lib/quantization/src/encoded_vectors.rs
lib/collection/src/shards/forward_proxy_shard.rs
lib/collection/src/shards/queue_proxy_shard.rs
lib/quantization/src/encoded_vectors_u8.rs
lib/api/src/grpc/proto/collections_internal_service.proto
lib/segment/src/id_tracker/in_memory_id_tracker.rs
lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
lib/segment/src/index/field_index/bool_index/mod.rs
lib/segment/src/index/field_index/null_index/mutable_null_index.rs
lib/segment/src/id_tracker/point_mappings.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
lib/segment/src/id_tracker/compressed/external_to_internal.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs
lib/segment/src/segment/mod.rs
lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
lib/segment/src/vector_storage/dense/dense_vector_storage.rs
lib/collection/src/shards/local_shard/mod.rs
lib/collection/src/shards/remote_shard.rs
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
lib/segment/src/index/field_index/geo_index/mod.rs
lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
lib/common/common/src/mmap/advice.rs
lib/quantization/src/encoded_vectors_pq.rs
lib/segment/src/vector_storage/common.rs
lib/segment/src/types.rs
lib/segment/src/vector_storage/volatile_chunked_vectors.rs
lib/collection/src/shards/replica_set/mod.rs
lib/segment/src/id_tracker/memory_reporter.rs
src/tonic/api/collections_internal_api.rs
lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
lib/segment/src/index/field_index/map_index/immutable_map_index.rs
lib/segment/src/index/memory_reporter.rs
lib/segment/src/index/field_index/map_index/mod.rs
lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs

coderabbitai · 2026-04-14T09:58:34Z

+
+impl crate::common::memory_usage::MemoryReporter for QuantizedVectors {
+    fn memory_usage(&self) -> crate::common::memory_usage::ComponentMemoryUsage {
+        use crate::common::memory_usage::{ComponentMemoryUsage, FileStorageIntent};
+
+        let files = self.files();
+        let heap_bytes = self.storage_impl.heap_size_bytes() as u64;
+
+        // Either always_ram, then we only load on in ram and track heap_bytes
+        // Or full on_disk, and we don't preload anything
+        let intent = FileStorageIntent::OnDisk;
+
+        if heap_bytes > 0 {
+            ComponentMemoryUsage::from_files_and_ram(files, intent, heap_bytes)
+        } else {
+            ComponentMemoryUsage::from_files(files, intent)
+        }
+    }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for how FileStorageIntent is used elsewhere to understand the expected behavior rg -n -A5 "FileStorageIntent" --type rust | head -100

Repository: qdrant/qdrant

Length of output: 8355

🏁 Script executed:

# Find the QuantizedVectors struct definition and its storage_impl field rg -n "struct QuantizedVectors" lib/segment/src/vector_storage/quantized/ -A 15

Repository: qdrant/qdrant

Length of output: 3163

🏁 Script executed:

# Search for storage variants and their characteristics (RAM vs MMAP) rg -n "enum QuantizedStorageVariant|QuantizedStorageImpl" lib/segment/src/vector_storage/quantized/ -A 5

Repository: qdrant/qdrant

Length of output: 39

🏁 Script executed:

# Check if there's an is_on_disk method or similar in QuantizedVectors rg -n "is_on_disk|OnDisk|Cached" lib/segment/src/vector_storage/quantized/quantized_vectors.rs -B 2 -A 2

Repository: qdrant/qdrant

Length of output: 3735

🏁 Script executed:

# Check the RAM storage variants' is_on_disk implementation rg -n "is_on_disk" lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs -B 2 -A 5

Repository: qdrant/qdrant

Length of output: 209

🏁 Script executed:

# Look at the vector_storage/memory_reporter.rs pattern for quantized storage sed -n '1,120p' lib/segment/src/vector_storage/memory_reporter.rs | cat -n

Repository: qdrant/qdrant

Length of output: 5588

🏁 Script executed:

# Check if QuantizedVectors uses the from_files_with_on_disk helper like others do rg -n "from_files_with_on_disk" lib/segment/src/vector_storage/

Repository: qdrant/qdrant

Length of output: 1468

🏁 Script executed:

# Check the heap_size_bytes implementation to understand the full picture rg -n "heap_size_bytes" lib/segment/src/vector_storage/quantized/ -B 2 -A 3

Repository: qdrant/qdrant

Length of output: 9509

Use self.storage_impl.is_on_disk() to determine the FileStorageIntent instead of hardcoding OnDisk.

The implementation should follow the pattern established in vector_storage/memory_reporter.rs: when is_on_disk() is false, the intent should be Cached (data was populated on load); when true, it should be OnDisk (data relies on OS page cache). RAM variants currently return is_on_disk() = false, meaning they are incorrectly reported as OnDisk instead of Cached.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/vector_storage/quantized/quantized_vectors.rs` around lines 2011 - 2028, The memory reporter for QuantizedVectors is hardcoding FileStorageIntent::OnDisk; change it to use self.storage_impl.is_on_disk() to choose intent (use FileStorageIntent::Cached when is_on_disk() is false, FileStorageIntent::OnDisk when true) inside QuantizedVectors::memory_usage so the call to ComponentMemoryUsage::from_files_and_ram and ComponentMemoryUsage::from_files receives the correct intent; locate this logic in the memory_usage method of the QuantizedVectors impl and replace the fixed intent assignment with the conditional based on storage_impl.is_on_disk().

PR #8606 changed `DenseVectorStorageImpl::is_on_disk()` from hard-coded `true` to `!self.populated`. The method is load-bearing beyond reporting: `quantized_vectors.rs::is_ram()` reads it to decide whether to materialize the quantized blob as `QuantizedRamStorage` (heap) or `QuantizedMmapStorage` (disk) for every quantized variant (scalar, pq, binary, turbo). For user configs with `on_disk: false` + quantization (no `always_ram`), the flip caused the quantized storage variant to move from disk to RAM, which in turn triggered segment rebuilds and transient 4x storage spikes during crasher-CI runs, hitting ENOSPC within 12-16 min vs. a stable 60 min on the parent commit. Restore `is_on_disk() -> true` and expose `populated()` as a separate accessor. Memory reporter now reads `populated()` to pick `FileStorageIntent::Cached` vs. `OnDisk` -- the distinction the original PR wanted to surface -- without changing quantization placement. Bisect: 4/4 green on aa3af83, 4/4 red on 5a899b7 (#8606).

…flip Drives the appendable -> immutable SegmentBuilder path that crasher's optimizer takes in production, with `on_disk: false` dense vector and binary quantization (`always_ram: Some(false)`). Asserts the resulting quantized storage is `BinaryMmap` (on disk), not `BinaryRam` (heap). The variant picked depends on `DenseVectorStorageImpl::is_on_disk()`; PR #8606 flipped it to `!populated`, which makes this test fail with `BinaryRam`. The fix in the previous commit restores the semantic and this test passes. Verified: fails before fix (BinaryRam), passes after (BinaryMmap).

* Add Monitor Collection Memory Usage page (v1.18.0) New page in Monitoring & Telemetry documenting the Web UI Memory tab and the GET /collections/{name}/memory API endpoint added in qdrant/qdrant#8606 and qdrant/qdrant-web-ui#372. Co-Authored-By: Claude Sonnet 4.6 <[email protected]> * Update web-ui docs link * Review feedback --------- Co-authored-by: Claude Sonnet 4.6 <[email protected]>

* Add mincore-based memory stats to MmapFile Add `resident_bytes()`, `disk_bytes()`, and `probe_memory_stats()` methods to `MmapFile` for measuring page cache residency via `mincore(2)`. This is the foundation for per-collection memory usage reporting. Also extract `page_size()` as a public function in `mmap::advice`, replacing the internal `PAGE_SIZE_MASK` with a direct page size cache. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]> * [AI] introduce trait for reporting memory usage per component * [AI] memory reporter implementation for vector storage * [AI] implement MemoryReporter for QuantizedVectors * [AI] implement MemoryReporter for VectorIndexEnum * Implement MemoryReporter for IdTrackerEnum with RAM estimation Add ram_usage_bytes() to all ID tracker types and their data structures: - PointMappings, CompressedPointMappings, CompressedVersions, CompressedInternalToExternal, CompressedExternalToInternal - MutableIdTracker, ImmutableIdTracker, InMemoryIdTracker All ID trackers load their data into RAM (none use mmap for working data). Files are reported as OnDisk (persistence only), actual RAM footprint is reported via extra_ram_bytes. Uses struct destructuring to ensure new fields trigger compile errors if not accounted for. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]> * [AI] implement MemoryReporter for PayloadStorageEnum and adjust FileStorageIntent * [AI] implement MemoryReporter for PayloadStorageEnum and adjust FileStorageIntent * [AI] implement MemoryReporter for payload indexes: in-ram structures memory consumtion computation + caching * [AI] implement MemoryReporter for payload indexes: in-ram structures memory consumtion computation + caching * [AI] segment-level memory usage report * [AI] Block 3: Aggregation Layer and Data Model + internal api for remote shard * [AI] REST API handler * fmt * [AI] clippy fixes * [AI] macos fix + proxy segment fix * [AI] make text index estimation a bit more correct * fix is_on_disk reporting for dense_vector_storage * fix after rebase * [AI] deep account for quantized vectors RAM usage + unify chunk size + shring volatile storage after load * remove debug log * cache in test * make manual test easier to run * rollback chunk size diff, but keep it for test only * review fixes * Use exhaustive match * Use div_ceil on bits everywhere It does not seem to be strictly necessary because the number of bits should already be a multiple of the used container size bytes. Still it's good practice to be careful with this calculation. * Improve heap size bytes for encoded product quantization vectors * Include vector stats for binary quantized vectors * In volatile chunked vectors, include heap allocated vector * Include rest of heap allocated structures for mutable map index * In mutable geo index, the hash map is also heap allocated * Update tests/manual/test_memory_reporting.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]> Co-authored-by: timvisee <[email protected]> Co-authored-by: Tim Visée <[email protected]> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

generall force-pushed the deep-memory-reporting branch from 41cc321 to 3c68f50 Compare April 10, 2026 16:23

This was referenced Apr 13, 2026

clearing cache with pageout #8654

Merged

collection tab for viewing memory qdrant/qdrant-web-ui#372

Merged

generall requested a review from timvisee April 13, 2026 13:32

generall marked this pull request as ready for review April 13, 2026 13:33

coderabbitai Bot reviewed Apr 13, 2026

View reviewed changes

generall commented Apr 13, 2026

View reviewed changes

Comment thread lib/segment/src/vector_storage/quantized/quantized_vectors.rs Outdated

This comment was marked as resolved.

Sign in to view

generall and others added 22 commits April 14, 2026 11:05

[AI] introduce trait for reporting memory usage per component

425aa78

[AI] memory reporter implementation for vector storage

bbd69bf

[AI] implement MemoryReporter for QuantizedVectors

4a4c657

[AI] implement MemoryReporter for VectorIndexEnum

b970167

[AI] implement MemoryReporter for PayloadStorageEnum and adjust FileS…

4c94202

…torageIntent

[AI] implement MemoryReporter for PayloadStorageEnum and adjust FileS…

647a49d

…torageIntent

[AI] implement MemoryReporter for payload indexes: in-ram structures …

fbf3c70

…memory consumtion computation + caching

[AI] implement MemoryReporter for payload indexes: in-ram structures …

994b218

…memory consumtion computation + caching

[AI] segment-level memory usage report

1c5757d

[AI] Block 3: Aggregation Layer and Data Model + internal api for rem…

7e14451

…ote shard

[AI] REST API handler

1be5b05

fmt

141d597

[AI] clippy fixes

b80c789

[AI] macos fix + proxy segment fix

0dd09a9

[AI] make text index estimation a bit more correct

a75ba42

fix is_on_disk reporting for dense_vector_storage

83d7a31

fix after rebase

704f37a

[AI] deep account for quantized vectors RAM usage + unify chunk size …

5c45458

…+ shring volatile storage after load

remove debug log

470443c

cache in test

0ee3e5f

timvisee added 4 commits April 14, 2026 11:18

Include vector stats for binary quantized vectors

0a5a7ec

In volatile chunked vectors, include heap allocated vector

0101ada

Include rest of heap allocated structures for mutable map index

fa20649

In mutable geo index, the hash map is also heap allocated

5d3893e

timvisee force-pushed the deep-memory-reporting branch from e85cc94 to 5d3893e Compare April 14, 2026 09:20

qdrant deleted a comment from coderabbitai Bot Apr 14, 2026

timvisee approved these changes Apr 14, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 14, 2026

View reviewed changes

qdrant deleted a comment from coderabbitai Bot Apr 14, 2026

Update tests/manual/test_memory_reporting.py

b0ab81b

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai Bot reviewed Apr 14, 2026

View reviewed changes

generall merged commit 5a899b7 into dev Apr 14, 2026
15 checks passed

generall deleted the deep-memory-reporting branch April 14, 2026 10:37

This was referenced Apr 15, 2026

Remove AsyncRawScorer #8685

Merged

low ram strict mode #8715

Merged

Low Memory mode #8714

Merged

agourlay mentioned this pull request Apr 20, 2026

fix(segment): restore DenseVectorStorageImpl::is_on_disk semantics #8733

Closed

coderabbitai Bot mentioned this pull request Apr 22, 2026

Immutable storage bool index #8625

Merged

9 tasks

abdonpijpelink mentioned this pull request Apr 23, 2026

Add Monitor Collection Memory Usage page (v1.18.0) qdrant/landing_page#2300

Merged

This was referenced Apr 23, 2026

Immutable storage for mmap numeric index #8594

Merged

refactor: split mutable_id_tracker.rs into a module #8795

Merged

Refactor quantized scorers for (future) io_uring support #8791

Merged

coderabbitai Bot mentioned this pull request Apr 30, 2026

split read only vector store #8852

Merged

timvisee mentioned this pull request May 8, 2026

Bump version to 1.18.0 #8959

Merged

coderabbitai Bot mentioned this pull request May 12, 2026

refactor(map_index): split mod.rs into read_ops, lifecycle, tests #9015

Merged

2 tasks

Conversation

generall commented Apr 6, 2026 • edited by timvisee Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

timvisee left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 14, 2026

Walkthrough

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

generall commented Apr 6, 2026 •

edited by timvisee

Loading