deep memory reporting#8606
Conversation
41cc321 to
3c68f50
Compare
There was a problem hiding this comment.
Actionable comments posted: 13
🧹 Nitpick comments (8)
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs (1)
144-146: De-duplicateDocumentheap estimators to avoid drift.Line [144] currently repeats the same computation as Line [121]. Consider delegating one to the other so future changes stay in sync.
♻️ Minimal refactor
pub fn heap_size(&self) -> usize { - self.0.capacity() * std::mem::size_of::<TokenId>() + self.heap_bytes() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs` around lines 144 - 146, The heap_size calculation in pub fn heap_size(&self) -> usize duplicates the same capacity * size_of::<TokenId>() logic found in the Document heap estimator; refactor by delegating one estimator to the other to avoid drift—replace the duplicated arithmetic with a call to the existing Document heap estimator (e.g., call the other struct/impl's heap_size method) so both functions share a single implementation and future changes stay in sync; keep references to TokenId and the original heap_size method names when locating the code to update.lib/segment/src/vector_storage/volatile_chunked_vectors.rs (1)
54-65: Include outerchunksVec buffer inheap_size_bytes()for a more complete estimate.Current estimate counts inner chunk payload buffers but not the heap backing array for
self.chunks.♻️ Suggested refinement
pub fn heap_size_bytes(&self) -> usize { @@ - chunks + let outer_vec_bytes = chunks.capacity() * mem::size_of::<Vec<T>>(); + outer_vec_bytes + + chunks .iter() .map(|chunk| chunk.capacity() * mem::size_of::<T>()) .sum() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/vector_storage/volatile_chunked_vectors.rs` around lines 54 - 65, heap_size_bytes currently sums only inner chunk payloads; include the heap backing buffer for the outer Vec by adding chunks.capacity() * mem::size_of::<Vec<T>>() to the total. In the heap_size_bytes method (function name heap_size_bytes) reference the struct field chunks and compute the outer buffer size via chunks.capacity() * mem::size_of::<Vec<T>>() and add it to the existing sum of inner chunk.capacity() * mem::size_of::<T>() so the returned value accounts for both the outer Vec's heap array and each inner chunk's payload.lib/quantization/src/encoded_vectors_pq.rs (1)
643-653: Consider including metadata container allocations in this estimate.Right now this counts inner centroid buffers but not the outer
Vec<Vec<f32>>allocation (andvector_divisionallocation), which systematically underestimates heap usage.♻️ Suggested refinement
fn heap_size_bytes(&self) -> usize { let storage_heap = self.encoded_vectors.heap_size_bytes(); - // PQ centroids: Vec<Vec<f32>> — each inner Vec holds dim floats - let centroids_heap: usize = self + let centroids_inner_heap: usize = self .metadata .centroids .iter() .map(|c| c.capacity() * std::mem::size_of::<f32>()) .sum(); - storage_heap + centroids_heap + let centroids_outer_heap = + self.metadata.centroids.capacity() * std::mem::size_of::<Vec<f32>>(); + let vector_division_heap = + self.metadata.vector_division.capacity() * std::mem::size_of::<std::ops::Range<usize>>(); + + storage_heap + centroids_inner_heap + centroids_outer_heap + vector_division_heap }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/quantization/src/encoded_vectors_pq.rs` around lines 643 - 653, The heap_size_bytes implementation only sums inner centroid buffers and storage but misses heap allocations for the outer Vec<Vec<f32>> and any vector_division container in metadata; update heap_size_bytes to also add metadata.centroids.capacity() * std::mem::size_of::<Vec<f32>>() to account for the outer Vec's heap, and add the heap cost of metadata.vector_division (or the actual field name) as metadata.vector_division.capacity() * std::mem::size_of::<T>() using the correct element type (e.g., usize or u32), so the function (in encoded_vectors_pq::heap_size_bytes) returns storage_heap + centroids_heap + outer_centroids_heap + vector_division_heap.lib/collection/src/shards/local_shard/mod.rs (1)
1096-1105: Drop the segment-holder lock before running the probes.This keeps
segments.read()held across every per-segmentmemory_report()call, and those probes can be slow. Please clone theLockedSegments first, release the holder lock, then walk the cloned list so this endpoint does not block shard mutations for its full runtime.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/collection/src/shards/local_shard/mod.rs` around lines 1096 - 1105, The current code holds the segments read lock across every per-segment probe (segments.read() then iterating and calling collect_memory_reports/locked_segment.memory_report()), causing long blocking of shard mutations; fix by cloning the LockedSegment references (e.g., collect locked_segment.clone() or clone Arc/handles from the iterator into a Vec<LockedSegment>) immediately after calling segments.read(), drop the read guard, and then iterate the cloned Vec to call collect_memory_reports / memory_report, so the holder lock is released before running the slow probes.tests/manual/test_memory_reporting.py (1)
148-151: Remove unnecessary f-prefix from string without placeholders.✏️ Proposed fix
- print(f"\nCollection has:") + print("\nCollection has:")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/manual/test_memory_reporting.py` around lines 148 - 151, The three print statements in tests/manual/test_memory_reporting.py use f-strings without placeholders (print(f"\nCollection has:"), print(f" Indexed payload fields: {indexed_fields}"), print(f" Dense vector fields: {dense_vector_names}"), print(f" Sparse vector fields: {sparse_vector_names}")). Remove the unnecessary f-prefix from the literal-only strings (e.g., change print(f"\nCollection has:") to print("\nCollection has:")); keep the f-strings for lines that actually interpolate variables (or convert all three to non-f-strings and use concatenation/formatting consistently if desired). Ensure the changes target the print calls referencing indexed_fields, dense_vector_names, and sparse_vector_names.lib/collection/src/common/memory_reporter.rs (3)
102-102: Optional: Consider usingHashMapfor O(1) lookups in merge.The current implementation uses linear search (
find) for each vector name during merge. With many shards or named vectors, this is O(n²). AHashMap<String, usize>mapping names to indices would provide O(1) lookups.Given that named vectors per collection are typically few and this is a reporting endpoint (not a hot path), this is low priority.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/collection/src/common/memory_reporter.rs` at line 102, The merge loop uses result.vectors.iter_mut().find(|v| v.name == *name) causing O(n²) behavior; change Merge logic in memory_reporter (the function where result.vectors is built) to first build a HashMap<String, usize> mapping vector name to its index, then use that map for O(1) lookups and update entries by index in result.vectors instead of calling find each time; ensure the map is kept in sync when inserting new named vectors into result.vectors.
221-229: Consider using a match forFileStorageIntentand logging metadata failures.Two observations:
- The
ifequality check won't surface newFileStorageIntentvariants at compile time—amatchwould.- Silently ignoring files that fail
metadata()differs from the Unix version which propagates errors. Consider logging a warning for debuggability.Suggested change using match
for entry in files { let ComponentFileEntry { path, intent } = &entry; - if let Ok(meta) = std::fs::metadata(path) { - disk_bytes += meta.len(); - if *intent == FileStorageIntent::Cached { - expected_cache_bytes += meta.len(); + match std::fs::metadata(path) { + Ok(meta) => { + disk_bytes += meta.len(); + match intent { + FileStorageIntent::Cached => { + expected_cache_bytes += meta.len(); + } + FileStorageIntent::Pinned | FileStorageIntent::OnDisk => {} + } } + Err(err) => { + log::debug!("Failed to get metadata for {}: {err}", path.display()); + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/collection/src/common/memory_reporter.rs` around lines 221 - 229, Replace the `if *intent == FileStorageIntent::Cached` check with a `match` on `intent` so new `FileStorageIntent` variants are handled at compile time (locate the loop iterating `files` and pattern `let ComponentFileEntry { path, intent }`). Also change the `if let Ok(meta) = std::fs::metadata(path)` to explicitly handle the `Err` case by logging a warning (include the `path` and error) instead of silently ignoring failures; continue to update `disk_bytes` and `expected_cache_bytes` based on `meta.len()` inside the successful branch.
105-109: Prefer explicit match arms over catch-all_.Per coding guidelines, explicit exhaustive matches are preferred to ensure future changes don't silently fall through.
Suggested change
match (&mut existing.quantized, quantized) { (Some(a), Some(b)) => a.merge(b), (None, Some(b)) => existing.quantized = Some(b.clone()), - _ => {} + (Some(_), None) | (None, None) => {} }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/collection/src/common/memory_reporter.rs` around lines 105 - 109, The match on (&mut existing.quantized, quantized) uses a catch-all `_ => {}`; replace it with explicit arms for the remaining cases to satisfy exhaustive matching: add (Some(_), None) => { /* no-op or comment */ } and (None, None) => { /* no-op or comment */ } (or return/unreachable if appropriate) so the match covers (Some(a), Some(b)), (None, Some(b)), (Some(_), None) and (None, None) for existing.quantized and quantized in memory_reporter.rs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/collection/src/collection/mod.rs`:
- Around line 605-615: The current loop swallows shard errors and returns
CollectionMemoryReport::merge_all(reports) even if some shards failed; change
the logic in the async loop that awaits futures::StreamExt::next(&mut futures)
so that on the first Err(err) you propagate that error (return Err(err.into() /
appropriate error type) or convert it to the function's error type) instead of
just logging, ensuring callers don't receive silently partial totals;
alternatively, if partial results are desired, change the return type to include
a partial flag or Result<..., PartialCollectionMemoryReport> and set that flag
when any Err is observed—update the code paths that call
CollectionMemoryReport::merge_all and the function signature accordingly.
In `@lib/collection/src/shards/replica_set/mod.rs`:
- Around line 1461-1467: The current aggregation filters out remotes using
self.peer_is_updatable which under-reports memory because read-only replicas
still consume RAM; change the mapping so that all entries in remotes (from
remotes.read().await) are included when collecting futures and calling
remote.memory_report(), i.e., remove the .filter(|remote|
self.peer_is_updatable(remote.peer_id)) so try_join_all collects memory_report()
from every remote.
- Around line 1445-1450: The match on local.as_ref() currently uses a catch-all
`_` that returns CollectionMemoryReport::default(), which zeroes out
proxied-but-still-local shards during transfers; change the match to be
exhaustive by adding explicit arms for Shard::Proxy, Shard::ForwardProxy, and
Shard::QueueProxy and have each of those delegate to the underlying local
shard's memory_report (similar to the Shard::Local arm) instead of returning
default; keep the Shard::Local(local_shard) => local_shard.memory_report().await
arm and remove the `_` fallback so new enum variants must be handled explicitly.
In `@lib/segment/src/id_tracker/compressed/external_to_internal.rs`:
- Around line 238-241: The capacity computation underestimates bytes for BitVecs
because num_ids_removed.capacity() and uuids_removed.capacity() are in bits and
are currently divided with truncating integer division; replace those divisions
with a ceiling division (use div_ceil) so bits->bytes rounding matches other
code: change occurrences of num_ids_removed.capacity() / u8::BITS as usize and
uuids_removed.capacity() / u8::BITS as usize to
div_ceil(num_ids_removed.capacity(), u8::BITS as usize) and
div_ceil(uuids_removed.capacity(), u8::BITS as usize) respectively (and add the
appropriate import/trait for div_ceil if not already in scope).
In `@lib/segment/src/id_tracker/compressed/internal_to_external.rs`:
- Around line 114-117: The ram_usage_bytes function miscalculates bytes for
is_uuid because is_uuid.capacity() returns bits and dividing by 8 truncates;
update ram_usage_bytes (in internal_to_external.rs) to convert bit capacity to
bytes using ceiling division (e.g. (is_uuid.capacity() + 7) / 8 or a div_ceil(8)
helper) and ensure proper casting for u8::BITS so the final return is
data.capacity() * size_of::<u128>() + (is_uuid.capacity() + 7) / 8 (all usize).
In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs`:
- Around line 126-137: The cached_ram_usage_bytes is only computed at
initialization (assigned from compute_ram_usage_bytes) but never updated when
runtime mutations occur (e.g., remove_point increases
deleted_value_to_points_container and other structures), causing stale memory
reporting; update cached_ram_usage_bytes after any mutating operations that
change in-memory size (at minimum inside remove_point and any methods that
modify deleted_value_to_points_container, value_to_points_container,
point_to_values, or indexed_points) by invoking compute_ram_usage_bytes() and
assigning the result to cached_ram_usage_bytes so the reported RAM usage stays
accurate.
- Line 416: The calculation for deleted_bytes underestimates when capacity isn't
a multiple of 8; change the floor division used on
deleted_value_to_points_container.capacity() by u8::BITS to a ceiling division
so partial bytes are counted (e.g., compute deleted_bytes as
(deleted_value_to_points_container.capacity() + (u8::BITS as usize - 1)) /
u8::BITS as usize). Update the assignment that sets deleted_bytes to use this
ceil formula so all remaining bits map to a byte.
In `@lib/segment/src/index/field_index/map_index/mutable_map_index.rs`:
- Around line 323-344: The ram_usage_bytes() implementation undercounts heap
allocations for heap-backed <N as MapIndexKey>::Owned keys/values; update
ram_usage_bytes() (inside the ram_usage_bytes function which destructures map
and point_to_values) to iterate the map entries and add the dynamic heap size of
each Owned key (and any heap memory held by the corresponding bitmap/value if
applicable) in addition to the current sizeof-based estimates and
bitmap.serialized_size(), and likewise iterate point_to_values' Vecs to add the
heap allocation for each Owned element (not just Vec capacity * element size).
Use or add a helper/trait method (e.g., owned.heap_size_bytes() or similar) on
<N as MapIndexKey>::Owned to compute per-instance heap usage and call that for
every key/value when computing map_bytes and ptv_bytes so heap allocations are
included in the final sum.
In `@lib/segment/src/vector_storage/chunked_vectors.rs`:
- Around line 401-410: heap_size_bytes currently returns 0; replace that stub
with a real aggregation of heap usage by summing the heap sizes of the struct
fields: call heap_size_bytes() (or appropriate size helpers) on config, status,
and directory, and iterate over self.chunks (e.g., self.chunks.values()) to sum
each chunk's heap usage (or use size_of_val for simple container types); return
the total sum so heap_size_bytes reflects real RAM usage.
In `@lib/segment/src/vector_storage/dense/dense_vector_storage.rs`:
- Around line 229-230: The is_on_disk() method relies on the self.populated flag
which can become stale because update_from() reopens vectors with populate =
false without updating self.populated; update update_from() (the routine that
reopens/refreshes vectors) to explicitly reconcile and set self.populated to the
correct value based on the populate argument or the actual reopened storage
state (e.g., mark populated = true when vectors are populated, false when opened
unpopulated, or query the underlying storage status), so that is_on_disk()
always reflects the true storage intent.
In `@tests/manual/test_memory_reporting.py`:
- Around line 45-63: The three helper functions get_allocated_bytes,
get_memory_report, and get_collection_info call requests.get without timeouts
which can hang tests; modify each to pass a reasonable timeout (e.g., timeout=5
or a shared TIMEOUT constant) into requests.get and propagate that to all calls
(including params usage) so requests.raise_for_status still runs on timeout
errors; ensure you define or reuse a TIMEOUT symbol near BASE_URL if needed and
update all three functions to include the timeout argument.
- Around line 77-93: The DELETE helper functions delete_payload_index and
delete_vector_field currently call requests.delete without timeouts which can
hang tests; update both functions to pass a reasonable timeout (e.g.,
timeout=10) into the requests.delete call and keep existing params and
resp.raise_for_status() behavior so the request fails fast if the server doesn't
respond.
- Around line 109-115: The initial health-check and cleanup HTTP calls lack
timeouts and the status print uses an unnecessary f-string; update the
requests.get call that assigns resp (resp = requests.get(...)) and the
requests.delete call (requests.delete(...)) to include a short timeout (e.g.,
timeout=5) to avoid hanging tests, and remove the redundant f-prefix from the
print statement ("Qdrant is running.") so it is a plain string; locate these in
tests/manual/test_memory_reporting.py around the lines setting resp and calling
requests.delete.
---
Nitpick comments:
In `@lib/collection/src/common/memory_reporter.rs`:
- Line 102: The merge loop uses result.vectors.iter_mut().find(|v| v.name ==
*name) causing O(n²) behavior; change Merge logic in memory_reporter (the
function where result.vectors is built) to first build a HashMap<String, usize>
mapping vector name to its index, then use that map for O(1) lookups and update
entries by index in result.vectors instead of calling find each time; ensure the
map is kept in sync when inserting new named vectors into result.vectors.
- Around line 221-229: Replace the `if *intent == FileStorageIntent::Cached`
check with a `match` on `intent` so new `FileStorageIntent` variants are handled
at compile time (locate the loop iterating `files` and pattern `let
ComponentFileEntry { path, intent }`). Also change the `if let Ok(meta) =
std::fs::metadata(path)` to explicitly handle the `Err` case by logging a
warning (include the `path` and error) instead of silently ignoring failures;
continue to update `disk_bytes` and `expected_cache_bytes` based on `meta.len()`
inside the successful branch.
- Around line 105-109: The match on (&mut existing.quantized, quantized) uses a
catch-all `_ => {}`; replace it with explicit arms for the remaining cases to
satisfy exhaustive matching: add (Some(_), None) => { /* no-op or comment */ }
and (None, None) => { /* no-op or comment */ } (or return/unreachable if
appropriate) so the match covers (Some(a), Some(b)), (None, Some(b)), (Some(_),
None) and (None, None) for existing.quantized and quantized in
memory_reporter.rs.
In `@lib/collection/src/shards/local_shard/mod.rs`:
- Around line 1096-1105: The current code holds the segments read lock across
every per-segment probe (segments.read() then iterating and calling
collect_memory_reports/locked_segment.memory_report()), causing long blocking of
shard mutations; fix by cloning the LockedSegment references (e.g., collect
locked_segment.clone() or clone Arc/handles from the iterator into a
Vec<LockedSegment>) immediately after calling segments.read(), drop the read
guard, and then iterate the cloned Vec to call collect_memory_reports /
memory_report, so the holder lock is released before running the slow probes.
In `@lib/quantization/src/encoded_vectors_pq.rs`:
- Around line 643-653: The heap_size_bytes implementation only sums inner
centroid buffers and storage but misses heap allocations for the outer
Vec<Vec<f32>> and any vector_division container in metadata; update
heap_size_bytes to also add metadata.centroids.capacity() *
std::mem::size_of::<Vec<f32>>() to account for the outer Vec's heap, and add the
heap cost of metadata.vector_division (or the actual field name) as
metadata.vector_division.capacity() * std::mem::size_of::<T>() using the correct
element type (e.g., usize or u32), so the function (in
encoded_vectors_pq::heap_size_bytes) returns storage_heap + centroids_heap +
outer_centroids_heap + vector_division_heap.
In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`:
- Around line 144-146: The heap_size calculation in pub fn heap_size(&self) ->
usize duplicates the same capacity * size_of::<TokenId>() logic found in the
Document heap estimator; refactor by delegating one estimator to the other to
avoid drift—replace the duplicated arithmetic with a call to the existing
Document heap estimator (e.g., call the other struct/impl's heap_size method) so
both functions share a single implementation and future changes stay in sync;
keep references to TokenId and the original heap_size method names when locating
the code to update.
In `@lib/segment/src/vector_storage/volatile_chunked_vectors.rs`:
- Around line 54-65: heap_size_bytes currently sums only inner chunk payloads;
include the heap backing buffer for the outer Vec by adding chunks.capacity() *
mem::size_of::<Vec<T>>() to the total. In the heap_size_bytes method (function
name heap_size_bytes) reference the struct field chunks and compute the outer
buffer size via chunks.capacity() * mem::size_of::<Vec<T>>() and add it to the
existing sum of inner chunk.capacity() * mem::size_of::<T>() so the returned
value accounts for both the outer Vec's heap array and each inner chunk's
payload.
In `@tests/manual/test_memory_reporting.py`:
- Around line 148-151: The three print statements in
tests/manual/test_memory_reporting.py use f-strings without placeholders
(print(f"\nCollection has:"), print(f" Indexed payload fields:
{indexed_fields}"), print(f" Dense vector fields: {dense_vector_names}"),
print(f" Sparse vector fields: {sparse_vector_names}")). Remove the unnecessary
f-prefix from the literal-only strings (e.g., change print(f"\nCollection has:")
to print("\nCollection has:")); keep the f-strings for lines that actually
interpolate variables (or convert all three to non-f-strings and use
concatenation/formatting consistently if desired). Ensure the changes target the
print calls referencing indexed_fields, dense_vector_names, and
sparse_vector_names.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 95259bf6-c4ce-4928-b8bd-20686e0f30a0
📒 Files selected for processing (76)
lib/api/build.rslib/api/src/grpc/proto/collections_internal_service.protolib/api/src/grpc/qdrant.rslib/collection/src/collection/mod.rslib/collection/src/common/memory_reporter.rslib/collection/src/common/mod.rslib/collection/src/shards/local_shard/mod.rslib/collection/src/shards/remote_shard.rslib/collection/src/shards/replica_set/mod.rslib/common/common/src/mmap/advice.rslib/common/common/src/universal_io/mmap.rslib/posting_list/src/posting_list.rslib/quantization/src/encoded_storage.rslib/quantization/src/encoded_vectors.rslib/quantization/src/encoded_vectors_binary.rslib/quantization/src/encoded_vectors_pq.rslib/quantization/src/encoded_vectors_u8.rslib/segment/src/common/memory_usage.rslib/segment/src/common/mod.rslib/segment/src/id_tracker/compressed/compressed_point_mappings.rslib/segment/src/id_tracker/compressed/external_to_internal.rslib/segment/src/id_tracker/compressed/internal_to_external.rslib/segment/src/id_tracker/compressed/versions_store.rslib/segment/src/id_tracker/immutable_id_tracker.rslib/segment/src/id_tracker/in_memory_id_tracker.rslib/segment/src/id_tracker/memory_reporter.rslib/segment/src/id_tracker/mod.rslib/segment/src/id_tracker/mutable_id_tracker.rslib/segment/src/id_tracker/point_mappings.rslib/segment/src/index/field_index/bool_index/mod.rslib/segment/src/index/field_index/bool_index/mutable_bool_index.rslib/segment/src/index/field_index/field_index_base.rslib/segment/src/index/field_index/full_text_index/immutable_text_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rslib/segment/src/index/field_index/full_text_index/inverted_index/mod.rslib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rslib/segment/src/index/field_index/full_text_index/mmap_text_index.rslib/segment/src/index/field_index/full_text_index/mutable_text_index.rslib/segment/src/index/field_index/full_text_index/text_index.rslib/segment/src/index/field_index/geo_index/immutable_geo_index.rslib/segment/src/index/field_index/geo_index/mod.rslib/segment/src/index/field_index/geo_index/mutable_geo_index.rslib/segment/src/index/field_index/histogram.rslib/segment/src/index/field_index/immutable_point_to_values.rslib/segment/src/index/field_index/map_index/immutable_map_index.rslib/segment/src/index/field_index/map_index/mod.rslib/segment/src/index/field_index/map_index/mutable_map_index.rslib/segment/src/index/field_index/memory_reporter.rslib/segment/src/index/field_index/mod.rslib/segment/src/index/field_index/null_index/mutable_null_index.rslib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rslib/segment/src/index/field_index/numeric_index/mod.rslib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rslib/segment/src/index/memory_reporter.rslib/segment/src/index/mod.rslib/segment/src/payload_storage/memory_reporter.rslib/segment/src/payload_storage/mod.rslib/segment/src/segment/memory.rslib/segment/src/segment/mod.rslib/segment/src/types.rslib/segment/src/vector_storage/chunked_vectors.rslib/segment/src/vector_storage/common.rslib/segment/src/vector_storage/dense/dense_vector_storage.rslib/segment/src/vector_storage/memory_reporter.rslib/segment/src/vector_storage/mod.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_ram_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/volatile_chunked_vectors.rssrc/actix/api/collections_api.rssrc/tonic/api/collections_internal_api.rstests/manual/test_memory_reporting.py
Add `resident_bytes()`, `disk_bytes()`, and `probe_memory_stats()` methods to `MmapFile` for measuring page cache residency via `mincore(2)`. This is the foundation for per-collection memory usage reporting. Also extract `page_size()` as a public function in `mmap::advice`, replacing the internal `PAGE_SIZE_MASK` with a direct page size cache. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add ram_usage_bytes() to all ID tracker types and their data structures: - PointMappings, CompressedPointMappings, CompressedVersions, CompressedInternalToExternal, CompressedExternalToInternal - MutableIdTracker, ImmutableIdTracker, InMemoryIdTracker All ID trackers load their data into RAM (none use mmap for working data). Files are reported as OnDisk (persistence only), actual RAM footprint is reported via extra_ram_bytes. Uses struct destructuring to ensure new fields trigger compile errors if not accounted for. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…memory consumtion computation + caching
…memory consumtion computation + caching
…+ shring volatile storage after load
e85cc94 to
5d3893e
Compare
timvisee
left a comment
There was a problem hiding this comment.
I've made some minor improvements to memory usage calculation, primarily to also include heap allocated structures themselves.
tests/manual/test_memory_reporting.py still shows similar performance.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
tests/manual/test_memory_reporting.py (1)
121-121:⚠️ Potential issue | 🟡 MinorRemove unnecessary f-prefix.
This f-string has no placeholders. Use a regular string instead.
- print(f"\nUploading snapshot to Qdrant...") + print("\nUploading snapshot to Qdrant...")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/manual/test_memory_reporting.py` at line 121, The print call uses an unnecessary f-string with no placeholders (print(f"\nUploading snapshot to Qdrant...")), so replace it with a regular string literal by removing the leading 'f' (i.e., change the print call in tests/manual/test_memory_reporting.py that prints "\nUploading snapshot to Qdrant..." to use print("\nUploading snapshot to Qdrant...")).lib/segment/src/index/field_index/map_index/immutable_map_index.rs (1)
233-260:⚠️ Potential issue | 🟠 MajorRefresh cached RAM after
remove_pointmutations.
cached_ram_usage_bytesis initialized once (Line 136) but not updated after Line 233–260 mutates in-memory containers, so reported RAM can become stale.💡 Suggested fix
pub fn remove_point(&mut self, idx: PointOffsetType) -> OperationResult<()> { @@ } self.point_to_values.remove_point(idx); + self.cached_ram_usage_bytes = self.compute_ram_usage_bytes(); Ok(()) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs` around lines 233 - 260, The remove_point method mutates in-memory containers (value_to_points, value_to_points_container, deleted_value_to_points_container, point_to_values, values_count, indexed_points) but never updates cached_ram_usage_bytes, causing stale RAM reports; after performing the removals and before returning from remove_point, recompute and assign the fresh RAM usage to self.cached_ram_usage_bytes (by invoking the existing RAM-calculation helper if one exists, or by adding a short inline recalculation) so cached_ram_usage_bytes reflects the updated memory state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/segment/src/index/field_index/immutable_point_to_values.rs`:
- Around line 47-55: The ram_usage_bytes implementation currently only sums
capacities * sizeof for PointValueEntry<N> and N, which misses heap-allocated
memory for types like String; update ram_usage_bytes (and/or add an overload) to
accept or use a heap-size estimator (e.g., a trait method like heap_size_bytes
or a passed-in closure) and include per-element heap usage by iterating
values_container and adding estimator(&value) for each element in addition to
the capacity * sizeof calculations; ensure PointValueEntry<N> and
values_container are the referenced symbols and document that heap-owning index
types should call this new estimator-enabled version (or implement the HeapSize
trait) so total RAM is reported correctly.
In `@tests/manual/test_memory_reporting.py`:
- Line 148: Remove the unused f-prefix from the print call so it is a regular
string literal; locate the print statement printing "\nCollection has:"
(currently written as print(f"\nCollection has:")) and change it to
print("\nCollection has:") to avoid an unnecessary f-string.
---
Duplicate comments:
In `@lib/segment/src/index/field_index/map_index/immutable_map_index.rs`:
- Around line 233-260: The remove_point method mutates in-memory containers
(value_to_points, value_to_points_container, deleted_value_to_points_container,
point_to_values, values_count, indexed_points) but never updates
cached_ram_usage_bytes, causing stale RAM reports; after performing the removals
and before returning from remove_point, recompute and assign the fresh RAM usage
to self.cached_ram_usage_bytes (by invoking the existing RAM-calculation helper
if one exists, or by adding a short inline recalculation) so
cached_ram_usage_bytes reflects the updated memory state.
In `@tests/manual/test_memory_reporting.py`:
- Line 121: The print call uses an unnecessary f-string with no placeholders
(print(f"\nUploading snapshot to Qdrant...")), so replace it with a regular
string literal by removing the leading 'f' (i.e., change the print call in
tests/manual/test_memory_reporting.py that prints "\nUploading snapshot to
Qdrant..." to use print("\nUploading snapshot to Qdrant...")).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1b0067d1-3c38-46e4-9d3c-16c7052aaf8d
📒 Files selected for processing (79)
lib/api/build.rslib/api/src/grpc/proto/collections_internal_service.protolib/api/src/grpc/qdrant.rslib/collection/src/collection/mod.rslib/collection/src/common/memory_reporter.rslib/collection/src/common/mod.rslib/collection/src/shards/forward_proxy_shard.rslib/collection/src/shards/local_shard/mod.rslib/collection/src/shards/proxy_shard.rslib/collection/src/shards/queue_proxy_shard.rslib/collection/src/shards/remote_shard.rslib/collection/src/shards/replica_set/mod.rslib/common/common/src/mmap/advice.rslib/common/common/src/universal_io/mmap.rslib/posting_list/src/posting_list.rslib/quantization/src/encoded_storage.rslib/quantization/src/encoded_vectors.rslib/quantization/src/encoded_vectors_binary.rslib/quantization/src/encoded_vectors_pq.rslib/quantization/src/encoded_vectors_u8.rslib/segment/src/common/memory_usage.rslib/segment/src/common/mod.rslib/segment/src/id_tracker/compressed/compressed_point_mappings.rslib/segment/src/id_tracker/compressed/external_to_internal.rslib/segment/src/id_tracker/compressed/internal_to_external.rslib/segment/src/id_tracker/compressed/versions_store.rslib/segment/src/id_tracker/immutable_id_tracker.rslib/segment/src/id_tracker/in_memory_id_tracker.rslib/segment/src/id_tracker/memory_reporter.rslib/segment/src/id_tracker/mod.rslib/segment/src/id_tracker/mutable_id_tracker.rslib/segment/src/id_tracker/point_mappings.rslib/segment/src/index/field_index/bool_index/mod.rslib/segment/src/index/field_index/bool_index/mutable_bool_index.rslib/segment/src/index/field_index/field_index_base.rslib/segment/src/index/field_index/full_text_index/immutable_text_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rslib/segment/src/index/field_index/full_text_index/inverted_index/mod.rslib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rslib/segment/src/index/field_index/full_text_index/mmap_text_index.rslib/segment/src/index/field_index/full_text_index/mutable_text_index.rslib/segment/src/index/field_index/full_text_index/text_index.rslib/segment/src/index/field_index/geo_index/immutable_geo_index.rslib/segment/src/index/field_index/geo_index/mod.rslib/segment/src/index/field_index/geo_index/mutable_geo_index.rslib/segment/src/index/field_index/histogram.rslib/segment/src/index/field_index/immutable_point_to_values.rslib/segment/src/index/field_index/map_index/immutable_map_index.rslib/segment/src/index/field_index/map_index/mod.rslib/segment/src/index/field_index/map_index/mutable_map_index.rslib/segment/src/index/field_index/memory_reporter.rslib/segment/src/index/field_index/mod.rslib/segment/src/index/field_index/null_index/mutable_null_index.rslib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rslib/segment/src/index/field_index/numeric_index/mod.rslib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rslib/segment/src/index/memory_reporter.rslib/segment/src/index/mod.rslib/segment/src/payload_storage/memory_reporter.rslib/segment/src/payload_storage/mod.rslib/segment/src/segment/memory.rslib/segment/src/segment/mod.rslib/segment/src/types.rslib/segment/src/vector_storage/chunked_vectors.rslib/segment/src/vector_storage/common.rslib/segment/src/vector_storage/dense/dense_vector_storage.rslib/segment/src/vector_storage/memory_reporter.rslib/segment/src/vector_storage/mod.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_ram_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/volatile_chunked_vectors.rssrc/actix/api/collections_api.rssrc/tonic/api/collections_internal_api.rstests/manual/test_memory_reporting.py
✅ Files skipped from review due to trivial changes (10)
- lib/segment/src/index/mod.rs
- lib/segment/src/vector_storage/mod.rs
- lib/segment/src/common/mod.rs
- lib/segment/src/id_tracker/mod.rs
- lib/segment/src/payload_storage/mod.rs
- lib/collection/src/common/mod.rs
- lib/collection/src/shards/queue_proxy_shard.rs
- lib/segment/src/index/field_index/mod.rs
- lib/segment/src/common/memory_usage.rs
- lib/common/common/src/mmap/advice.rs
🚧 Files skipped from review as they are similar to previous changes (55)
- lib/api/build.rs
- lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
- lib/segment/src/segment/mod.rs
- lib/quantization/src/encoded_vectors.rs
- lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
- lib/segment/src/index/field_index/bool_index/mod.rs
- lib/segment/src/index/field_index/null_index/mutable_null_index.rs
- lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
- lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
- lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs
- lib/segment/src/index/field_index/full_text_index/text_index.rs
- lib/segment/src/index/field_index/numeric_index/mod.rs
- lib/segment/src/index/field_index/histogram.rs
- lib/segment/src/vector_storage/dense/dense_vector_storage.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
- lib/segment/src/id_tracker/in_memory_id_tracker.rs
- lib/segment/src/id_tracker/compressed/internal_to_external.rs
- lib/segment/src/id_tracker/compressed/versions_store.rs
- lib/posting_list/src/posting_list.rs
- lib/quantization/src/encoded_vectors_u8.rs
- lib/segment/src/index/field_index/memory_reporter.rs
- lib/segment/src/vector_storage/chunked_vectors.rs
- lib/quantization/src/encoded_storage.rs
- lib/segment/src/id_tracker/memory_reporter.rs
- lib/segment/src/payload_storage/memory_reporter.rs
- lib/segment/src/id_tracker/point_mappings.rs
- lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
- lib/quantization/src/encoded_vectors_binary.rs
- lib/segment/src/id_tracker/immutable_id_tracker.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs
- lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs
- lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
- src/actix/api/collections_api.rs
- lib/segment/src/index/field_index/map_index/mutable_map_index.rs
- lib/segment/src/id_tracker/compressed/external_to_internal.rs
- lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
- lib/collection/src/shards/proxy_shard.rs
- lib/segment/src/vector_storage/volatile_chunked_vectors.rs
- lib/collection/src/shards/local_shard/mod.rs
- lib/segment/src/vector_storage/quantized/quantized_vectors.rs
- lib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rs
- lib/segment/src/vector_storage/common.rs
- lib/api/src/grpc/proto/collections_internal_service.proto
- lib/segment/src/types.rs
- lib/collection/src/shards/replica_set/mod.rs
- lib/common/common/src/universal_io/mmap.rs
- lib/segment/src/index/field_index/map_index/mod.rs
- lib/segment/src/index/memory_reporter.rs
- lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
- lib/segment/src/index/field_index/geo_index/mod.rs
- lib/collection/src/collection/mod.rs
- lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
| /// Approximate RAM usage in bytes. | ||
| pub fn ram_usage_bytes(&self) -> usize { | ||
| let Self { | ||
| point_entries, | ||
| values_container, | ||
| } = self; | ||
| point_entries.capacity() * std::mem::size_of::<PointValueEntry<N>>() | ||
| + values_container.capacity() * std::mem::size_of::<N>() | ||
| } |
There was a problem hiding this comment.
ram_usage_bytes undercounts heap-backed value types.
Line 53 and Line 54 only account for inline struct size. If N owns heap memory (notably String), this can significantly underreport ram_bytes in the new endpoint.
💡 Suggested direction
+ /// Approximate RAM usage in bytes, including optional per-value heap usage.
+ pub fn ram_usage_bytes_with(&self, heap_bytes: impl Fn(&N) -> usize) -> usize {
+ let base = self.point_entries.capacity() * std::mem::size_of::<PointValueEntry<N>>()
+ + self.values_container.capacity() * std::mem::size_of::<N>();
+
+ let point_entries_heap = self
+ .point_entries
+ .iter()
+ .map(|entry| match entry {
+ PointValueEntry::Single(v) => heap_bytes(v),
+ PointValueEntry::Slice { .. } => 0,
+ })
+ .sum::<usize>();
+
+ let values_container_heap = self.values_container.iter().map(heap_bytes).sum::<usize>();
+
+ base + point_entries_heap + values_container_heap
+ }Then call this from heap-owning index types (e.g. map index) with the proper heap estimator.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/segment/src/index/field_index/immutable_point_to_values.rs` around lines
47 - 55, The ram_usage_bytes implementation currently only sums capacities *
sizeof for PointValueEntry<N> and N, which misses heap-allocated memory for
types like String; update ram_usage_bytes (and/or add an overload) to accept or
use a heap-size estimator (e.g., a trait method like heap_size_bytes or a
passed-in closure) and include per-element heap usage by iterating
values_container and adding estimator(&value) for each element in addition to
the capacity * sizeof calculations; ensure PointValueEntry<N> and
values_container are the referenced symbols and document that heap-owning index
types should call this new estimator-enabled version (or implement the HeapSize
trait) so total RAM is reported correctly.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
📝 WalkthroughWalkthroughThis pull request introduces comprehensive memory reporting capabilities to Qdrant. It adds a new Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
tests/manual/test_memory_reporting.py (3)
45-63:⚠️ Potential issue | 🟡 MinorAdd timeouts to prevent indefinite hangs.
The helper functions
get_allocated_bytes(),get_memory_report(), andget_collection_info()make HTTP requests without timeouts. If the server becomes unresponsive, the test will hang indefinitely.Proposed fix
def get_allocated_bytes() -> int: """Get jemalloc allocated_bytes from telemetry.""" - resp = requests.get(f"{BASE_URL}/telemetry", params={"details_level": 2}) + resp = requests.get(f"{BASE_URL}/telemetry", params={"details_level": 2}, timeout=30) resp.raise_for_status() return resp.json()["result"]["memory"]["allocated_bytes"] def get_memory_report(collection_name: str) -> dict: """Get the collection memory report.""" - resp = requests.get(f"{BASE_URL}/collections/{collection_name}/memory") + resp = requests.get(f"{BASE_URL}/collections/{collection_name}/memory", timeout=30) resp.raise_for_status() return resp.json()["result"] def get_collection_info(collection_name: str) -> dict: """Get collection info.""" - resp = requests.get(f"{BASE_URL}/collections/{collection_name}") + resp = requests.get(f"{BASE_URL}/collections/{collection_name}", timeout=30) resp.raise_for_status() return resp.json()["result"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/manual/test_memory_reporting.py` around lines 45 - 63, The three helpers get_allocated_bytes(), get_memory_report(), and get_collection_info() call requests.get without a timeout which can cause tests to hang; update each requests.get call to pass a sensible timeout (e.g., timeout=10 or a shared constant like REQUEST_TIMEOUT_SECONDS) so the test fails fast on unresponsive servers, preserving resp.raise_for_status() and the existing JSON parsing logic in each function.
109-121:⚠️ Potential issue | 🟡 MinorAdd timeouts and remove unnecessary f-prefix.
Lines 110 and 115 need timeouts, and line 121 has an f-string without placeholders.
Proposed fix
# Verify server is up - resp = requests.get(f"{BASE_URL}/collections") + resp = requests.get(f"{BASE_URL}/collections", timeout=10) resp.raise_for_status() print("Qdrant is running.") # Clean up any leftover collection from a previous run - requests.delete(f"{BASE_URL}/collections/{COLLECTION_NAME}", params={"wait": "true"}) + requests.delete(f"{BASE_URL}/collections/{COLLECTION_NAME}", params={"wait": "true"}, timeout=60) # Get snapshot from local cache (download once if needed) snapshot_path = get_cached_snapshot() # Upload snapshot to Qdrant - print(f"\nUploading snapshot to Qdrant...") + print("\nUploading snapshot to Qdrant...")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/manual/test_memory_reporting.py` around lines 109 - 121, The requests calls lack timeouts and the print uses an unnecessary f-string; update the requests.get and requests.delete calls to include a timeout parameter (e.g., timeout=10) to avoid hanging (modify the call sites in this snippet where requests.get(...) and requests.delete(..., params=...) are used), and change print(f"\nUploading snapshot to Qdrant...") to print("\nUploading snapshot to Qdrant...") to remove the unused f-prefix; keep the existing params and behavior of get_cached_snapshot() and resp.raise_for_status() unchanged.
77-93:⚠️ Potential issue | 🟡 MinorAdd timeouts to delete operations.
The delete helper functions should also have timeouts to prevent hanging.
Proposed fix
def delete_payload_index(collection_name: str, field_name: str): """Delete a payload index.""" resp = requests.delete( f"{BASE_URL}/collections/{collection_name}/index/{field_name}", params={"wait": "true"}, + timeout=120, ) resp.raise_for_status() def delete_vector_field(collection_name: str, vector_name: str): """Delete a named vector field entirely.""" resp = requests.delete( f"{BASE_URL}/collections/{collection_name}/vectors/{vector_name}", params={"wait": "true"}, + timeout=120, ) resp.raise_for_status()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/manual/test_memory_reporting.py` around lines 77 - 93, The two helper functions delete_payload_index and delete_vector_field lack request timeouts and can hang; update both to pass a timeout (e.g., timeout=10) to requests.delete (same place where params is passed) so the HTTP call times out predictably, then keep resp.raise_for_status() unchanged to surface errors.
🧹 Nitpick comments (2)
lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs (1)
144-146: Avoid duplicating heap-size logic inDocument.
heap_size()currently re-implements the exact same formula asheap_bytes(). This can drift later; make one delegate to the other.♻️ Proposed refactor
pub fn heap_size(&self) -> usize { - self.0.capacity() * std::mem::size_of::<TokenId>() + self.heap_bytes() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs` around lines 144 - 146, The heap_size() implementation duplicates the same formula as heap_bytes(), so change heap_size() to delegate to the existing heap_bytes() calculation instead of re-implementing it: replace the body of heap_size() to simply return the result of heap_bytes() (ensuring types match, i.e., usize) so only heap_bytes() contains the actual logic; update references to TokenId only in heap_bytes() if needed and keep function names heap_size and heap_bytes as the single source of truth.lib/segment/src/payload_storage/memory_reporter.rs (1)
9-12: Avoid silent0fallback on RAM estimate failure.At Line 11, swallowing errors with
unwrap_or(0)can hide estimation failures and produce misleading reports. Consider logging the failure before fallback.💡 Suggested adjustment
impl MemoryReporter for PayloadStorageEnum { fn memory_usage(&self) -> ComponentMemoryUsage { match self { #[cfg(feature = "testing")] PayloadStorageEnum::InMemoryPayloadStorage(s) => { - // Purely in-memory, no files. Approximate RAM from storage size. - ComponentMemoryUsage::ram_only(s.get_storage_size_bytes().unwrap_or(0) as u64) + // Purely in-memory, no files. Approximate RAM from storage size. + let bytes = match s.get_storage_size_bytes() { + Ok(v) => v as u64, + Err(err) => { + log::warn!("Failed to estimate in-memory payload storage size: {err}"); + 0 + } + }; + ComponentMemoryUsage::ram_only(bytes) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/segment/src/payload_storage/memory_reporter.rs` around lines 9 - 12, Replace the silent unwrap_or(0) in the PayloadStorageEnum::InMemoryPayloadStorage branch: call s.get_storage_size_bytes(), match its Result, and on Err log the error with details (using the crate's logger/tracing) before falling back to 0; then pass the numeric size into ComponentMemoryUsage::ram_only so failures are recorded instead of silently producing 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/segment/src/vector_storage/quantized/quantized_vectors.rs`:
- Around line 2011-2028: The memory reporter for QuantizedVectors is hardcoding
FileStorageIntent::OnDisk; change it to use self.storage_impl.is_on_disk() to
choose intent (use FileStorageIntent::Cached when is_on_disk() is false,
FileStorageIntent::OnDisk when true) inside QuantizedVectors::memory_usage so
the call to ComponentMemoryUsage::from_files_and_ram and
ComponentMemoryUsage::from_files receives the correct intent; locate this logic
in the memory_usage method of the QuantizedVectors impl and replace the fixed
intent assignment with the conditional based on storage_impl.is_on_disk().
---
Duplicate comments:
In `@tests/manual/test_memory_reporting.py`:
- Around line 45-63: The three helpers get_allocated_bytes(),
get_memory_report(), and get_collection_info() call requests.get without a
timeout which can cause tests to hang; update each requests.get call to pass a
sensible timeout (e.g., timeout=10 or a shared constant like
REQUEST_TIMEOUT_SECONDS) so the test fails fast on unresponsive servers,
preserving resp.raise_for_status() and the existing JSON parsing logic in each
function.
- Around line 109-121: The requests calls lack timeouts and the print uses an
unnecessary f-string; update the requests.get and requests.delete calls to
include a timeout parameter (e.g., timeout=10) to avoid hanging (modify the call
sites in this snippet where requests.get(...) and requests.delete(...,
params=...) are used), and change print(f"\nUploading snapshot to Qdrant...") to
print("\nUploading snapshot to Qdrant...") to remove the unused f-prefix; keep
the existing params and behavior of get_cached_snapshot() and
resp.raise_for_status() unchanged.
- Around line 77-93: The two helper functions delete_payload_index and
delete_vector_field lack request timeouts and can hang; update both to pass a
timeout (e.g., timeout=10) to requests.delete (same place where params is
passed) so the HTTP call times out predictably, then keep
resp.raise_for_status() unchanged to surface errors.
---
Nitpick comments:
In `@lib/segment/src/index/field_index/full_text_index/inverted_index/mod.rs`:
- Around line 144-146: The heap_size() implementation duplicates the same
formula as heap_bytes(), so change heap_size() to delegate to the existing
heap_bytes() calculation instead of re-implementing it: replace the body of
heap_size() to simply return the result of heap_bytes() (ensuring types match,
i.e., usize) so only heap_bytes() contains the actual logic; update references
to TokenId only in heap_bytes() if needed and keep function names heap_size and
heap_bytes as the single source of truth.
In `@lib/segment/src/payload_storage/memory_reporter.rs`:
- Around line 9-12: Replace the silent unwrap_or(0) in the
PayloadStorageEnum::InMemoryPayloadStorage branch: call
s.get_storage_size_bytes(), match its Result, and on Err log the error with
details (using the crate's logger/tracing) before falling back to 0; then pass
the numeric size into ComponentMemoryUsage::ram_only so failures are recorded
instead of silently producing 0.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: acfc6c5a-e421-4ef2-87a0-38f1f688dac5
📒 Files selected for processing (79)
lib/api/build.rslib/api/src/grpc/proto/collections_internal_service.protolib/api/src/grpc/qdrant.rslib/collection/src/collection/mod.rslib/collection/src/common/memory_reporter.rslib/collection/src/common/mod.rslib/collection/src/shards/forward_proxy_shard.rslib/collection/src/shards/local_shard/mod.rslib/collection/src/shards/proxy_shard.rslib/collection/src/shards/queue_proxy_shard.rslib/collection/src/shards/remote_shard.rslib/collection/src/shards/replica_set/mod.rslib/common/common/src/mmap/advice.rslib/common/common/src/universal_io/mmap.rslib/posting_list/src/posting_list.rslib/quantization/src/encoded_storage.rslib/quantization/src/encoded_vectors.rslib/quantization/src/encoded_vectors_binary.rslib/quantization/src/encoded_vectors_pq.rslib/quantization/src/encoded_vectors_u8.rslib/segment/src/common/memory_usage.rslib/segment/src/common/mod.rslib/segment/src/id_tracker/compressed/compressed_point_mappings.rslib/segment/src/id_tracker/compressed/external_to_internal.rslib/segment/src/id_tracker/compressed/internal_to_external.rslib/segment/src/id_tracker/compressed/versions_store.rslib/segment/src/id_tracker/immutable_id_tracker.rslib/segment/src/id_tracker/in_memory_id_tracker.rslib/segment/src/id_tracker/memory_reporter.rslib/segment/src/id_tracker/mod.rslib/segment/src/id_tracker/mutable_id_tracker.rslib/segment/src/id_tracker/point_mappings.rslib/segment/src/index/field_index/bool_index/mod.rslib/segment/src/index/field_index/bool_index/mutable_bool_index.rslib/segment/src/index/field_index/field_index_base.rslib/segment/src/index/field_index/full_text_index/immutable_text_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rslib/segment/src/index/field_index/full_text_index/inverted_index/mod.rslib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rslib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rslib/segment/src/index/field_index/full_text_index/mmap_text_index.rslib/segment/src/index/field_index/full_text_index/mutable_text_index.rslib/segment/src/index/field_index/full_text_index/text_index.rslib/segment/src/index/field_index/geo_index/immutable_geo_index.rslib/segment/src/index/field_index/geo_index/mod.rslib/segment/src/index/field_index/geo_index/mutable_geo_index.rslib/segment/src/index/field_index/histogram.rslib/segment/src/index/field_index/immutable_point_to_values.rslib/segment/src/index/field_index/map_index/immutable_map_index.rslib/segment/src/index/field_index/map_index/mod.rslib/segment/src/index/field_index/map_index/mutable_map_index.rslib/segment/src/index/field_index/memory_reporter.rslib/segment/src/index/field_index/mod.rslib/segment/src/index/field_index/null_index/mutable_null_index.rslib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rslib/segment/src/index/field_index/numeric_index/mod.rslib/segment/src/index/field_index/numeric_index/mutable_numeric_index.rslib/segment/src/index/memory_reporter.rslib/segment/src/index/mod.rslib/segment/src/payload_storage/memory_reporter.rslib/segment/src/payload_storage/mod.rslib/segment/src/segment/memory.rslib/segment/src/segment/mod.rslib/segment/src/types.rslib/segment/src/vector_storage/chunked_vectors.rslib/segment/src/vector_storage/common.rslib/segment/src/vector_storage/dense/dense_vector_storage.rslib/segment/src/vector_storage/memory_reporter.rslib/segment/src/vector_storage/mod.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_ram_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/volatile_chunked_vectors.rssrc/actix/api/collections_api.rssrc/tonic/api/collections_internal_api.rstests/manual/test_memory_reporting.py
✅ Files skipped from review due to trivial changes (15)
- lib/api/build.rs
- lib/collection/src/common/mod.rs
- lib/segment/src/index/field_index/mod.rs
- lib/segment/src/id_tracker/compressed/internal_to_external.rs
- lib/segment/src/id_tracker/mod.rs
- lib/segment/src/vector_storage/mod.rs
- lib/posting_list/src/posting_list.rs
- lib/segment/src/index/field_index/histogram.rs
- lib/segment/src/payload_storage/mod.rs
- lib/segment/src/index/field_index/numeric_index/mod.rs
- lib/segment/src/index/field_index/immutable_point_to_values.rs
- lib/segment/src/vector_storage/chunked_vectors.rs
- lib/segment/src/index/mod.rs
- lib/segment/src/common/memory_usage.rs
- lib/common/common/src/universal_io/mmap.rs
🚧 Files skipped from review as they are similar to previous changes (38)
- lib/quantization/src/encoded_vectors.rs
- lib/collection/src/shards/forward_proxy_shard.rs
- lib/collection/src/shards/queue_proxy_shard.rs
- lib/quantization/src/encoded_vectors_u8.rs
- lib/api/src/grpc/proto/collections_internal_service.proto
- lib/segment/src/id_tracker/in_memory_id_tracker.rs
- lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs
- lib/segment/src/index/field_index/bool_index/mod.rs
- lib/segment/src/index/field_index/null_index/mutable_null_index.rs
- lib/segment/src/id_tracker/point_mappings.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/immutable_postings_enum.rs
- lib/segment/src/id_tracker/compressed/external_to_internal.rs
- lib/segment/src/id_tracker/mutable_id_tracker.rs
- lib/segment/src/segment/mod.rs
- lib/segment/src/index/field_index/full_text_index/immutable_text_index.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/posting_list.rs
- lib/segment/src/vector_storage/dense/dense_vector_storage.rs
- lib/collection/src/shards/local_shard/mod.rs
- lib/collection/src/shards/remote_shard.rs
- lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
- lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs
- lib/segment/src/index/field_index/geo_index/mod.rs
- lib/segment/src/index/field_index/geo_index/immutable_geo_index.rs
- lib/segment/src/index/field_index/geo_index/mutable_geo_index.rs
- lib/segment/src/index/field_index/full_text_index/inverted_index/mutable_inverted_index.rs
- lib/common/common/src/mmap/advice.rs
- lib/quantization/src/encoded_vectors_pq.rs
- lib/segment/src/vector_storage/common.rs
- lib/segment/src/types.rs
- lib/segment/src/vector_storage/volatile_chunked_vectors.rs
- lib/collection/src/shards/replica_set/mod.rs
- lib/segment/src/id_tracker/memory_reporter.rs
- src/tonic/api/collections_internal_api.rs
- lib/segment/src/index/field_index/numeric_index/immutable_numeric_index.rs
- lib/segment/src/index/field_index/map_index/immutable_map_index.rs
- lib/segment/src/index/memory_reporter.rs
- lib/segment/src/index/field_index/map_index/mod.rs
- lib/segment/src/index/field_index/full_text_index/mutable_text_index.rs
|
|
||
| impl crate::common::memory_usage::MemoryReporter for QuantizedVectors { | ||
| fn memory_usage(&self) -> crate::common::memory_usage::ComponentMemoryUsage { | ||
| use crate::common::memory_usage::{ComponentMemoryUsage, FileStorageIntent}; | ||
|
|
||
| let files = self.files(); | ||
| let heap_bytes = self.storage_impl.heap_size_bytes() as u64; | ||
|
|
||
| // Either always_ram, then we only load on in ram and track heap_bytes | ||
| // Or full on_disk, and we don't preload anything | ||
| let intent = FileStorageIntent::OnDisk; | ||
|
|
||
| if heap_bytes > 0 { | ||
| ComponentMemoryUsage::from_files_and_ram(files, intent, heap_bytes) | ||
| } else { | ||
| ComponentMemoryUsage::from_files(files, intent) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for how FileStorageIntent is used elsewhere to understand the expected behavior
rg -n -A5 "FileStorageIntent" --type rust | head -100Repository: qdrant/qdrant
Length of output: 8355
🏁 Script executed:
# Find the QuantizedVectors struct definition and its storage_impl field
rg -n "struct QuantizedVectors" lib/segment/src/vector_storage/quantized/ -A 15Repository: qdrant/qdrant
Length of output: 3163
🏁 Script executed:
# Search for storage variants and their characteristics (RAM vs MMAP)
rg -n "enum QuantizedStorageVariant|QuantizedStorageImpl" lib/segment/src/vector_storage/quantized/ -A 5Repository: qdrant/qdrant
Length of output: 39
🏁 Script executed:
# Check if there's an is_on_disk method or similar in QuantizedVectors
rg -n "is_on_disk|OnDisk|Cached" lib/segment/src/vector_storage/quantized/quantized_vectors.rs -B 2 -A 2Repository: qdrant/qdrant
Length of output: 3735
🏁 Script executed:
# Check the RAM storage variants' is_on_disk implementation
rg -n "is_on_disk" lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs -B 2 -A 5Repository: qdrant/qdrant
Length of output: 209
🏁 Script executed:
# Look at the vector_storage/memory_reporter.rs pattern for quantized storage
sed -n '1,120p' lib/segment/src/vector_storage/memory_reporter.rs | cat -nRepository: qdrant/qdrant
Length of output: 5588
🏁 Script executed:
# Check if QuantizedVectors uses the from_files_with_on_disk helper like others do
rg -n "from_files_with_on_disk" lib/segment/src/vector_storage/Repository: qdrant/qdrant
Length of output: 1468
🏁 Script executed:
# Check the heap_size_bytes implementation to understand the full picture
rg -n "heap_size_bytes" lib/segment/src/vector_storage/quantized/ -B 2 -A 3Repository: qdrant/qdrant
Length of output: 9509
Use self.storage_impl.is_on_disk() to determine the FileStorageIntent instead of hardcoding OnDisk.
The implementation should follow the pattern established in vector_storage/memory_reporter.rs: when is_on_disk() is false, the intent should be Cached (data was populated on load); when true, it should be OnDisk (data relies on OS page cache). RAM variants currently return is_on_disk() = false, meaning they are incorrectly reported as OnDisk instead of Cached.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/segment/src/vector_storage/quantized/quantized_vectors.rs` around lines
2011 - 2028, The memory reporter for QuantizedVectors is hardcoding
FileStorageIntent::OnDisk; change it to use self.storage_impl.is_on_disk() to
choose intent (use FileStorageIntent::Cached when is_on_disk() is false,
FileStorageIntent::OnDisk when true) inside QuantizedVectors::memory_usage so
the call to ComponentMemoryUsage::from_files_and_ram and
ComponentMemoryUsage::from_files receives the correct intent; locate this logic
in the memory_usage method of the QuantizedVectors impl and replace the fixed
intent assignment with the conditional based on storage_impl.is_on_disk().
PR #8606 changed `DenseVectorStorageImpl::is_on_disk()` from hard-coded `true` to `!self.populated`. The method is load-bearing beyond reporting: `quantized_vectors.rs::is_ram()` reads it to decide whether to materialize the quantized blob as `QuantizedRamStorage` (heap) or `QuantizedMmapStorage` (disk) for every quantized variant (scalar, pq, binary, turbo). For user configs with `on_disk: false` + quantization (no `always_ram`), the flip caused the quantized storage variant to move from disk to RAM, which in turn triggered segment rebuilds and transient 4x storage spikes during crasher-CI runs, hitting ENOSPC within 12-16 min vs. a stable 60 min on the parent commit. Restore `is_on_disk() -> true` and expose `populated()` as a separate accessor. Memory reporter now reads `populated()` to pick `FileStorageIntent::Cached` vs. `OnDisk` -- the distinction the original PR wanted to surface -- without changing quantization placement. Bisect: 4/4 green on aa3af83, 4/4 red on 5a899b7 (#8606).
…flip Drives the appendable -> immutable SegmentBuilder path that crasher's optimizer takes in production, with `on_disk: false` dense vector and binary quantization (`always_ram: Some(false)`). Asserts the resulting quantized storage is `BinaryMmap` (on disk), not `BinaryRam` (heap). The variant picked depends on `DenseVectorStorageImpl::is_on_disk()`; PR #8606 flipped it to `!populated`, which makes this test fail with `BinaryRam`. The fix in the previous commit restores the semantic and this test passes. Verified: fails before fix (BinaryRam), passes after (BinaryMmap).
* Add Monitor Collection Memory Usage page (v1.18.0)
New page in Monitoring & Telemetry documenting the Web UI Memory tab and
the GET /collections/{name}/memory API endpoint added in qdrant/qdrant#8606
and qdrant/qdrant-web-ui#372.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
* Update web-ui docs link
* Review feedback
---------
Co-authored-by: Claude Sonnet 4.6 <[email protected]>
* Add mincore-based memory stats to MmapFile Add `resident_bytes()`, `disk_bytes()`, and `probe_memory_stats()` methods to `MmapFile` for measuring page cache residency via `mincore(2)`. This is the foundation for per-collection memory usage reporting. Also extract `page_size()` as a public function in `mmap::advice`, replacing the internal `PAGE_SIZE_MASK` with a direct page size cache. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]> * [AI] introduce trait for reporting memory usage per component * [AI] memory reporter implementation for vector storage * [AI] implement MemoryReporter for QuantizedVectors * [AI] implement MemoryReporter for VectorIndexEnum * Implement MemoryReporter for IdTrackerEnum with RAM estimation Add ram_usage_bytes() to all ID tracker types and their data structures: - PointMappings, CompressedPointMappings, CompressedVersions, CompressedInternalToExternal, CompressedExternalToInternal - MutableIdTracker, ImmutableIdTracker, InMemoryIdTracker All ID trackers load their data into RAM (none use mmap for working data). Files are reported as OnDisk (persistence only), actual RAM footprint is reported via extra_ram_bytes. Uses struct destructuring to ensure new fields trigger compile errors if not accounted for. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]> * [AI] implement MemoryReporter for PayloadStorageEnum and adjust FileStorageIntent * [AI] implement MemoryReporter for PayloadStorageEnum and adjust FileStorageIntent * [AI] implement MemoryReporter for payload indexes: in-ram structures memory consumtion computation + caching * [AI] implement MemoryReporter for payload indexes: in-ram structures memory consumtion computation + caching * [AI] segment-level memory usage report * [AI] Block 3: Aggregation Layer and Data Model + internal api for remote shard * [AI] REST API handler * fmt * [AI] clippy fixes * [AI] macos fix + proxy segment fix * [AI] make text index estimation a bit more correct * fix is_on_disk reporting for dense_vector_storage * fix after rebase * [AI] deep account for quantized vectors RAM usage + unify chunk size + shring volatile storage after load * remove debug log * cache in test * make manual test easier to run * rollback chunk size diff, but keep it for test only * review fixes * Use exhaustive match * Use div_ceil on bits everywhere It does not seem to be strictly necessary because the number of bits should already be a multiple of the used container size bytes. Still it's good practice to be careful with this calculation. * Improve heap size bytes for encoded product quantization vectors * Include vector stats for binary quantized vectors * In volatile chunked vectors, include heap allocated vector * Include rest of heap allocated structures for mutable map index * In mutable geo index, the hash map is also heap allocated * Update tests/manual/test_memory_reporting.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]> Co-authored-by: timvisee <[email protected]> Co-authored-by: Tim Visée <[email protected]> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This PR exposes per-collection per-component memory usage:
Each component reports following numbers about memory usage:
Disk cache is estimated using
mincore. It opens all files of the component as a mmap and performs syscall to evaluate cache size.It is planned for the future to re-use already opened mmaps from the pool, so we won't need to spend time opening and closing mmap if it is already opened.
ram_bytesis estimated with custom function for each component. It is the least accurate measurement in this API, as we can't take into account many factors:This PR provides a manual test, which estimates error by comparing jemalloc reading before and after deleting components. Current results are:
Overall, I would expect around 10-15% ram usage underestimation from this API.
This API also fetches memory usage from all nodes in the cluster, so we avoid blinking problem in distributed deployments with round-robin load balancer.