Add memory::madvise::will_need_multiple_pages() #7124
Conversation
📝 Walkthrough
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 0
🧹 Nitpick comments (5)
lib/common/memory/src/madvise.rs (4)
193-196: Off-by-one in single-page check (causes unnecessary `madvise` at exact page boundary)

When `offset_in_page + region.len() == page_size`, the region still fits within a single page, but the current condition triggers `madvise`. Harmless, but contrary to the doc promise, and it adds syscall overhead.

Apply this diff:

```diff
-    if length <= page_mask {
+    let page_size = page_mask + 1;
+    if length <= page_size {
         // Data fits within a single page, do nothing.
         return;
     }
```
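For illustration, the boundary condition reduces to plain arithmetic. The sketch below uses a hypothetical `fits_in_one_page` helper (the real code in `madvise.rs` operates on a byte slice, not raw addresses) and assumes the page size is a power of two:

```rust
/// Sketch: does a region of `len` bytes starting at `addr` stay within one page?
/// (Hypothetical helper, not the actual madvise.rs API.)
fn fits_in_one_page(addr: usize, len: usize, page_size: usize) -> bool {
    debug_assert!(page_size.is_power_of_two());
    let page_mask = page_size - 1;
    let offset_in_page = addr & page_mask;
    // The region fits iff its last byte is on the same page as its first,
    // i.e. offset_in_page + len <= page_size (note: `<=`, not `<`).
    len == 0 || offset_in_page + len <= page_size
}

fn main() {
    let page = 4096;
    // Exactly filling one page is still a single page...
    assert!(fits_in_one_page(0x1000, 4096, page));
    // ...but one more byte spills into the next page.
    assert!(!fits_in_one_page(0x1000, 4097, page));
    // A small region crossing a boundary spans two pages.
    assert!(!fits_in_one_page(0x1ffe, 4, page));
    println!("ok");
}
```

This is exactly the `<=` vs `<` distinction the diff above addresses: comparing against `page_mask` (`page_size - 1`) rejects the exact-fit case.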
176-183: Doc wording: avoid hardcoding 4KiB; use "system page size"

Some targets (e.g., AArch64) commonly use 64KiB pages. Suggest making the docs platform-neutral.

```diff
-/// Use-case: the `region` is inside `MADV_RANDOM` memory map, but it spans
-/// across more than one 4KiB page. If you read it in sequence, it will cause
-/// multiple page faults, thus multiple 4KiB I/O operations. Avoid this by
+/// Use-case: the `region` is inside a `MADV_RANDOM` memory map, but it spans
+/// across more than one OS page (page size is system-dependent). If you read
+/// it in sequence, it will cause multiple page faults, thus multiple
+/// page-sized I/O operations. Avoid this by
 /// calling this function before reading the region. It will prefetch the whole
 /// region in a single I/O operation. (if possible)
```
201-208: Consider tracing errors in release builds (debug panics are fine)

In debug you panic on `madvise` failure; in release, failures are silently ignored. A low-verbosity log (trace/debug) would help diagnose unexpected performance regressions without panicking.

```diff
 let res = unsafe { nix::libc::madvise(addr as *mut _, length, nix::libc::MADV_WILLNEED) };
 if res != 0 {
     #[cfg(debug_assertions)]
     {
         let err = io::Error::last_os_error();
         panic!("Failed to call madvise(MADV_WILLNEED): {err}");
     }
+    #[cfg(not(debug_assertions))]
+    {
+        let err = io::Error::last_os_error();
+        log::trace!("madvise(MADV_WILLNEED) failed (ignored): {err}");
+    }
 }
```
214-232: Minor: the page-size mask's name aligns with its value, but the doc can be clearer

`PAGE_SIZE_MASK` stores `page_size - 1`. The doc line says "Typically 0xfff for 4KiB pages", which is correct; consider also mentioning that it's `page_size - 1` to aid future readers.

```diff
-/// Page size mask. Typically 0xfff for 4KiB pages.
+/// Page size mask (`page_size - 1`). Typically `0xfff` for 4KiB pages.
```

lib/segment/src/index/hnsw_index/graph_links/view.rs (1)
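As a sketch of the mask's semantics (assuming a 4KiB page size for the example; the real code obtains it at runtime and pages are a power of two in practice):

```rust
fn main() {
    let page_size: usize = 4096; // in the real code this comes from the OS
    let page_mask = page_size - 1; // 0xfff for 4KiB pages

    let addr: usize = 0x7f3a_1234;
    // Offset of `addr` within its page:
    assert_eq!(addr & page_mask, 0x234);
    // Start of the page containing `addr`:
    assert_eq!(addr & !page_mask, 0x7f3a_1000);
    println!("ok");
}
```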
271-275: Optional: prefetch vectors only when links block is tiny

If profiles show many small neighbor blocks barely spanning a page, consider moving the hint to just before reading vectors (after computing `pos`), and pass `&neighbors[pos..end]`. This narrows readahead to the typically largest portion. Keep as-is if the broader prefetch proves better in practice.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- `Cargo.toml` (1 hunks)
- `lib/common/memory/src/madvise.rs` (1 hunks)
- `lib/segment/src/index/hnsw_index/graph_links/view.rs` (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
📚 Learning: 2025-08-06T09:56:59.311Z
Learnt from: xzfc
PR: qdrant/qdrant#6982
File: lib/segment/src/index/hnsw_index/graph_links/view.rs:217-220
Timestamp: 2025-08-06T09:56:59.311Z
Learning: In Qdrant's GraphLinksView implementation (lib/segment/src/index/hnsw_index/graph_links/view.rs), methods like links() and links_with_vectors() intentionally use unwrap() and can panic for performance reasons, maintaining consistency across similar methods. Error handling improvements are considered secondary to feature completion and require benchmarking before implementation.
Applied to files:
lib/segment/src/index/hnsw_index/graph_links/view.rs
🧬 Code graph analysis (1)
lib/segment/src/index/hnsw_index/graph_links/view.rs (1)
lib/common/memory/src/madvise.rs (6)
`madvise` (93-95), `madvise` (101-101), `madvise` (107-113), `madvise` (131-137), `will_need_multiple_pages` (184-209), `will_need_multiple_pages` (212-212)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: e2e-tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: lint
🔇 Additional comments (2)
Cargo.toml (1)
229-229: Fix nix feature gating: use "unistd" (not "feature") to enable sysconf()

`get_page_mask()` uses `nix::unistd::sysconf`, which requires the `unistd` feature. `"feature"` is not a valid `nix` feature and will cause a compile-time error.

Apply this diff:

```diff
-nix = { version = "0.29", features = ["fs", "feature"] }
+nix = { version = "0.29", features = ["fs", "unistd"] }
```

Likely an incorrect or invalid review comment.
lib/segment/src/index/hnsw_index/graph_links/view.rs (1)
271-275: LGTM: targeted readahead hint before decoding links + vectors

Calling `memory::madvise::will_need_multiple_pages(&neighbors[start..end])` here is well-placed: it prefetches exactly the slice that includes the varint length, compressed links, padding, and vectors, and becomes a no-op for single-page regions or non-Unix builds. Minimal overhead due to the early page-span check and `LazyLock` caching.
This PR adds `will_need_multiple_pages` with the following doc/signature, and enables it in the `GraphLinksView::links_with_vectors` iterator.

I've experimented with benchmarking various approaches using `pread`, `posix_fadvise`, `madvise`, and `readahead`. All of them show similar performance (compared to raw access to mmap), so I chose `madvise`, as it doesn't require file descriptors (and we don't store them for mmaps).

benchmark.c

I've benchmarked average search time with this change on my hnsw-with-vector branch (not published yet). Parameters: m=16, 2-bit binary quantization, LAION 400M dataset.

Perhaps later we can also use this function when accessing regular non-quantized vectors from the storage.
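Since `madvise(2)` requires a page-aligned start address, a `WILLNEED` hint for an arbitrary byte range must cover the page-aligned span enclosing it. The arithmetic can be sketched as below (a standalone illustration with a hypothetical `aligned_span` helper, not the actual code; the real function would then pass the resulting pointer and length to `libc::madvise` with `MADV_WILLNEED`):

```rust
/// Sketch: page-aligned (start, length) covering the range `addr..addr + len`.
/// Assumes `page_size` is a power of two.
fn aligned_span(addr: usize, len: usize, page_size: usize) -> (usize, usize) {
    let page_mask = page_size - 1;
    let start = addr & !page_mask; // round down to the start of the first page
    let end = (addr + len + page_mask) & !page_mask; // round up past the last page
    (start, end - start)
}

fn main() {
    let page = 4096;
    // A 100-byte region straddling a page boundary covers two full pages.
    let (start, length) = aligned_span(0x1fe0, 100, page);
    assert_eq!(start, 0x1000);
    assert_eq!(length, 2 * page);
    println!("ok");
}
```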