Don't encode quantization query while HNSW build #6729
Conversation

    fn quantized_vector_size(&self) -> usize;

    fn encode_internal_vector(&self, id: u32) -> Option<Self::EncodedQuery>;

Create a query from the internal vector. The query is always in RAM.
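The shape of such a trait method can be sketched roughly as follows. This is a hypothetical, simplified illustration: `EncodedVectors`, `ToyU8Storage`, `encode_query`, and the u8 scaling are illustrative stand-ins, not Qdrant's actual definitions.

```rust
// Hypothetical, simplified sketch of a quantized-storage trait with an
// internal-vector query encoder; not the actual Qdrant trait definition.
trait EncodedVectors {
    // Owned query representation, always kept in RAM.
    type EncodedQuery;

    // Encode an external (original) query vector.
    fn encode_query(&self, query: &[f32]) -> Self::EncodedQuery;

    // Build a query directly from an already-quantized stored vector.
    // Returns None when the storage cannot do this without accuracy loss.
    fn encode_internal_vector(&self, id: u32) -> Option<Self::EncodedQuery>;
}

// Toy u8 scalar quantizer to make the sketch concrete.
struct ToyU8Storage {
    dim: usize,
    data: Vec<u8>, // all encoded vectors, flattened
}

impl EncodedVectors for ToyU8Storage {
    type EncodedQuery = Vec<u8>;

    fn encode_query(&self, query: &[f32]) -> Vec<u8> {
        query.iter().map(|v| (v.clamp(0.0, 1.0) * 255.0) as u8).collect()
    }

    fn encode_internal_vector(&self, id: u32) -> Option<Vec<u8>> {
        let start = id as usize * self.dim;
        // Copy the stored bytes into an owned query so it stays in RAM
        // even if the storage itself were mmapped from disk.
        self.data.get(start..start + self.dim).map(|s| s.to_vec())
    }
}

fn main() {
    let storage = ToyU8Storage { dim: 2, data: vec![0, 255, 128, 64] };
    let q = storage.encode_internal_vector(1).unwrap();
    println!("{:?}", q); // the second stored vector, as an owned query
    assert!(storage.encode_internal_vector(2).is_none());
}
```

The `Option` return is the key design point: a storage that cannot reuse its quantized bytes simply returns `None` and the caller falls back to encoding the original vector.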
    fn encode_internal_vector(&self, _id: u32) -> Option<EncodedQueryPQ> {
        None
    }

We cannot create a query in PQ from the quantized vector without LUT accuracy loss.
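Why PQ opts out: the lookup table (LUT) is built from the raw query's distances to each centroid, and building it from an already-quantized vector bakes that vector's rounding error into every entry. A toy one-dimensional sketch, with hypothetical helper names and values:

```rust
// Toy one-subspace, one-dimensional PQ to illustrate the LUT accuracy loss.
// Hypothetical numbers and helpers; not Qdrant's PQ implementation.
fn build_lut(query: f32, centroids: &[f32]) -> Vec<f32> {
    // LUT entry i = squared distance from the query to centroid i.
    centroids.iter().map(|c| (query - c) * (query - c)).collect()
}

fn nearest_centroid(v: f32, centroids: &[f32]) -> f32 {
    *centroids
        .iter()
        .min_by(|a, b| (v - **a).abs().partial_cmp(&(v - **b).abs()).unwrap())
        .unwrap()
}

fn main() {
    let centroids = [0.0f32, 1.0];
    let query = 0.4f32;

    // LUT from the original vector: exact distances to each centroid.
    let exact = build_lut(query, &centroids);

    // LUT from the already-quantized vector: the query snaps to centroid 0.0
    // first, so every LUT entry inherits that rounding error.
    let reconstructed = nearest_centroid(query, &centroids);
    let lossy = build_lut(reconstructed, &centroids);

    assert_ne!(exact, lossy); // hence encode_internal_vector returns None for PQ
    println!("exact LUT: {:?}, lossy LUT: {:?}", exact, lossy);
}
```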
    impl<'a, TElement, TMetric, TEncodedVectors>
        QuantizedQueryScorer<'a, TElement, TMetric, TEncodedVectors>

    pub enum QuantizedInternalScorerResult<'a, TEncodedVectors>

Result of internal scorer creation: either a constructed scorer or nothing, but in the latter case we want to get the hardware counter ownership back.
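The ownership-return pattern behind this enum can be sketched like this; `HardwareCounter`, `Scorer`, `InternalScorerResult`, and `try_internal_scorer` are simplified stand-ins, not the actual Qdrant types:

```rust
// Sketch of the "return the resource on failure" pattern behind
// QuantizedInternalScorerResult. All names here are hypothetical.
struct HardwareCounter {
    cpu_ops: u64,
}

struct Scorer {
    counter: HardwareCounter,
    query: Vec<u8>,
}

enum InternalScorerResult {
    Scorer(Scorer),
    // Constructing the scorer consumes the counter; if the storage cannot
    // encode an internal query, hand the counter back to the caller so the
    // fallback path can reuse it.
    NotSupported(HardwareCounter),
}

fn try_internal_scorer(encoded: Option<Vec<u8>>, counter: HardwareCounter) -> InternalScorerResult {
    match encoded {
        Some(query) => InternalScorerResult::Scorer(Scorer { counter, query }),
        None => InternalScorerResult::NotSupported(counter),
    }
}

fn main() {
    let counter = HardwareCounter { cpu_ops: 0 };
    match try_internal_scorer(None, counter) {
        InternalScorerResult::Scorer(_) => unreachable!(),
        InternalScorerResult::NotSupported(counter) => {
            // Fallback: encode the original vector, reusing the same counter.
            println!("fallback keeps counter, cpu_ops = {}", counter.cpu_ops);
        }
    }
}
```

Because the counter moves into the constructor, Rust's ownership rules force the `NotSupported` variant to carry it back out; a plain `Option<Scorer>` return would drop the counter on the failure path.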
Actionable comments posted: 0
🧹 Nitpick comments (1)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
172-348: Consider using a macro to reduce code repetition. The implementation correctly handles all storage variants with a consistent pattern. While the repetition is necessary due to Rust's type system, consider extracting the common pattern into a macro to improve maintainability:

    macro_rules! handle_internal_scorer {
        ($quantized_data:expr, $point_id:expr, $hardware_counter:expr, $original_query:expr) => {
            match QuantizedQueryScorer::new_internal($point_id, $quantized_data, $hardware_counter) {
                QuantizedInternalScorerResult::Scorer(scorer) => raw_scorer_from_query_scorer(scorer),
                QuantizedInternalScorerResult::NotSupported(hw) => $original_query(hw),
            }
        };
    }

However, if you prefer explicit code for better readability, the current implementation is acceptable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)

- lib/quantization/src/encoded_vectors.rs (1 hunk)
- lib/quantization/src/encoded_vectors_binary.rs (1 hunk)
- lib/quantization/src/encoded_vectors_pq.rs (1 hunk)
- lib/quantization/src/encoded_vectors_u8.rs (1 hunk)
- lib/segment/src/index/hnsw_index/hnsw.rs (4 hunks)
- lib/segment/src/index/hnsw_index/point_scorer.rs (1 hunk)
- lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1 hunk)
- lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (3 hunks)
- lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1 hunk)
- lib/segment/src/vector_storage/quantized/quantized_vectors.rs (2 hunks)
🔇 Additional comments (11)

lib/quantization/src/encoded_vectors.rs (1)

49-49: Well-designed trait method addition. The encode_internal_vector method is properly designed with an Option return type, allowing implementations to opt out when the optimization isn't feasible (e.g., PQ quantization). This aligns perfectly with the PR objective to reuse quantized data during HNSW construction.

lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)

152-157: LGTM - clean generic parameter refactoring. Moving the generic parameters from the struct to the method call is a clean refactoring that maintains the same functionality while supporting the new internal vector encoding capabilities.

lib/quantization/src/encoded_vectors_pq.rs (1)

578-580: Correct implementation for PQ limitations. Returning None is the appropriate behavior for PQ quantization, as noted in the existing comment that creating a query from a quantized vector would result in LUT accuracy loss. This maintains the optimization's optional nature.

lib/quantization/src/encoded_vectors_binary.rs (1)

573-581: Solid implementation for binary quantization. The implementation correctly retrieves the encoded vector data, performs the necessary type transmutation, and creates an owned EncodedBinVector. The use of to_vec() ensures proper ownership semantics for the returned data.

lib/quantization/src/encoded_vectors_u8.rs (1)

511-519: Correct implementation for U8 quantization. The implementation properly extracts both the vector offset and encoded data using get_vec_ptr, creates a safe slice from the raw pointer, and constructs an owned EncodedQueryU8. This maintains all necessary information for accurate scoring.

lib/segment/src/index/hnsw_index/hnsw.rs (1)

420-427: Good optimization for quantized vector scoring. The refactoring to use FilteredScorer::new_internal avoids unnecessary vector loading and allows the scorer to work directly with quantized data when available, improving performance.

lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)

296-304: Correct implementation of internal vector encoding for multivectors. The method properly handles the multivector case by encoding each internal vector and collecting the results. The early return pattern ensures consistent error handling.

lib/segment/src/index/hnsw_index/point_scorer.rs (1)

77-108: Well-designed internal scorer constructor with lazy vector loading. The implementation efficiently handles both quantized and non-quantized cases. The closure pattern for original_query_fn ensures the vector is only loaded when necessary, avoiding unnecessary memory operations for quantized vectors.

lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (3)

21-27: Clean result type for internal scorer creation. The QuantizedInternalScorerResult enum provides a clear way to handle cases where internal vector encoding is not supported, while preserving hardware counter ownership as noted in the existing comment.

33-59: Good refactoring: method-level generics improve flexibility. Moving the generic parameters from the struct to the method level allows for more flexible usage patterns while maintaining type safety.

61-76: Efficient internal scorer construction. The method correctly handles the optional encoding case and properly configures the hardware counter's IO multiplier based on storage location.
    fn encode_internal_vector(&self, id: u32) -> Option<EncodedQueryU8> {
        let (query_offset, q_ptr) = self.get_vec_ptr(id);

ToDo for another PR: avoid usage of pointers. Pretty sure zero-copy can handle it just fine.
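A sketch of what the suggested zero-copy follow-up could look like, assuming a flat byte layout; `U8Storage` and its layout here are hypothetical, not the real encoded_vectors_u8.rs code:

```rust
// Hypothetical sketch: replace the raw-pointer read with a safe slice view
// over flat storage. Layout and names are assumptions, not Qdrant's code.
struct U8Storage {
    vector_size: usize, // bytes per encoded vector
    data: Vec<u8>,      // all encoded vectors, flattened
}

impl U8Storage {
    // Safe, zero-copy borrow of one encoded vector; out-of-range ids
    // yield None instead of undefined behavior.
    fn get_encoded(&self, id: u32) -> Option<&[u8]> {
        let start = id as usize * self.vector_size;
        self.data.get(start..start + self.vector_size)
    }

    // Owned query built from the borrowed slice, as encode_internal_vector
    // requires (the query must live in RAM independently of the storage).
    fn encode_internal_vector(&self, id: u32) -> Option<Vec<u8>> {
        self.get_encoded(id).map(|s| s.to_vec())
    }
}

fn main() {
    let storage = U8Storage { vector_size: 3, data: vec![1, 2, 3, 4, 5, 6] };
    assert_eq!(storage.get_encoded(1), Some(&[4u8, 5, 6][..]));
    assert!(storage.encode_internal_vector(2).is_none());
}
```

Bounds checking comes for free with `slice::get`, which is the main safety win over the pointer arithmetic.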
generall left a comment:

These changes are not exactly pretty, but I don't have any better suggestions.
* dont encode quantization query while hnsw build
* add comments

Co-authored-by: generall <[email protected]>
Currently, we do quantization encoding each time we create a RawScorer. In the case of HNSW construction, this is not required because we already have a quantized storage. We want to reuse the already quantized data in HNSW construction and avoid unnecessary access to the original vector data, which can be on disk. This PR fixes this behaviour and uses the already constructed quantized storage in raw scorer creation.

To achieve this, this PR introduces a new method in quantization storage: fn encode_internal_vector(&self, id: u32) -> Option<Self::EncodedQuery>. It takes a point id and returns an encoded query, which is used in the query scorer. It's optional because not every case can be handled that way: in PQ we still want to encode the original vector, because using the encoded one would produce accuracy loss in the LUT. Everything is fine in the case of SQ and BQ.

We don't call score_internal directly; instead we use this query-getter approach, because quantized data can be stored on disk and we want to have the query vector always in RAM.

The quantized vector scorer has a new construction method which returns either a scorer or, as a fallback, the ownership of the hardware counter. FilteredScorer also has a new constructor.
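The query-getter fallback described above can be sketched as follows; `internal_query` and the simulated disk read are hypothetical stand-ins for the real constructor, shown only to illustrate that the original vector is never touched when quantized data can be reused:

```rust
// Sketch of the query-getter fallback: reuse the quantized storage when it
// can encode an internal query, and only load + encode the original vector
// otherwise. All names here are hypothetical.
use std::cell::Cell;

fn internal_query<F>(encoded: Option<Vec<u8>>, original_query_fn: F) -> Vec<u8>
where
    F: FnOnce() -> Vec<u8>,
{
    // During HNSW build the Some(..) branch is the common case, so the
    // original vector (possibly on disk) is never loaded.
    encoded.unwrap_or_else(original_query_fn)
}

fn main() {
    let disk_reads = Cell::new(0u32);
    let load_original = || {
        disk_reads.set(disk_reads.get() + 1); // simulated disk access
        vec![7u8, 7]
    };
    let q = internal_query(Some(vec![1, 2]), load_original);
    assert_eq!(q, vec![1, 2]);
    assert_eq!(disk_reads.get(), 0); // no original-vector IO happened
}
```

Passing the loader as a closure rather than a pre-loaded vector is what makes the fallback lazy: the IO cost is paid only on the `None` path.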