Conversation
timvisee
left a comment
There was a problem hiding this comment.
We currently don't expose a transfer filter publicly in collection info or in telemetry, but we don't see this as critical problem.
Relevant:
qdrant/lib/collection/src/shards/shard_holder/mod.rs
Lines 490 to 505 in c3d2d5d
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/api/src/grpc/qdrant.rs (1)
1851-1945: Move shard-key validation from conversion layer to gRPC validation layerThe
ReplicatePointsmessage correctly rejects missingfrom_shard_keyandto_shard_key(both checked inlib/collection/src/operations/conversions.rs:1695,1700with clear error messages), so invalid shard-key combinations are not silently ignored. However, validation occurs in the conversion layer rather than at the gRPC boundary. For consistency with other cluster operations (e.g.,MoveShardvalidates atlib/api/src/grpc/validate.rs:116–125), move the required-field checks from conversions.rs intolib/api/src/grpc/validate.rs:185–188.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
lib/api/src/grpc/proto/collections.proto(4 hunks)lib/api/src/grpc/proto/common.proto(1 hunks)lib/api/src/grpc/proto/points.proto(1 hunks)lib/api/src/grpc/proto/points_internal_service.proto(1 hunks)lib/api/src/grpc/qdrant.rs(6 hunks)lib/api/src/grpc/validate.rs(2 hunks)lib/collection/src/operations/conversions.rs(2 hunks)
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: generall
Repo: qdrant/qdrant PR: 7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: docs/redoc/master/openapi.json:13645-13652
Timestamp: 2025-08-10T18:26:12.443Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated from the REST schemas. To change field docs, edit lib/api/src/rest/schema.rs (e.g., add doc comments or #[schemars(description = ...)]). Specifically, UpdateVectors.update_filter lacked a description and should state: "If specified, only update vectors for points that match this filter; points not matching the filter are left unchanged."
Learnt from: coszio
Repo: qdrant/qdrant PR: 7368
File: lib/shard/src/query/mod.rs:313-341
Timestamp: 2025-10-16T13:00:46.809Z
Learning: In the Qdrant codebase, for the `Feedback` query variant in `lib/shard/src/query/mod.rs`, validation for sparse vectors should happen down the line rather than early in the `query_enum_from_grpc_raw_query` function. The `Nearest` variant is treated as a special case where early validation with better error messages is acceptable.
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.
Learnt from: coszio
Repo: qdrant/qdrant PR: 7065
File: lib/collection/src/operations/universal_query/shard_query.rs:0-0
Timestamp: 2025-08-15T15:56:36.821Z
Learning: Internal gRPC APIs in Qdrant don't require additional runtime validation when proto-level validation is already configured via build.rs range constraints.
📚 Learning: 2025-08-10T18:31:56.855Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/point_ops.rs:501-528
Timestamp: 2025-08-10T18:31:56.855Z
Learning: In Qdrant, batch operations validate that `ids`, `vectors`, and `payloads` (if present) have matching lengths at the REST API level in `lib/api/src/rest/validate.rs` through the `Validate` trait implementation for `Batch`. This validation happens before data is converted to internal structures like `PointInsertOperationsInternal`, so methods operating on these internal structures can safely assume the lengths match.
Applied to files:
lib/api/src/grpc/validate.rs
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.
Applied to files:
lib/api/src/grpc/validate.rslib/api/src/grpc/proto/common.protolib/api/src/grpc/proto/points.protolib/collection/src/operations/conversions.rslib/api/src/grpc/proto/points_internal_service.protolib/api/src/grpc/proto/collections.protolib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.
Applied to files:
lib/api/src/grpc/validate.rslib/api/src/grpc/proto/common.protolib/collection/src/operations/conversions.rslib/api/src/grpc/qdrant.rs
📚 Learning: 2025-10-16T13:00:46.809Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 7368
File: lib/shard/src/query/mod.rs:313-341
Timestamp: 2025-10-16T13:00:46.809Z
Learning: In the Qdrant codebase, for the `Feedback` query variant in `lib/shard/src/query/mod.rs`, validation for sparse vectors should happen down the line rather than early in the `query_enum_from_grpc_raw_query` function. The `Nearest` variant is treated as a special case where early validation with better error messages is acceptable.
Applied to files:
lib/api/src/grpc/validate.rslib/collection/src/operations/conversions.rslib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-10T18:26:12.443Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: docs/redoc/master/openapi.json:13645-13652
Timestamp: 2025-08-10T18:26:12.443Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated from the REST schemas. To change field docs, edit lib/api/src/rest/schema.rs (e.g., add doc comments or #[schemars(description = ...)]). Specifically, UpdateVectors.update_filter lacked a description and should state: "If specified, only update vectors for points that match this filter; points not matching the filter are left unchanged."
Applied to files:
lib/api/src/grpc/proto/common.protolib/api/src/grpc/proto/points.protolib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-23T22:24:44.276Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7100
File: tests/openapi/test_fts.py:255-263
Timestamp: 2025-08-23T22:24:44.276Z
Learning: In Qdrant's OpenAPI schema, the `filter.must` field accepts either a single Condition object or an array of Condition objects, not just arrays. The schema uses `anyOf` to allow both formats.
Applied to files:
lib/api/src/grpc/proto/common.protolib/api/src/grpc/proto/points.protolib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-15T15:56:36.821Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 7065
File: lib/collection/src/operations/universal_query/shard_query.rs:0-0
Timestamp: 2025-08-15T15:56:36.821Z
Learning: Internal gRPC APIs in Qdrant don't require additional runtime validation when proto-level validation is already configured via build.rs range constraints.
Applied to files:
lib/api/src/grpc/proto/points.protolib/api/src/grpc/proto/points_internal_service.protolib/api/src/grpc/proto/collections.protolib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-11T07:57:01.399Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 6986
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:81-84
Timestamp: 2025-08-11T07:57:01.399Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the BitsStoreType parameter difference between single-vector and multi-vector Binary quantization is intentional: single-vector storage uses `EncodedVectorsBin<u128, ...>` to enable 128-bit SIMD/popcount optimizations for longer vectors, while multi-vector storage uses `EncodedVectorsBin<u8, ...>` because multivectors are typically shorter and benefit from byte-granular storage.
Applied to files:
lib/collection/src/operations/conversions.rs
📚 Learning: 2025-08-10T18:26:33.017Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: docs/redoc/master/openapi.json:10393-10401
Timestamp: 2025-08-10T18:26:33.017Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated. Any description changes must be made in the source (e.g., lib/api/src/rest/schema.rs and/or lib/api/src/grpc/proto/points.proto), not edited directly in the JSON.
Applied to files:
lib/api/src/grpc/proto/points_internal_service.protolib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-10T18:25:16.206Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: docs/redoc/master/openapi.json:10626-10634
Timestamp: 2025-08-10T18:25:16.206Z
Learning: In Qdrant, docs/redoc/master/openapi.json is autogenerated from the Rust REST schema definitions (e.g., lib/api/src/rest/schema.rs). To change OpenAPI descriptions, update the Rust struct field doc comments/attributes rather than editing the JSON directly.
Applied to files:
lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (3)
lib/api/src/grpc/validate.rs (1)
lib/collection/src/operations/cluster_ops.rs (5)
validate(110-127)validate(212-223)validate(240-247)validate(264-271)validate(275-282)
lib/collection/src/operations/conversions.rs (1)
lib/api/src/grpc/conversions.rs (17)
try_from(96-127)try_from(406-416)try_from(436-443)try_from(448-462)try_from(467-477)try_from(482-488)try_from(494-521)try_from(526-567)try_from(573-587)try_from(592-598)try_from(603-613)try_from(618-625)try_from(631-658)try_from(664-705)try_from(731-753)try_from(892-918)convert_shard_key_from_grpc(78-84)
lib/api/src/grpc/qdrant.rs (2)
lib/api/src/grpc/validate.rs (8)
validate(14-14)validate(19-21)validate(29-31)validate(39-41)validate(49-54)validate(58-64)validate(68-74)validate(78-85)lib/segment/src/types.rs (5)
validate(869-875)validate(1790-1800)validate(2118-2129)validate(2160-2167)validate(3308-3319)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: test-consensus-compose
- GitHub Check: e2e-tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consistency
- GitHub Check: integration-tests
- GitHub Check: lint
- GitHub Check: Build Qdrant Edge Python bindings
🔇 Additional comments (15)
lib/api/src/grpc/proto/points_internal_service.proto (1)
4-4: LGTM!The import of
common.protoappropriately provides access to shared types likeFilterandPointIdfor internal service usage.lib/api/src/grpc/validate.rs (1)
111-111: LGTM!The dispatch of
ReplicatePointsvalidation is correctly integrated into the operation validation flow.lib/collection/src/operations/conversions.rs (3)
21-21: LGTM!The
Filtertype import is appropriately added to support the optional filter field inReplicatePoints.
27-27: LGTM!The imports for
ReplicatePointsandReplicatePointsOperationare correctly added to support the new cluster operation type.
1682-1704: LGTM!The conversion logic for
ReplicatePointsis well-structured:
- Properly handles optional filter with
map(Filter::try_from).transpose()- Validates presence of required shard keys with clear error messages
- Consistently uses
convert_shard_key_from_grpchelper matching the pattern used elsewhere in the codebaselib/api/src/grpc/proto/points.proto (1)
7-7: LGTM!The import of
common.protoappropriately provides access to the refactored types (PointId,Filter,GeoPoint, etc.) that were moved from this file to enable cross-service reuse.lib/api/src/grpc/proto/collections.proto (5)
7-7: LGTM!The import of
common.protoprovides access toFilterand other shared types required byReplicatePointsand the collection configuration surface.
20-70: LGTM!The comprehensive vector configuration types (
VectorParams,VectorParamsDiff,VectorsConfig,SparseVectorParams, etc.) are well-structured and provide a complete surface for vector configuration and updates.
732-736: LGTM!The
ReplicatePointsmessage is well-defined with:
- Required source and target shard keys
- Optional filter for selective replication
The structure aligns with the PR objectives and enables filtered point replication between shards.
738-743: LGTM!The
ShardTransferMethodenum extension withWalDeltaandReshardingStreamRecordsis backward-compatible and appropriately numbered.
772-772: LGTM!The
replicate_pointsfield is correctly added to theUpdateCollectionClusterSetupRequestoneof with an appropriate field number (10).lib/api/src/grpc/qdrant.rs (4)
1-20: qdrant.rs is prost-generated; keep changes in proto filesThis file is marked as generated by
prost-build. Any future adjustments to these messages or services should be made in the.protosources (e.g.lib/api/src/grpc/proto/common.proto,collections.proto,points.proto) and regenerated, not edited here. Based on learnings.
102-423: Shared PointId / filter / geo / match schema looks coherent and validator wiring is consistentThe new common messages (
PointId,Filter,Condition+condition::ConditionOneOf,FieldCondition,Matchand its variants, geo types,DatetimeRange,ValuesCount, etc.) form a clean, composable filter DSL and consistently usevalidator::Validate/#[validate(nested)]and the existing custom validators incrate::grpc::validate. This matches the existing validation approach and should be safe to reuse across services.
6823-6981: Result/response types refactor to RetrievedPoint/BatchResult/usage is consistent with richer point schemaThe new/updated response messages (
ScrollResponse,CountResult,RetrievedPoint,GetResponse,RecommendResponse+ batch/group variants,Discover*Response,UpdateBatchResponse,FacetResponse,SearchMatrix*Response) consistently:
- Reuse
RetrievedPoint/ScoredPointwithVectorsOutput,ShardKey, andOrderValue.- Attach
usage: Option<Usage>in line withSearchResponse/QueryResponse.The field tags and shapes are coherent and look backwards‑compatible for clients that only cared about the earlier subsets.
4531-4537: Verification complete: Factor variant is properly integrated with exhaustive handling and bounds checkingThe
Factor(u64)variant is correctly implemented across all code paths:
- Validation (
consistency_params.rs:33–47): Rejectsfactor=0, accepts valid values- Bounds conversion (
consistency_params.rs:87–90): Safely converts gRPCu64 → usizewith error handling- Internal usage (
execute_read_operation.rs:91–93): Appliesclamp(1, total_count)to ensure valid replica counts- Reverse conversion (
consistency_params.rs:101–104): Converts internalusize → u64safelyRust's exhaustive pattern matching ensures all enum variants are handled at compile time—if any match statement were incomplete, the code would not compile. Wire compatibility is preserved with tags 1 and 2.
| package qdrant; | ||
|
|
||
| option csharp_namespace = "Qdrant.Client.Grpc"; | ||
| option java_outer_classname = "Points"; |
There was a problem hiding this comment.
Verify the java_outer_classname configuration.
The java_outer_classname is set to "Points" but this is common.proto, not points.proto. This may cause confusion or conflicts when both proto files are compiled together in Java.
Consider updating to:
-option java_outer_classname = "Points";
+option java_outer_classname = "Common";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| option java_outer_classname = "Points"; | |
| option java_outer_classname = "Common"; |
🤖 Prompt for AI Agents
In lib/api/src/grpc/proto/common.proto around line 5, the option
java_outer_classname is incorrectly set to "Points" which can conflict with
points.proto when generating Java classes; change the option to a name that
matches this file and is unique (for example "Common" or "CommonProto"), update
the line to the chosen identifier, and recompile/regenerate Java sources to
ensure no classname collisions across protos.
There was a problem hiding this comment.
conflicts when both proto files are compiled together in Java.
It did indeed.
There was a problem hiding this comment.
@Anush008, thank you for confirming! I'm glad we caught this before it caused further issues. The fix to rename it to "Common" should resolve the Java compilation conflict. 👍
🧠 Learnings used
Learnt from: generall
Repo: qdrant/qdrant PR: 7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.
Learnt from: coszio
Repo: qdrant/qdrant PR: 7065
File: lib/collection/src/operations/universal_query/shard_query.rs:0-0
Timestamp: 2025-08-15T15:56:36.821Z
Learning: Internal gRPC APIs in Qdrant don't require additional runtime validation when proto-level validation is already configured via build.rs range constraints.
| impl Validate for grpc::ReplicatePoints { | ||
| fn validate(&self) -> Result<(), ValidationErrors> { | ||
| Ok(()) | ||
| } | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add validation for shard key differences.
The ReplicatePoints validation currently allows from_shard_key and to_shard_key to be identical, which doesn't make sense for a replication operation. Similar operations like MoveShard (lines 116-125) and ReplicateShard (lines 127-136) validate that source and destination are different using validate_shard_different_peers.
Consider adding validation:
impl Validate for grpc::ReplicatePoints {
fn validate(&self) -> Result<(), ValidationErrors> {
- Ok(())
+ if self.from_shard_key == self.to_shard_key {
+ let mut errors = ValidationErrors::new();
+ errors.add(
+ "to_shard_key",
+ ValidationError::new("from_shard_key and to_shard_key must be different"),
+ );
+ return Err(errors);
+ }
+ Ok(())
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl Validate for grpc::ReplicatePoints { | |
| fn validate(&self) -> Result<(), ValidationErrors> { | |
| Ok(()) | |
| } | |
| } | |
| impl Validate for grpc::ReplicatePoints { | |
| fn validate(&self) -> Result<(), ValidationErrors> { | |
| if self.from_shard_key == self.to_shard_key { | |
| let mut errors = ValidationErrors::new(); | |
| errors.add( | |
| "to_shard_key", | |
| ValidationError::new("from_shard_key and to_shard_key must be different"), | |
| ); | |
| return Err(errors); | |
| } | |
| Ok(()) | |
| } | |
| } |
🤖 Prompt for AI Agents
In lib/api/src/grpc/validate.rs around lines 185 to 189, the Validate impl for
grpc::ReplicatePoints currently returns Ok(()) unconditionally; add the same
shard-key-difference validation used by MoveShard and ReplicateShard by calling
the existing helper (validate_shard_different_peers or equivalent) to ensure
from_shard_key != to_shard_key and return a ValidationErrors if they are equal;
preserve existing error structure and messages consistent with the other
validations.
* implement ReplicatePoints in grpc * move filter into a separate file
Signed-off-by: Anush008 <[email protected]>
Signed-off-by: Anush008 <[email protected]>
* 1.16.0 regen Signed-off-by: Anush008 <[email protected]> * BRANCH=dev sh internal/tools/sync_proto.sh Signed-off-by: Anush008 <[email protected]> * regen Signed-off-by: Anush008 <[email protected]> * regen Signed-off-by: Anush008 <[email protected]> * 1.16.0: Part 2 (#93) * refactor: Update to use new vector fields Signed-off-by: Anush008 <[email protected]> * feat: VectorOutput parser Signed-off-by: Anush008 <[email protected]> * chore: NewMatchTextAny Signed-off-by: Anush008 <[email protected]> * feat: Added new RRF Query helper. Signed-off-by: Anush008 <[email protected]> --------- Signed-off-by: Anush008 <[email protected]> * Make sync script compatible with Linux and macOS flavors of sed * Bump proto definitions * bump test container version * chore: Re-gen due qdrant/qdrant#7536 Signed-off-by: Anush008 <[email protected]> * feat: Expose Collection cluster setup ops to high-level client Signed-off-by: Anush008 <[email protected]> --------- Signed-off-by: Anush008 <[email protected]> Co-authored-by: timvisee <[email protected]> Co-authored-by: Arnaud Gourlay <[email protected]>
* v1.16.0 Signed-off-by: Anush008 <[email protected]> * chore: Use dev Signed-off-by: Anush008 <[email protected]> * test: Use list collections to check for auth instead of health check Signed-off-by: Anush008 <[email protected]> * feat: Added VectorOutputHelper Signed-off-by: Anush008 <[email protected]> * matchTextAny Signed-off-by: Anush008 <[email protected]> * feat: Added new RRF Query helper. Signed-off-by: Anush008 <[email protected]> * chore: Added one overload for updateVectors null timeout Signed-off-by: Anush008 <[email protected]> * chore: Updated imports as per qdrant/qdrant#7536 Signed-off-by: Anush008 <[email protected]> * chore: Bump image and protos to v1.16.0 Signed-off-by: Anush008 <[email protected]> * feat: Expose getCollectionClusterInfoAsync and getCollectionClusterInfoAsync to high-level client Signed-off-by: Anush008 <[email protected]> --------- Signed-off-by: Anush008 <[email protected]>
Unfortunatelly, had to move filter structures into common.proto