fix: validate vector dimensions before WAL write for async upserts#9058
Conversation
When upserting points with wait=false (the default), dimension mismatches were silently discarded during background processing. The API returned 200 "acknowledged" but the points were never stored, causing silent data loss with no error feedback to the user. This adds an early dimension validation check in do_upsert_points() that runs before the operation is written to WAL. This ensures that dimension errors are returned to the client regardless of the wait parameter, matching the behavior of wait=true. The validation handles all vector types: - Dense single vectors - Multi-dense vectors - Named vectors (dense, multi-dense, sparse) - Sparse vectors are skipped (no fixed dimension) Closes #9039 Co-authored-by: Cursor <[email protected]>
Extract validate_vector_dimensions and helper functions from update.rs into src/common/validate_vectors.rs for better code organization. Co-authored-by: Cursor <[email protected]>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds Collection::vectors_config() accessor; introduces common.validate_vectors with validate_vector_dimensions and helpers for batch/point upserts (single, multidense, named, sparse); calls this validator early in do_upsert_points after unlogged collection access and before WAL/async processing to return StorageError::bad_input on dimension mismatch; adds regression tests exercising sync/async upsert dimension validation. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/common/update.rs (1)
611-658: ⚡ Quick winApply the same pre-WAL dimension check to
do_update_vectors.
do_upsert_pointsnow rejects bad dimensions before queueing, but/points/vectorsstill converts and forwards straight toupdate(). If dimension errors forVectorOperations::UpdateVectorsare still raised only during WAL apply,wait=falsewill keep the old “acknowledged but dropped” behavior on this endpoint. Please verify that path, or reuse the same early validation here and add a regression test.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/common/update.rs` around lines 611 - 658, do_update_vectors currently converts point vectors via convert_point_vectors and forwards them into UpdateVectorsOp and update() without the pre-WAL dimension validation used by do_upsert_points; add the same early dimension check after convert_point_vectors and before constructing CollectionUpdateOperations::VectorOperation so dimension mismatches are rejected synchronously (i.e., call the shared validation helper used by do_upsert_points or implement the same validation logic against the collection schema), return the appropriate StorageError on mismatch, and add a regression test exercising /points/vectors with wait=false to ensure bad-dimension updates are rejected before queuing.tests/openapi/test_vector_dimension_validation.py (1)
34-105: ⚡ Quick winAdd a named-vector regression case.
The Rust change added separate validation paths for named vectors, but this module only exercises default-vector payloads. One named-vector failure case here would keep that new branch from drifting untested.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/openapi/test_vector_dimension_validation.py` around lines 34 - 105, Add a new test similar to test_sync_upsert_wrong_dimension_rejected but exercising the named-vector payload branch: create test_named_vector_wrong_dimension_rejected(collection_name) that sends a PUT to '/collections/{collection_name}/points' with query_params {'wait':'true'} and a body where the point contains a named vector (e.g. "vector": {"name": "myvec", "values": [0.1,0.2,0.3]} into a 4-dim collection) and assert response.ok is False, response.status_code == 400, and that "dimension" appears in response.json()["status"]["error"].lower(); this ensures the named-vector validation path is covered alongside existing tests (see test_sync_upsert_wrong_dimension_rejected and test_async_batch_upsert_wrong_dimension_rejected for payload structure and assertions).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/common/update.rs`:
- Around line 611-658: do_update_vectors currently converts point vectors via
convert_point_vectors and forwards them into UpdateVectorsOp and update()
without the pre-WAL dimension validation used by do_upsert_points; add the same
early dimension check after convert_point_vectors and before constructing
CollectionUpdateOperations::VectorOperation so dimension mismatches are rejected
synchronously (i.e., call the shared validation helper used by do_upsert_points
or implement the same validation logic against the collection schema), return
the appropriate StorageError on mismatch, and add a regression test exercising
/points/vectors with wait=false to ensure bad-dimension updates are rejected
before queuing.
In `@tests/openapi/test_vector_dimension_validation.py`:
- Around line 34-105: Add a new test similar to
test_sync_upsert_wrong_dimension_rejected but exercising the named-vector
payload branch: create
test_named_vector_wrong_dimension_rejected(collection_name) that sends a PUT to
'/collections/{collection_name}/points' with query_params {'wait':'true'} and a
body where the point contains a named vector (e.g. "vector": {"name": "myvec",
"values": [0.1,0.2,0.3]} into a 4-dim collection) and assert response.ok is
False, response.status_code == 400, and that "dimension" appears in
response.json()["status"]["error"].lower(); this ensures the named-vector
validation path is covered alongside existing tests (see
test_sync_upsert_wrong_dimension_rejected and
test_async_batch_upsert_wrong_dimension_rejected for payload structure and
assertions).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5003ad24-f054-4232-a103-d9074dd5b710
📒 Files selected for processing (3)
lib/collection/src/collection/collection_ops.rssrc/common/update.rstests/openapi/test_vector_dimension_validation.py
The test expected a shard-level error message, but now dimension mismatches are caught before reaching the shards. Update the assertion to accept either the early validation error or the shard-level error. Co-authored-by: Cursor <[email protected]>
Check for the descriptive error ("Vector dimension error: expected dim: 4, got 3")
rather than the generic shard failure wrapper.
Co-authored-by: Cursor <[email protected]>
…9058) * fix: validate vector dimensions before WAL write for async upserts When upserting points with wait=false (the default), dimension mismatches were silently discarded during background processing. The API returned 200 "acknowledged" but the points were never stored, causing silent data loss with no error feedback to the user. This adds an early dimension validation check in do_upsert_points() that runs before the operation is written to WAL. This ensures that dimension errors are returned to the client regardless of the wait parameter, matching the behavior of wait=true. The validation handles all vector types: - Dense single vectors - Multi-dense vectors - Named vectors (dense, multi-dense, sparse) - Sparse vectors are skipped (no fixed dimension) Closes #9039 Co-authored-by: Cursor <[email protected]> * refactor: move vector dimension validation into dedicated module Extract validate_vector_dimensions and helper functions from update.rs into src/common/validate_vectors.rs for better code organization. Co-authored-by: Cursor <[email protected]> * fix: update shard update test for early dimension validation The test expected a shard-level error message, but now dimension mismatches are caught before reaching the shards. Update the assertion to accept either the early validation error or the shard-level error. Co-authored-by: Cursor <[email protected]> * fix: assert actual dimension error message in shard update test Check for the descriptive error ("Vector dimension error: expected dim: 4, got 3") rather than the generic shard failure wrapper. Co-authored-by: Cursor <[email protected]> --------- Co-authored-by: Cursor Agent <[email protected]> Co-authored-by: Cursor <[email protected]>
Summary
Fixes #9039
When upserting points with
wait=false(the default), vector dimension mismatches were silently discarded during background WAL processing. The API returned HTTP 200 with"status": "acknowledged"but the points were never actually stored — causing silent data loss with no error feedback to the user.This PR adds an early dimension validation check in
do_upsert_points()that runs before the operation is written to WAL. This ensures dimension errors are returned to the client regardless of thewaitparameter.Changes
src/common/update.rs: Addedvalidate_vector_dimensions()and helper functions that check vector dimensions against the collection's configured vector size before the operation reaches the WAL. Handles all vector types (dense, multi-dense, named, sparse).lib/collection/src/collection/collection_ops.rs: Added publicvectors_config()accessor method onCollectionto expose the vectors configuration for early validation.tests/openapi/test_vector_dimension_validation.py: Added integration tests verifying that dimension mismatches are rejected with HTTP 400 for bothwait=trueandwait=falseupserts (point list and batch formats).Design decisions
update()writes to WAL. This is the earliest point where we have both the resolved vectors and access to the collection config.RwLockread of the collection config per upsert request — the same lock already acquired by strict mode checks on the same code path."Vector dimension error: expected dim: X, got Y") for consistency.Test plan
test_vector_dimension_validation.pyverifies:wait=true+ wrong dimension → HTTP 400 (was already working)wait=false+ wrong dimension → HTTP 400 (was previously HTTP 200)wait=false+ correct dimension → HTTP 200 (no regression)Made with Cursor