Skip to content

fix: validate vector dimensions before WAL write for async upserts#9058

Merged
generall merged 4 commits into
devfrom
fix/early-vector-dimension-validation
May 16, 2026
Merged

fix: validate vector dimensions before WAL write for async upserts#9058
generall merged 4 commits into
devfrom
fix/early-vector-dimension-validation

Conversation

@qdrant-cloud-bot
Copy link
Copy Markdown
Contributor

Summary

Fixes #9039

When upserting points with wait=false (the default), vector dimension mismatches were silently discarded during background WAL processing. The API returned HTTP 200 with "status": "acknowledged" but the points were never actually stored — causing silent data loss with no error feedback to the user.

This PR adds an early dimension validation check in do_upsert_points() that runs before the operation is written to WAL. This ensures dimension errors are returned to the client regardless of the wait parameter.

Changes

  • src/common/update.rs: Added validate_vector_dimensions() and helper functions that check vector dimensions against the collection's configured vector size before the operation reaches the WAL. Handles all vector types (dense, multi-dense, named, sparse).
  • lib/collection/src/collection/collection_ops.rs: Added public vectors_config() accessor method on Collection to expose the vectors configuration for early validation.
  • tests/openapi/test_vector_dimension_validation.py: Added integration tests verifying that dimension mismatches are rejected with HTTP 400 for both wait=true and wait=false upserts (point list and batch formats).

Design decisions

  • Validation placement: The check happens after inference/conversion (so embedding dimensions are resolved) but before update() writes to WAL. This is the earliest point where we have both the resolved vectors and access to the collection config.
  • Sparse vectors skipped: Sparse vectors don't have a fixed dimension constraint, so they're intentionally excluded from validation.
  • Performance impact: Adds one RwLock read of the collection config per upsert request — the same lock already acquired by strict mode checks on the same code path.
  • Error format: Matches the existing error message format ("Vector dimension error: expected dim: X, got Y") for consistency.

Test plan

  • Existing tests pass (the fix doesn't change behavior for valid operations)
  • New test test_vector_dimension_validation.py verifies:
    • wait=true + wrong dimension → HTTP 400 (was already working)
    • wait=false + wrong dimension → HTTP 400 (was previously HTTP 200)
    • wait=false + correct dimension → HTTP 200 (no regression)
    • Batch format + wrong dimension → HTTP 400

Made with Cursor

Cursor Agent and others added 2 commits May 15, 2026 16:28
When upserting points with wait=false (the default), dimension
mismatches were silently discarded during background processing.
The API returned 200 "acknowledged" but the points were never stored,
causing silent data loss with no error feedback to the user.

This adds an early dimension validation check in do_upsert_points()
that runs before the operation is written to WAL. This ensures that
dimension errors are returned to the client regardless of the wait
parameter, matching the behavior of wait=true.

The validation handles all vector types:
- Dense single vectors
- Multi-dense vectors
- Named vectors (dense, multi-dense, sparse)
- Sparse vectors are skipped (no fixed dimension)

Closes #9039

Co-authored-by: Cursor <[email protected]>
Extract validate_vector_dimensions and helper functions from update.rs
into src/common/validate_vectors.rs for better code organization.

Co-authored-by: Cursor <[email protected]>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d768934b-322b-4380-a45b-0e8bbb06be47

📥 Commits

Reviewing files that changed from the base of the PR and between 4c93d7c and b45b141.

📒 Files selected for processing (1)
  • tests/consensus_tests/test_collection_shard_update.py

📝 Walkthrough

Walkthrough

Adds Collection::vectors_config() accessor; introduces common.validate_vectors with validate_vector_dimensions and helpers for batch/point upserts (single, multidense, named, sparse); calls this validator early in do_upsert_points after unlogged collection access and before WAL/async processing to return StorageError::bad_input on dimension mismatch; adds regression tests exercising sync/async upsert dimension validation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

  • #9039: Addresses the async upsert silent-discard bug by validating vector dimensions before WAL/async write so mismatches return errors instead of being silently dropped.
  • #9045: Related to adding early dimension validation in the upsert flow; this PR implements the referenced validation step.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main change: adding vector dimension validation before WAL writes for async upserts, which is the core fix for the linked issue.
Description check ✅ Passed The description is well-related to the changeset, explaining the problem (silent data loss with async upserts), the solution (early validation), and implementation details across all modified files.
Linked Issues check ✅ Passed The PR comprehensively addresses #9039 by implementing the preferred solution: early dimension validation that returns HTTP 400 for dimension mismatches regardless of the wait parameter, eliminating silent data loss.
Out of Scope Changes check ✅ Passed All changes are directly in scope: vector dimension validation logic, collection config accessor, integration tests for the fix, and test updates reflecting the new early-validation behavior.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/early-vector-dimension-validation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/common/update.rs (1)

611-658: ⚡ Quick win

Apply the same pre-WAL dimension check to do_update_vectors.

do_upsert_points now rejects bad dimensions before queueing, but /points/vectors still converts and forwards straight to update(). If dimension errors for VectorOperations::UpdateVectors are still raised only during WAL apply, wait=false will keep the old “acknowledged but dropped” behavior on this endpoint. Please verify that path, or reuse the same early validation here and add a regression test.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/common/update.rs` around lines 611 - 658, do_update_vectors currently
converts point vectors via convert_point_vectors and forwards them into
UpdateVectorsOp and update() without the pre-WAL dimension validation used by
do_upsert_points; add the same early dimension check after convert_point_vectors
and before constructing CollectionUpdateOperations::VectorOperation so dimension
mismatches are rejected synchronously (i.e., call the shared validation helper
used by do_upsert_points or implement the same validation logic against the
collection schema), return the appropriate StorageError on mismatch, and add a
regression test exercising /points/vectors with wait=false to ensure
bad-dimension updates are rejected before queuing.
tests/openapi/test_vector_dimension_validation.py (1)

34-105: ⚡ Quick win

Add a named-vector regression case.

The Rust change added separate validation paths for named vectors, but this module only exercises default-vector payloads. One named-vector failure case here would keep that new branch from drifting untested.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/openapi/test_vector_dimension_validation.py` around lines 34 - 105, Add
a new test similar to test_sync_upsert_wrong_dimension_rejected but exercising
the named-vector payload branch: create
test_named_vector_wrong_dimension_rejected(collection_name) that sends a PUT to
'/collections/{collection_name}/points' with query_params {'wait':'true'} and a
body where the point contains a named vector (e.g. "vector": {"name": "myvec",
"values": [0.1,0.2,0.3]} into a 4-dim collection) and assert response.ok is
False, response.status_code == 400, and that "dimension" appears in
response.json()["status"]["error"].lower(); this ensures the named-vector
validation path is covered alongside existing tests (see
test_sync_upsert_wrong_dimension_rejected and
test_async_batch_upsert_wrong_dimension_rejected for payload structure and
assertions).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/common/update.rs`:
- Around line 611-658: do_update_vectors currently converts point vectors via
convert_point_vectors and forwards them into UpdateVectorsOp and update()
without the pre-WAL dimension validation used by do_upsert_points; add the same
early dimension check after convert_point_vectors and before constructing
CollectionUpdateOperations::VectorOperation so dimension mismatches are rejected
synchronously (i.e., call the shared validation helper used by do_upsert_points
or implement the same validation logic against the collection schema), return
the appropriate StorageError on mismatch, and add a regression test exercising
/points/vectors with wait=false to ensure bad-dimension updates are rejected
before queuing.

In `@tests/openapi/test_vector_dimension_validation.py`:
- Around line 34-105: Add a new test similar to
test_sync_upsert_wrong_dimension_rejected but exercising the named-vector
payload branch: create
test_named_vector_wrong_dimension_rejected(collection_name) that sends a PUT to
'/collections/{collection_name}/points' with query_params {'wait':'true'} and a
body where the point contains a named vector (e.g. "vector": {"name": "myvec",
"values": [0.1,0.2,0.3]} into a 4-dim collection) and assert response.ok is
False, response.status_code == 400, and that "dimension" appears in
response.json()["status"]["error"].lower(); this ensures the named-vector
validation path is covered alongside existing tests (see
test_sync_upsert_wrong_dimension_rejected and
test_async_batch_upsert_wrong_dimension_rejected for payload structure and
assertions).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5003ad24-f054-4232-a103-d9074dd5b710

📥 Commits

Reviewing files that changed from the base of the PR and between 813a58c and 62cebb9.

📒 Files selected for processing (3)
  • lib/collection/src/collection/collection_ops.rs
  • src/common/update.rs
  • tests/openapi/test_vector_dimension_validation.py

Cursor Agent and others added 2 commits May 15, 2026 17:08
The test expected a shard-level error message, but now dimension
mismatches are caught before reaching the shards. Update the assertion
to accept either the early validation error or the shard-level error.

Co-authored-by: Cursor <[email protected]>
Check for the descriptive error ("Vector dimension error: expected dim: 4, got 3")
rather than the generic shard failure wrapper.

Co-authored-by: Cursor <[email protected]>
@generall generall merged commit 89fa850 into dev May 16, 2026
15 checks passed
@generall generall deleted the fix/early-vector-dimension-validation branch May 16, 2026 18:18
@coderabbitai coderabbitai Bot mentioned this pull request May 18, 2026
3 tasks
generall pushed a commit that referenced this pull request May 22, 2026
…9058)

* fix: validate vector dimensions before WAL write for async upserts

When upserting points with wait=false (the default), dimension
mismatches were silently discarded during background processing.
The API returned 200 "acknowledged" but the points were never stored,
causing silent data loss with no error feedback to the user.

This adds an early dimension validation check in do_upsert_points()
that runs before the operation is written to WAL. This ensures that
dimension errors are returned to the client regardless of the wait
parameter, matching the behavior of wait=true.

The validation handles all vector types:
- Dense single vectors
- Multi-dense vectors
- Named vectors (dense, multi-dense, sparse)
- Sparse vectors are skipped (no fixed dimension)

Closes #9039

Co-authored-by: Cursor <[email protected]>

* refactor: move vector dimension validation into dedicated module

Extract validate_vector_dimensions and helper functions from update.rs
into src/common/validate_vectors.rs for better code organization.

Co-authored-by: Cursor <[email protected]>

* fix: update shard update test for early dimension validation

The test expected a shard-level error message, but now dimension
mismatches are caught before reaching the shards. Update the assertion
to accept either the early validation error or the shard-level error.

Co-authored-by: Cursor <[email protected]>

* fix: assert actual dimension error message in shard update test

Check for the descriptive error ("Vector dimension error: expected dim: 4, got 3")
rather than the generic shard failure wrapper.

Co-authored-by: Cursor <[email protected]>

---------

Co-authored-by: Cursor Agent <[email protected]>
Co-authored-by: Cursor <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants