test: reproduce MatchAny(any=[]) strict-mode rejection on integer index#9260
Conversation
Adds an openapi test that demonstrates the strict-mode bug where
`match: {"any": []}` on an integer-indexed payload field is rejected with:
Bad request: Index required but not found for "<field>" of one of
the following types: [keyword, uuid]
even though the field is indexed as integer. Root cause is that an empty
`any` list deserializes as `AnyVariants::Strings(empty)` (untagged enum;
Strings variant listed first), and strict mode infers a keyword/uuid
index requirement from the variant tag — ignoring that the list is
empty (i.e. a no-op condition).
The test covers:
- Baseline no-op semantics with and without indexes (passes today).
- Reproduction under strict mode for `must`, `must_not`, and a
FormulaQuery FieldCondition (currently failing; will pass once
empty `any`/`except` no longer requires an index).
Co-authored-by: Cursor <[email protected]>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis pull request adds pytest tests that validate MatchAny(any=[]) behavior across unindexed and indexed integer/keyword/uuid fields, plus strict-mode regression tests covering must, must_not, and formula-query placements. It also updates Rust logic to treat empty Any/Except variants as no-ops and to skip unindexed-field inference when no index is required. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/openapi/test_match_any_empty.py (1)
102-109: ⚡ Quick winInconsistent error handling pattern.
All other helpers in this file use
assert response.okfor validation, but_set_strict_modeusesresponse.raise_for_status(). While both approaches work, consistency improves maintainability.♻️ Align with the pattern used by other helpers
def _set_strict_mode(collection_name, strict_mode_config): response = request_with_validation( api="/collections/{collection_name}", method="PATCH", path_params={"collection_name": collection_name}, body={"strict_mode_config": strict_mode_config}, ) - response.raise_for_status() + assert response.ok🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/openapi/test_match_any_empty.py` around lines 102 - 109, The helper _set_strict_mode currently calls response.raise_for_status(); change it to follow the file's consistent pattern by replacing that call with an assertion using assert response.ok (and optionally include a message), so the function aligns with other helpers' error handling style.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/openapi/test_match_any_empty.py`:
- Around line 102-109: The helper _set_strict_mode currently calls
response.raise_for_status(); change it to follow the file's consistent pattern
by replacing that call with an assertion using assert response.ok (and
optionally include a message), so the function aligns with other helpers' error
handling style.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: a5e4a6d7-1d7b-4085-99c0-34c83865682b
📒 Files selected for processing (1)
tests/openapi/test_match_any_empty.py
…heck
An empty `match: {"any": []}` (or `{"except": []}`) is a no-op: `any: []`
matches nothing and `except: []` excludes nothing, regardless of the
field's data type. Because an empty list cannot carry type information it
deserializes as the keyword `AnyVariants::Strings(empty)` variant, which
previously caused strict mode to demand a keyword/uuid index and reject
the request with:
Index required but not found for "<field>" of one of the following
types: [keyword, uuid]
even on a field indexed as integer.
Fix:
- `infer_index_from_any_variants` returns no required index for an empty
variant set.
- `Extractor::update_from_condition` skips a condition whose required
index set is empty, so a no-op condition is never reported as needing
an index (this also covers the FormulaQuery condition path).
This makes the openapi reproduction in test_match_any_empty.py pass.
Co-authored-by: Cursor <[email protected]>
|
Added the fix as a follow-up commit (886ce9f). Fix (
An empty Verification:
|
…ex (#9260) * test: reproduce MatchAny(any=[]) strict-mode rejection on integer index Adds an openapi test that demonstrates the strict-mode bug where `match: {"any": []}` on an integer-indexed payload field is rejected with: Bad request: Index required but not found for "<field>" of one of the following types: [keyword, uuid] even though the field is indexed as integer. Root cause is that an empty `any` list deserializes as `AnyVariants::Strings(empty)` (untagged enum; Strings variant listed first), and strict mode infers a keyword/uuid index requirement from the variant tag — ignoring that the list is empty (i.e. a no-op condition). The test covers: - Baseline no-op semantics with and without indexes (passes today). - Reproduction under strict mode for `must`, `must_not`, and a FormulaQuery FieldCondition (currently failing; will pass once empty `any`/`except` no longer requires an index). Co-authored-by: Cursor <[email protected]> * fix: treat empty MatchAny/MatchExcept as no-op in strict-mode index check An empty `match: {"any": []}` (or `{"except": []}`) is a no-op: `any: []` matches nothing and `except: []` excludes nothing, regardless of the field's data type. Because an empty list cannot carry type information it deserializes as the keyword `AnyVariants::Strings(empty)` variant, which previously caused strict mode to demand a keyword/uuid index and reject the request with: Index required but not found for "<field>" of one of the following types: [keyword, uuid] even on a field indexed as integer. Fix: - `infer_index_from_any_variants` returns no required index for an empty variant set. - `Extractor::update_from_condition` skips a condition whose required index set is empty, so a no-op condition is never reported as needing an index (this also covers the FormulaQuery condition path). This makes the openapi reproduction in test_match_any_empty.py pass. Co-authored-by: Cursor <[email protected]> --------- Co-authored-by: Cursor <[email protected]>
Summary
Reproduces a strict-mode bug where
FieldCondition(key=\"f\", match=MatchAny(any=[]))on an integer-indexed payload field is rejected with:even though the field has an integer payload index. Replacing
any=[]withany=[1]succeeds.Root cause (analysis)
AnyVariantsinlib/segment/src/types.rsis#[serde(untagged)]with theStringsvariant listed first, so an empty[]deserializes asAnyVariants::Strings(∅). Over gRPC, an emptyMatchAnyis serialized by the Python client asMatch { keywords: RepeatedStrings { strings: [] } }, which also maps toAnyVariants::Strings(∅)inlib/api/src/grpc/conversions.rs.Strict mode then infers the required index type purely from the variant tag in
lib/collection/src/problems/unindexed_field.rs::infer_index_from_any_variants— without checking whether the list is empty — and demands a keyword/uuid index.At the execution layer the condition already behaves as a no-op (matches nothing) thanks to the
_ => falsefall-through incondition_checker.rs, so the strict-mode pre-check is the only place where the empty case is mis-handled.Expected: an empty
MatchAny/MatchExceptshould be a no-op for strict mode regardless of the field's index type (or, failing that, not infer a keyword/uuid requirement out of nothing).What this PR contains
Only a new test file
tests/openapi/test_match_any_empty.py(no fix) so the bug is captured and CI tracks it. The file mirrors the existingtest_match_except_empty.pyand adds three strict-mode reproductions:test_match_any_empty_with_strict_mode_integer_index— plainmustfiltertest_match_any_empty_with_strict_mode_in_must_not—must_notfiltertest_match_any_empty_with_strict_mode_in_formula_query—FormulaQueryFieldConditionLocally, baseline tests pass (12) and the three strict-mode tests fail with the exact reported error message, confirming reproduction.
Test plan
pytest tests/openapi/test_match_any_empty.py -v— baseline tests pass, strict-mode tests fail withIndex required but not found for \"n\" of one of the following types: [keyword, uuid](matches the report).Related to #9050 (analogous behaviour exists for
MatchExcept(except=[]), already tested intest_match_except_empty.py).Made with Cursor