Skip to content

Bug: Local memory mode incorrectly computes values_count for dict payloads #1189

@litt1e-c

Description

@litt1e-c

Describe the bug

There is a behavioral inconsistency between the Remote server (Rust engine) and the Local :memory: engine in the Python client when applying the values_count filter on dictionary (JSON object) payloads.

According to Qdrant semantics, a scalar or a JSON object/dict does not have array dimensions, so its values_count should evaluate to 1. The Remote server handles this correctly. However, the Local :memory: engine incorrectly treats the dict as an array and returns the number of keys in the dictionary (len(dict)).

Root Cause Analysis

In the local engine implementation, the logic checks if a payload value is an array using something equivalent to hasattr(value, "__len__") and not isinstance(value, str). Since a Python dict has a __len__ attribute, it falls into this branch, and its length (number of keys) is returned instead of 1.

Reproducible Code

Here is a minimal reproducible example demonstrating the discrepancy. The remote server correctly passes the assertions, while the local mode fails.

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance,
    FieldCondition,
    Filter,
    PointStruct,
    ValuesCount,
    VectorParams,
)

HOST = "127.0.0.1"
PORT = 6333
GRPC_PORT = 6334

COLLECTION = "values_count_dict_repro"


def ids_from_scroll(client, flt):
    points, _ = client.scroll(
        collection_name=COLLECTION,
        scroll_filter=flt,
        limit=20,
        with_payload=True,
        with_vectors=False,
    )
    return sorted([p.id for p in points])


def main():
    remote = QdrantClient(host=HOST, port=PORT, grpc_port=GRPC_PORT, prefer_grpc=False)
    local = QdrantClient(":memory:")

    for client in (remote, local):
        client.recreate_collection(
            collection_name=COLLECTION,
            vectors_config=VectorParams(size=2, distance=Distance.COSINE),
        )
        client.upsert(
            collection_name=COLLECTION,
            wait=True,
            points=[
                # Scalar dict: should be treated as a single element per docs, count == 1
                PointStruct(id=1, vector=[0.1, 0.2], payload={"vc": {"a": 1, "b": 2}}),
                # Actual array: count == 2
                PointStruct(id=2, vector=[0.2, 0.3], payload={"vc": ["x", "y"]}),
                # Normal scalar: count == 1
                PointStruct(id=3, vector=[0.3, 0.4], payload={"vc": 7}),
            ],
        )

    gt_1 = Filter(
        must=[
            FieldCondition(
                key="vc",
                values_count=ValuesCount(gt=1),
            )
        ]
    )
    lte_1 = Filter(
        must=[
            FieldCondition(
                key="vc",
                values_count=ValuesCount(lte=1),
            )
        ]
    )

    remote_gt = ids_from_scroll(remote, gt_1)
    remote_lte = ids_from_scroll(remote, lte_1)
    local_gt = ids_from_scroll(local, gt_1)
    local_lte = ids_from_scroll(local, lte_1)

    print("=== Remote server ===")
    print("values_count(gt=1): ", remote_gt)
    print("values_count(lte=1):", remote_lte)

    print("\n=== Local :memory: ===")
    print("values_count(gt=1): ", local_gt)
    print("values_count(lte=1):", local_lte)

    print("\n=== Expected if docs/server semantics hold ===")
    print("dict should count as 1, so:")
    print("gt=1  should match only id=2")
    print("lte=1 should match ids=1,3")

    # Assertions: remote will pass, local will currently fail
    assert remote_gt == [2], f"Unexpected remote gt=1 result: {remote_gt}"
    assert remote_lte == [1, 3], f"Unexpected remote lte=1 result: {remote_lte}"

    # This will trigger an AssertionError under Local mode due to the bug
    assert local_gt ==[2], f"local gt=1 result is {local_gt}"
    assert local_lte ==[1, 3], f"local lte=1 result is {local_lte}"


if __name__ == "__main__":
    main()


Expected behavior
The behavior of the Local :memory: engine should be identical to the Remote Qdrant server. Dictionaries (dict / JSON objects) should not be treated as arrays and their values_count should be evaluated as 1.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions