Incomplete metrics for failed API responses #7927

@jkbe

Current Behavior

When an API request fails, the corresponding failure counter, rest_responses_fail_total for HTTP or grpc_responses_fail_total for gRPC, is not incremented and stays at 0. Failed responses therefore cannot be monitored with these metrics.

For HTTP, a workaround for now is to monitor rest_responses_total and filter on the status label for 4xx/5xx values. However, grpc_responses_total has no status label, so there currently seems to be no way to monitor failed gRPC responses via the /metrics endpoint: successful and unsuccessful responses cannot be distinguished.
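The HTTP workaround can be sketched as a small script that sums rest_responses_total over all series whose status label is 4xx or 5xx. This is only a sketch: count_failed_rest_responses is a hypothetical helper name, the regex handles the simple text format shown in this issue rather than the full Prometheus exposition format, and the status="200" line in the sample is made up for illustration.

```python
import re

# One sample line: name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def count_failed_rest_responses(metrics_text: str) -> float:
    """Sum rest_responses_total over series with a 4xx/5xx status label."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != "rest_responses_total":
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        if labels.get("status", "")[:1] in ("4", "5"):
            total += float(match.group("value"))
    return total


# First series is taken from this issue; the second is an illustrative assumption.
EXCERPT = '''
# TYPE rest_responses_total counter
rest_responses_total{method="POST",endpoint="/collections/{name}/points/query",status="404"} 1
rest_responses_total{method="GET",endpoint="/collections",status="200"} 3
'''

print(count_failed_rest_responses(EXCERPT))
```

In practice the text would come from the /metrics endpoint, for example via urllib.request.urlopen("http://localhost:6333/metrics").read().decode(). Nothing comparable is possible for gRPC, because grpc_responses_total carries no status label.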

Steps to Reproduce

These steps follow the quick start, using Qdrant v1.16.3 and the Python qdrant-client v1.16.2.

  1. Start Qdrant:
docker run -p 6333:6333 -p 6334:6334 \
    -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
    qdrant/qdrant:v1.16.3
  2. Run a points query against a non-existent collection, once via HTTP and once via gRPC, with uv run example.py:
# file: example.py
#
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "qdrant-client==1.16.2",
# ]
# ///

import traceback

from grpc import RpcError
from qdrant_client import QdrantClient
from qdrant_client.http.exceptions import UnexpectedResponse

http_client = QdrantClient(url="http://localhost:6333", prefer_grpc=False)
try:
    http_client.query_points(collection_name="non_existent_collection", query=[0.1])
except UnexpectedResponse:
    print(traceback.format_exc())

grpc_client = QdrantClient(url="http://localhost:6333", prefer_grpc=True)
try:
    grpc_client.query_points(collection_name="non_existent_collection", query=[0.1])
except RpcError:
    print(traceback.format_exc())
  3. Check http://localhost:6333/metrics. The relevant excerpt looks similar to what's below.
  • rest_responses_total=1, rest_responses_fail_total=0
  • grpc_responses_total=1 (without any indication of request success/failure), grpc_responses_fail_total=0
# HELP rest_responses_total total number of responses
# TYPE rest_responses_total counter
rest_responses_total{method="POST",endpoint="/collections/{name}/points/query",status="404"} 1
# HELP rest_responses_fail_total total number of failed responses
# TYPE rest_responses_fail_total counter
rest_responses_fail_total{method="POST",endpoint="/collections/{name}/points/query",status="404"} 0
# HELP grpc_responses_total total number of responses
# TYPE grpc_responses_total counter
grpc_responses_total{endpoint="/qdrant.Points/Query"} 1
# HELP grpc_responses_fail_total total number of failed responses
# TYPE grpc_responses_fail_total counter
grpc_responses_fail_total{endpoint="/qdrant.Points/Query"} 0

Expected Behavior

  • Metric rest_responses_fail_total should be incremented with each failed HTTP request (response status codes 4xx/5xx).
  • Metric grpc_responses_fail_total should be incremented with each failed gRPC request.
  • (As an extra, it would be valuable to indicate the gRPC status code via a label for grpc_responses_* metrics.)
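The expected REST behavior could be checked with a sketch like the one below. It reports every 4xx/5xx series of rest_responses_total whose matching rest_responses_fail_total series has a lower value. It assumes the fail counter carries the same labels as the total counter, as in the excerpt above. parse_samples and uncounted_rest_failures are hypothetical helper names, and the parsing is a simplified regex, not a full Prometheus parser.

```python
import re

SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(metrics_text):
    """Map (metric name, frozenset of label pairs) -> sample value."""
    samples = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            name, labels, value = match.groups()
            key = (name, frozenset(LABEL_RE.findall(labels or "")))
            samples[key] = float(value)
    return samples


def uncounted_rest_failures(metrics_text):
    """Return (labels, total, failed) for 4xx/5xx series where failed < total."""
    samples = parse_samples(metrics_text)
    mismatches = []
    for (name, labels), total in samples.items():
        if name != "rest_responses_total":
            continue
        if dict(labels).get("status", "")[:1] not in ("4", "5"):
            continue
        # Assumption: the fail counter uses the same label set as the total.
        failed = samples.get(("rest_responses_fail_total", labels), 0.0)
        if failed < total:
            mismatches.append((dict(labels), total, failed))
    return mismatches


# REST lines copied from the excerpt in this issue.
OBSERVED = '''
rest_responses_total{method="POST",endpoint="/collections/{name}/points/query",status="404"} 1
rest_responses_fail_total{method="POST",endpoint="/collections/{name}/points/query",status="404"} 0
'''

for labels, total, failed in uncounted_rest_failures(OBSERVED):
    print(labels["endpoint"], labels["status"], total, failed)
```

With the expected behavior in place, the function would return an empty list for the same request sequence. An equivalent gRPC check would need the status label suggested in the last bullet.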


Labels: bug
