Skip to content

new: run server version check in a thread, don't check bm25 availabil…#1168

Merged
joein merged 6 commits into
devfrom
async-get-server-version
Mar 13, 2026
Merged

new: run server version check in a thread, don't check bm25 availabil…#1168
joein merged 6 commits into
devfrom
async-get-server-version

Conversation

@joein

@joein joein commented Mar 13, 2026

Copy link
Copy Markdown
Member

@netlify

netlify Bot commented Mar 13, 2026

Copy link
Copy Markdown

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 884d7fa
🔍 Latest deploy log https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/69b417784fecb2000743b2f7
😎 Deploy Preview https://deploy-preview-1168--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@joein joein requested a review from tbung March 13, 2026 10:38
@coderabbitai

coderabbitai Bot commented Mar 13, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR removes local retrieval and propagation of server_version across sync and async clients and fastembed/ModelEmbedder initializers, deletes the server-version-based builtin-embedder availability check, and updates ModelEmbedder/fastembed mixins to drop the server_version parameter. Remote client classes (sync and async) now spawn a daemon background thread that runs a new static _check_compatibility to fetch and warn about client/server version compatibility. get_server_version gains a timeout parameter forwarded to HTTP requests. Tests adjust auth-token timing and remove an unused version read.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • tbung
  • generall
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: moving server version check to a thread and removing BM25 availability check logic from ModelEmbedder.
Description check ✅ Passed The description references issue #1167 and is directly related to the changeset involving server version checking and BM25 availability logic modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch async-get-server-version
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@qdrant_client/async_qdrant_remote.py`:
- Around line 190-206: The except block swallowing all exceptions should emit
the same UserWarning used when server_version is falsy so callers are informed
the compatibility check failed; update the except in the try/except around
importlib.metadata.version and get_server_version (in async_qdrant_remote.py) to
call show_warning with the same message and parameters used in the "if not
server_version" branch (or a nearly identical message noting failure to obtain
server version due to an error), and keep the logging.debug for the exception as
well so both a user-facing warning and a debug log are produced.

In `@qdrant_client/embed/model_embedder.py`:
- Around line 54-56: Currently use_core_bm25 is set from is_local_mode which is
incorrect; change the logic in model_embedder.py so
self._is_builtin_embedder_available is determined by an explicit
server-capability check (e.g., a capability flag exposed by QdrantClient or a
call that verifies core BM25 support) instead of is_local_mode, and pass that
boolean into Embedder(use_core_bm25=...); update any initialization that uses
FastEmbedMisc.is_installed() and Embedder to combine local availability and
server capability (refer to symbols self._is_builtin_embedder_available,
FastEmbedMisc.is_installed(), and Embedder(use_core_bm25=...)) so remote clients
without core BM25 won't report support or receive BM25 requests.

In `@qdrant_client/qdrant_remote.py`:
- Around line 224-260: The compatibility check currently runs in a daemon thread
started in __init__, causing warnings to be missed and stacklevel attribution to
be wrong; update the Thread creation so it's non-daemon and assign it to an
instance attribute (e.g., self._compatibility_thread) so callers/tests can join
it, and add an optional flag (e.g., check_compatibility_blocking) to __init__
that, when True, calls QdrantRemote._check_compatibility(...) synchronously
instead of spawning a thread; ensure you reference the Thread creation site in
__init__ and the _check_compatibility staticmethod when implementing these
changes so tests can observe warnings and correct stacklevel.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d3701d3c-2629-42c3-a9a0-a74da5291d2c

📥 Commits

Reviewing files that changed from the base of the PR and between 0a69a4c and 1b3a8da.

📒 Files selected for processing (8)
  • qdrant_client/async_qdrant_client.py
  • qdrant_client/async_qdrant_fastembed.py
  • qdrant_client/async_qdrant_remote.py
  • qdrant_client/common/version_check.py
  • qdrant_client/embed/model_embedder.py
  • qdrant_client/qdrant_client.py
  • qdrant_client/qdrant_fastembed.py
  • qdrant_client/qdrant_remote.py
💤 Files with no reviewable changes (2)
  • qdrant_client/async_qdrant_client.py
  • qdrant_client/qdrant_client.py

Comment thread qdrant_client/async_qdrant_remote.py Outdated
Comment on lines +54 to 56
self._is_builtin_embedder_available = not is_local_mode
self._fastembed_available = FastEmbedMisc.is_installed()
self.embedder = Embedder(use_core_bm25=self._is_builtin_embedder_available, **kwargs)

@coderabbitai coderabbitai Bot Mar 13, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Restore the BM25 server-capability gate.

is_local_mode only tells us whether the client is local. For every remote client, QdrantClient passes False here (qdrant_client/qdrant_client.py:1-50), so this now enables use_core_bm25 unconditionally. That changes Embedder._embed_sparse_text() to send models.Document for Qdrant/bm25, and Embedder.is_supported_sparse_model() to report BM25 as supported (qdrant_client/embed/embedder.py:293-327 and qdrant_client/embed/embedder.py:447-458). Remote servers that still lack core BM25 support will now be treated as capable and receive requests they cannot execute.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@qdrant_client/embed/model_embedder.py` around lines 54 - 56, Currently
use_core_bm25 is set from is_local_mode which is incorrect; change the logic in
model_embedder.py so self._is_builtin_embedder_available is determined by an
explicit server-capability check (e.g., a capability flag exposed by
QdrantClient or a call that verifies core BM25 support) instead of
is_local_mode, and pass that boolean into Embedder(use_core_bm25=...); update
any initialization that uses FastEmbedMisc.is_installed() and Embedder to
combine local availability and server capability (refer to symbols
self._is_builtin_embedder_available, FastEmbedMisc.is_installed(), and
Embedder(use_core_bm25=...)) so remote clients without core BM25 won't report
support or receive BM25 requests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only 1 minor version backward compatibility is guaranteed, the current version is 1.17.0, while bm25 was introduced in 1.15.3, so backward compatibility is not required

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joein, thanks for the clarification! Given that only 1 minor version backward compatibility is guaranteed and the current version is 1.17.0, any supported server will already be at ≥ 1.16.x — well past the 1.15.3 release that introduced BM25. The simplification to not is_local_mode is therefore correct. I'll withdraw my concern.


✏️ Learnings added
Learnt from: joein
Repo: qdrant/qdrant-client PR: 1168
File: qdrant_client/embed/model_embedder.py:54-56
Timestamp: 2026-03-13T10:56:19.802Z
Learning: The qdrant-client guarantees only 1 minor version of backward compatibility. As of server version 1.17.0, BM25 (introduced in 1.15.3) is always available on any supported remote Qdrant server, so no server-capability check for BM25 is needed. The simplified logic `self._is_builtin_embedder_available = not is_local_mode` in `qdrant_client/embed/model_embedder.py` is intentional and correct.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread qdrant_client/qdrant_remote.py Outdated
Comment on lines +224 to +260
if check_compatibility:
try:
client_version = importlib.metadata.version("qdrant-client")
self.server_version = get_server_version(
self.rest_uri, self._rest_headers, self._rest_args.get("auth")
)
Thread(
target=self._check_compatibility,
args=(
self.rest_uri,
self._rest_headers,
self._rest_args.get("auth"),
self._timeout,
),
daemon=True,
).start()

if not self.server_version:
show_warning(
message="Failed to obtain server version. Unable to check client-server compatibility."
" Set check_compatibility=False to skip version check.",
category=UserWarning,
stacklevel=4,
)
elif not is_compatible(client_version, self.server_version):
show_warning(
message=f"Qdrant client version {client_version} is incompatible with server "
f"version {self.server_version}. Major versions should match and minor version difference "
"must not exceed 1. Set check_compatibility=False to skip version check.",
category=UserWarning,
stacklevel=4,
)
except Exception as er:
logging.debug(
f"Unable to get server version: {er}, server version defaults to None"
@staticmethod
def _check_compatibility(
rest_uri: str, rest_headers: dict[str, Any], auth: Any, timeout: int
) -> None:
try:
client_version = importlib.metadata.version("qdrant-client")
server_version = get_server_version(rest_uri, rest_headers, auth, timeout)

if not server_version:
show_warning(
message="Failed to obtain server version. Unable to check client-server compatibility."
" Set check_compatibility=False to skip version check.",
category=UserWarning,
stacklevel=4,
)
elif not is_compatible(client_version, server_version):
show_warning(
message=f"Qdrant client version {client_version} is incompatible with server "
f"version {server_version}. Major versions should match and minor version difference "
"must not exceed 1. Set check_compatibility=False to skip version check.",
category=UserWarning,
stacklevel=4,
)
except Exception as er:
logging.debug(f"Unable to get server version: {er}, server version defaults to None")

@coderabbitai coderabbitai Bot Mar 13, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

Script executed:

#!/bin/bash
# Demonstrates the two behavior changes introduced by moving warnings onto a daemon thread:
# 1) a constructor-like warnings.catch_warnings() scope can miss the warning entirely
# 2) stacklevel now resolves inside threading.py instead of the caller
python - <<'PY'
import threading, time, warnings

def emit_late():
    time.sleep(0.05)
    warnings.warn("compatibility", stacklevel=4)

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=emit_late, daemon=True)
    t.start()
print("captured during constructor scope:", len(seen))
t.join()

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=lambda: warnings.warn("compatibility", stacklevel=4))
    t.start()
    t.join()

w = seen[0]
print("warning location:", w.filename, w.lineno)
PY

Repository: qdrant/qdrant-client


Repository: qdrant/qdrant-client
Exit code: 0

stdout:

captured during constructor scope: 0
warning location: /usr/lib/python3.11/threading.py 995

stderr:

/usr/lib/python3.11/threading.py:995: UserWarning: compatibility
  self._bootstrap_inner()

Keep the threaded compatibility check observable.

The __init__ method now returns before the version check runs, and the daemon thread can be dropped at interpreter shutdown. This breaks warning observability in two ways:

  1. Warnings are missed by test scopes: warnings.catch_warnings() contexts around client construction will not capture warnings emitted from the daemon thread, since execution exits the context manager before the thread produces output. Tests like tests/embed_tests/test_builtin_inference.py:78-92 become timing-sensitive and fragile.

  2. Stack frame attribution is broken: stacklevel=4 no longer points at the user's call site because the warning originates from inside threading.Thread.run (e.g., /usr/lib/python3.11/threading.py:995), not from the caller's code.

If the non-blocking initialization is required, at minimum provide a joinable handle or explicit synchronous wait path so tests and short-lived scripts can observe warnings. Alternatively, emit the warning from the synchronous init path.

🧰 Tools
🪛 Ruff (0.15.5)

[warning] 259-259: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@qdrant_client/qdrant_remote.py` around lines 224 - 260, The compatibility
check currently runs in a daemon thread started in __init__, causing warnings to
be missed and stacklevel attribution to be wrong; update the Thread creation so
it's non-daemon and assign it to an instance attribute (e.g.,
self._compatibility_thread) so callers/tests can join it, and add an optional
flag (e.g., check_compatibility_blocking) to __init__ that, when True, calls
QdrantRemote._check_compatibility(...) synchronously instead of spawning a
thread; ensure you reference the Thread creation site in __init__ and the
_check_compatibility staticmethod when implementing these changes so tests can
observe warnings and correct stacklevel.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so stacklevel should become 5?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joein — Unfortunately, no. Changing stacklevel alone can't fix the attribution problem here.

The issue is more fundamental: when a warning is emitted from inside a Thread, the thread's call stack is entirely independent of the spawning call stack. The user's QdrantClient(...) frame simply does not exist in the thread's stack. The thread's stack looks roughly like:

warnings.warn(...)          ← frame 0
show_warning(...)           ← frame 1
_check_compatibility(...)   ← frame 2
Thread._bootstrap_inner()   ← frame 3 (threading internals)

No matter what stacklevel you choose, you'll land in threading internals (as the script output showed: threading.py:995) — or even cause an IndexError if the value is too large. You can never traverse back to the user's QdrantClient() call because that frame isn't on the thread's stack.

The only ways to get correct attribution are:

  1. Run synchronously (the original approach) — the full call stack from user code to warnings.warn is intact.
  2. Capture the original stack info before spawning the thread and format the warning message manually to include the caller's filename/lineno.

Since the PR's intent is to avoid blocking __init__, option 2 would look something like capturing inspect.stack() at __init__ time and embedding the location in the warning message string itself, rather than relying on stacklevel.


🧠 Learnings used
Learnt from: joein
Repo: qdrant/qdrant-client PR: 1168
File: qdrant_client/embed/model_embedder.py:54-56
Timestamp: 2026-03-13T10:56:19.802Z
Learning: The qdrant-client guarantees only 1 minor version of backward compatibility. As of server version 1.17.0, BM25 (introduced in 1.15.3) is always available on any supported remote Qdrant server, so no server-capability check for BM25 is needed. The simplified logic `self._is_builtin_embedder_available = not is_local_mode` in `qdrant_client/embed/model_embedder.py` is intentional and correct.

@tbung

tbung commented Mar 13, 2026

Copy link
Copy Markdown

Well, approval conditioned on fixed tests.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_async_qdrant_client.py (1)

546-553: ⚠️ Potential issue | 🟠 Major

Make test_async_auth deterministic for threaded compatibility checks

Line 549 and Line 552 assume a fixed token order, but compatibility checking now runs in a background thread, so the first two observed tokens are timing-dependent (matches the CI failure: expected token_1, got token_0).

💡 Proposed test fix (order-independent, still validates provider calls)
     client = AsyncQdrantClient(timeout=3, auth_token_provider=auth_token_provider)
-    await client.get_collections()
-
-    assert sync_token == "token_1"
-
-    await client.get_collections()
-    assert sync_token == "token_2"
+    prev = call_num
+    await client.get_collections()
+    assert call_num >= prev + 1
+
+    prev = call_num
+    await client.get_collections()
+    assert call_num >= prev + 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_async_qdrant_client.py` around lines 546 - 553, The test assumes a
deterministic token order but background-threaded compatibility checks make the
order racey; instead of asserting sync_token equals "token_1" then "token_2",
call AsyncQdrantClient.get_collections twice (using client =
AsyncQdrantClient(...) and auth_token_provider) and capture the two observed
sync_token values into a list, then assert that the set of those two values
equals the expected set {"token_1", "token_2"} (and optionally assert the
auth_token_provider was invoked twice) so the test validates the provider calls
without depending on ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@tests/test_async_qdrant_client.py`:
- Around line 546-553: The test assumes a deterministic token order but
background-threaded compatibility checks make the order racey; instead of
asserting sync_token equals "token_1" then "token_2", call
AsyncQdrantClient.get_collections twice (using client = AsyncQdrantClient(...)
and auth_token_provider) and capture the two observed sync_token values into a
list, then assert that the set of those two values equals the expected set
{"token_1", "token_2"} (and optionally assert the auth_token_provider was
invoked twice) so the test validates the provider calls without depending on
ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7093d28-ca33-43f7-9e37-37c6d6400304

📥 Commits

Reviewing files that changed from the base of the PR and between 1b3a8da and b6761a3.

📒 Files selected for processing (3)
  • qdrant_client/async_qdrant_remote.py
  • qdrant_client/qdrant_remote.py
  • tests/test_async_qdrant_client.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
qdrant_client/async_qdrant_remote.py (1)

173-209: ⚠️ Potential issue | 🟠 Major

Same off-thread warning regression in the async client.

Lines 173-209 have the same behavior change as the sync path: warning capture around client construction becomes timing-sensitive, the daemon thread can be dropped at interpreter shutdown, and stacklevel=2 still cannot attribute the warning to the caller. Since this file is autogenerated, please land the fix in the generated source/template as well.

🔍 Reproduce the same Python warning/threading behavior
#!/bin/bash
python - <<'PY'
import threading
import time
import warnings

def emit_from_worker():
    time.sleep(0.05)
    warnings.warn("compatibility", stacklevel=2)

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=emit_from_worker, daemon=True)
    t.start()
print("captured during constructor-like scope:", len(seen))
t.join()

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=emit_from_worker)
    t.start()
    t.join()

warning = seen[0]
print("warning location:", warning.filename, warning.lineno)
PY

Expected result: the first capture count is 0, and the warning location is not the original caller frame.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@qdrant_client/async_qdrant_remote.py` around lines 173 - 209, The off-thread
warning issue is due to starting the compatibility check as a daemon thread and
using stacklevel=2; change the Thread invocation in the constructor to not set
daemon=True (start a non-daemon thread) so the worker isn’t killed at
interpreter shutdown, and update the show_warning calls inside
_check_compatibility to use a higher stacklevel (e.g., stacklevel=3) so the
warning correctly attributes to the caller; make the same edits in the
generated/template source for async_qdrant_remote.py and any corresponding
generated async template.
qdrant_client/qdrant_remote.py (1)

224-265: ⚠️ Potential issue | 🟠 Major

Daemonizing the compatibility check makes the warning unreliable.

Lines 224-265 now emit compatibility warnings from a daemon worker. That makes warnings.catch_warnings() around client construction timing-sensitive, lets short-lived scripts exit before the check runs, and stacklevel=2 still cannot recover the caller frame because that frame is not on the worker thread’s stack. If this warning is meant to stay user-observable, please keep a joinable or synchronous path instead of relying only on the background thread.

🔍 Reproduce the warning behavior change
#!/bin/bash
python - <<'PY'
import threading
import time
import warnings

def emit_from_worker():
    time.sleep(0.05)
    warnings.warn("compatibility", stacklevel=2)

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=emit_from_worker, daemon=True)
    t.start()
print("captured during constructor-like scope:", len(seen))
t.join()

with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    t = threading.Thread(target=emit_from_worker)
    t.start()
    t.join()

warning = seen[0]
print("warning location:", warning.filename, warning.lineno)
PY

Expected result: the first capture count is 0, and the warning location is not the original caller frame.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@qdrant_client/qdrant_remote.py` around lines 224 - 265, The compatibility
warning is emitted from a daemon worker which makes warnings.catch_warnings and
short-lived processes unreliable; change the background check to be joinable or
synchronous instead of a daemon fire-and-forget. Replace the Thread(...)
daemon=True start with either a synchronous call to _check_compatibility(...)
when check_compatibility is requested, or create a non-daemon Thread()
(daemon=False), store it on the instance (e.g., self._compatibility_thread) so
callers can join it, and document/keep a boolean flag (e.g.,
check_compatibility_async) if you want to preserve async behavior; update the
Thread creation site and references to _check_compatibility, rest_uri,
_rest_headers, and _rest_args.get("auth") accordingly so the warning runs on the
main/joinable thread and preserves correct stacklevel.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@qdrant_client/async_qdrant_remote.py`:
- Around line 173-209: The off-thread warning issue is due to starting the
compatibility check as a daemon thread and using stacklevel=2; change the Thread
invocation in the constructor to not set daemon=True (start a non-daemon thread)
so the worker isn’t killed at interpreter shutdown, and update the show_warning
calls inside _check_compatibility to use a higher stacklevel (e.g.,
stacklevel=3) so the warning correctly attributes to the caller; make the same
edits in the generated/template source for async_qdrant_remote.py and any
corresponding generated async template.

In `@qdrant_client/qdrant_remote.py`:
- Around line 224-265: The compatibility warning is emitted from a daemon worker
which makes warnings.catch_warnings and short-lived processes unreliable; change
the background check to be joinable or synchronous instead of a daemon
fire-and-forget. Replace the Thread(...) daemon=True start with either a
synchronous call to _check_compatibility(...) when check_compatibility is
requested, or create a non-daemon Thread() (daemon=False), store it on the
instance (e.g., self._compatibility_thread) so callers can join it, and
document/keep a boolean flag (e.g., check_compatibility_async) if you want to
preserve async behavior; update the Thread creation site and references to
_check_compatibility, rest_uri, _rest_headers, and _rest_args.get("auth")
accordingly so the warning runs on the main/joinable thread and preserves
correct stacklevel.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb0c606e-2ea3-4dbc-9baa-1513f697f255

📥 Commits

Reviewing files that changed from the base of the PR and between 3d52413 and ce31ae3.

📒 Files selected for processing (2)
  • qdrant_client/async_qdrant_remote.py
  • qdrant_client/qdrant_remote.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_qdrant_client.py`:
- Line 1812: Replace the hardcoded time.sleep(0.5) waits in
tests/test_qdrant_client.py with a bounded polling loop that checks for the
actual side-effect produced by the background compatibility thread (e.g.,
presence/updated value of the auth token or whatever state the test asserts).
Implement a loop that polls the condition every 10–50ms up to a short timeout
(e.g., 2–5s), breaking when the condition is met and failing the test if the
timeout elapses; replace both occurrences of time.sleep(0.5) with this polling
logic so the test waits deterministically for the qdrant_remote compatibility
thread’s effect instead of sleeping.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1d503e47-89f1-47f0-9b7b-9a4a6cff206d

📥 Commits

Reviewing files that changed from the base of the PR and between ce31ae3 and 88b5c48.

📒 Files selected for processing (1)
  • tests/test_qdrant_client.py


# Additional sync request is sent during client init to check compatibility
client = QdrantClient(auth_token_provider=auth_token_provider)
time.sleep(0.5)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Replace fixed sleeps with bounded polling to avoid flaky auth-token timing assertions.

Line 1812 and Line 1832 use a hardcoded 0.5s delay to “wait” for the background compatibility thread. Since that thread is asynchronous and unsynchronized (qdrant_client/qdrant_remote.py:176-186), this can still race on slow CI and fail nondeterministically.

Proposed fix
 def test_auth_token_provider():
@@
     def auth_token_provider():
@@
         call_num += 1
         return token
+
+    def wait_for_background_auth_call(expected_min_calls: int = 1, timeout_s: float = 2.0) -> None:
+        deadline = time.time() + timeout_s
+        while call_num < expected_min_calls and time.time() < deadline:
+            time.sleep(0.01)
+        assert call_num >= expected_min_calls, "Background compatibility check did not trigger auth provider in time"
@@
     # Additional sync request is sent during client init to check compatibility
     client = QdrantClient(auth_token_provider=auth_token_provider)
-    time.sleep(0.5)
+    wait_for_background_auth_call(expected_min_calls=1)
@@
     # Additional sync http request is sent during client init to check compatibility
     client = QdrantClient(prefer_grpc=True, auth_token_provider=auth_token_provider)
-    time.sleep(0.5)
+    wait_for_background_auth_call(expected_min_calls=1)

Also applies to: 1832-1832

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_qdrant_client.py` at line 1812, Replace the hardcoded
time.sleep(0.5) waits in tests/test_qdrant_client.py with a bounded polling loop
that checks for the actual side-effect produced by the background compatibility
thread (e.g., presence/updated value of the auth token or whatever state the
test asserts). Implement a loop that polls the condition every 10–50ms up to a
short timeout (e.g., 2–5s), breaking when the condition is met and failing the
test if the timeout elapses; replace both occurrences of time.sleep(0.5) with
this polling logic so the test waits deterministically for the qdrant_remote
compatibility thread’s effect instead of sleeping.

@joein joein merged commit a810088 into dev Mar 13, 2026
12 checks passed
joein added a commit that referenced this pull request Mar 13, 2026
#1168)

* new: run server version check in a thread, don't check bm25 availability as it was introduced in 1.15.3

* fix: fix tests for auth

* fix: fix test for auth again

* fix: update stacklevel for in-thread warnings

* fix: fix auth token test for sync client

* tests: remove outdated tests
@ns-github

Copy link
Copy Markdown

Thanks all :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants