wip: builtin bm25 support by joein · Pull Request #1060 · qdrant/qdrant-client

joein · 2025-08-14T15:28:09Z

wip:

check server version
add tests

netlify · 2025-08-14T15:28:24Z

✅ Deploy Preview for poetic-froyo-8baba7 ready!

Name	Link
🔨 Latest commit	`3875225`
🔍 Latest deploy log	https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/689f579fd2938b0008f0dc33
😎 Deploy Preview	https://deploy-preview-1060--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

coderabbitai · 2025-08-14T15:32:07Z

📝 Walkthrough

Walkthrough

Initialization for sync and async clients was reworked to choose local vs remote backends before invoking base-class constructors; base init now receives explicit is_local_mode and server_version. Async and sync remote clients expose a server_version attribute. ModelEmbedder now accepts is_local_mode and server_version, chooses between a new BuiltinEmbedder and the external Embedder at init, and delegates model-capability checks to the chosen embedder. Embedder gained classmethods for model-support queries. Fastembed mixins and fast-embed-related flows were updated to construct ModelEmbedder with the new flags and to runtime-check embedder types. New tests cover builtin (BM25) inference paths.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

fix: init_options should not crush with unpickleable objects #993 — Reorders client initialization and base-class wiring; closely related to the same client-init rework here.
new: support multimodal embeddings #908 — Modifies Embedder/ModelEmbedder support checks and inference resolution, overlapping the embedder selection/refactor in this PR.
new: move fastembed imports and constants into a class to support cus… #994 — Changes fastembed integration and FastEmbedMisc usage, related to fastembed import/usage changes and mixin updates.

Suggested reviewers

coszio

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch builtin-bm25

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 3

🔭 Outside diff range comments (4)

qdrant_client/async_qdrant_fastembed.py (2)

232-244: Replace assert with explicit runtime check to avoid failures under -O and give actionable error.

Using assert for control flow is brittle (Python strips asserts with -O), and will lead to AttributeError when BuiltinEmbedder lacks get_or_init_model. Raise a clear ImportError instead.

Apply this diff:
-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise ImportError(
+                "Client-side FastEmbed model initialization requires the 'fastembed' extra. "
+                "Install with `pip install qdrant-client[fastembed]` or use server-side/builtin inference paths."
+            )
252-264: Same here: avoid assert for the sparse model path.

Mirror the explicit check for the sparse initializer to prevent silent removal under -O and provide a clear message.

Apply this diff:
-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise ImportError(
+                "Client-side FastEmbed sparse model initialization requires the 'fastembed' extra. "
+                "Install with `pip install qdrant-client[fastembed]` or use server-side/builtin sparse inference."
+            )

qdrant_client/qdrant_fastembed.py (2)

242-251: Avoid assert for control-flow; raise a clear error when FastEmbed isn’t available.

Asserts are stripped with python -O and will lead to obscure AttributeError on BuiltinEmbedder. Use an explicit guard.

Apply this diff:

-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise ImportError(
+                "Client-side FastEmbed model initialization requires the 'fastembed' extra. "
+                "Install with `pip install qdrant-client[fastembed]` or use server/builtin inference."
+            )

262-270: Same fix for the sparse model initializer.

Mirror the explicit guard for the sparse model path.

Apply this diff:

-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise ImportError(
+                "Client-side FastEmbed sparse model initialization requires the 'fastembed' extra. "
+                "Install with `pip install qdrant-client[fastembed]` or use server/builtin sparse inference."
+            )

🧹 Nitpick comments (2)

qdrant_client/async_qdrant_client.py (1)

112-129: Clean local/remote split with is_local_mode propagation — looks solid.

_is_local_mode is computed correctly and propagated to the mixin via super().init(parser=..., is_local_mode=_is_local_mode, ...).

Local-path client creation (AsyncQdrantLocal) and the remote-path client creation (AsyncQdrantRemote) are mutually exclusive.

One nuance: kwargs are passed both to super() and later to AsyncQdrantRemote. If any user-provided kwargs are not expected by the HTTP client, httpx may error. If this has bitten us before, consider sanitizing or whitelisting kwargs before forwarding to AsyncQdrantRemote.

qdrant_client/qdrant_client.py (1)

116-139: Local/remote init path with is_local_mode propagation is coherent; minor nit on kwargs forwarding.

The introduction of _is_local_mode and passing it to the mixin constructor aligns with the broader local-vs-remote flow.

Inspector creation prior to super().init ensures the mixin has a parser for inference inspection.

Nit: kwargs are passed to both super() and QdrantRemote(). If any caller supplies unexpected keys intended only for high-level client construction, they may leak to the HTTP stack. Consider validating/whitelisting or documenting accepted kwargs for remote init to prevent httpx argument errors.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3d29466 and 2e73c08.

📒 Files selected for processing (6)

qdrant_client/async_qdrant_client.py (2 hunks)
qdrant_client/async_qdrant_fastembed.py (4 hunks)
qdrant_client/embed/embedder.py (1 hunks)
qdrant_client/embed/model_embedder.py (6 hunks)
qdrant_client/qdrant_client.py (2 hunks)
qdrant_client/qdrant_fastembed.py (4 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (6)

qdrant_client/embed/embedder.py (1)

qdrant_client/fastembed_common.py (5)

is_supported_text_model (175-190)

is_supported_image_model (193-208)

is_supported_late_interaction_text_model (211-228)

is_supported_late_interaction_multimodal_model (231-248)

is_supported_sparse_model (251-266)

qdrant_client/async_qdrant_fastembed.py (3)

qdrant_client/embed/model_embedder.py (2)

embed (344-392)

ModelEmbedder (43-457)

qdrant_client/embed/embedder.py (2)

embed (222-267)

Embedder (29-447)

qdrant_client/embed/schema_parser.py (1)

ModelSchemaParser (29-305)

qdrant_client/embed/model_embedder.py (4)

qdrant_client/embed/embedder.py (7)

embed (222-267)

Embedder (29-447)

is_supported_text_model (390-399)

is_supported_sparse_model (438-447)

is_supported_late_interaction_text_model (414-423)

is_supported_image_model (402-411)

is_supported_late_interaction_multimodal_model (426-435)

qdrant_client/embed/schema_parser.py (1)

ModelSchemaParser (29-305)

qdrant_client/embed/embed_inspector.py (1)

InspectorEmbed (13-176)

qdrant_client/fastembed_common.py (8)

FastEmbedMisc (36-266)

import_fastembed (70-78)

is_installed (45-67)

is_supported_text_model (175-190)

is_supported_sparse_model (251-266)

is_supported_late_interaction_text_model (211-228)

is_supported_image_model (193-208)

is_supported_late_interaction_multimodal_model (231-248)

qdrant_client/qdrant_client.py (2)

qdrant_client/client_base.py (1)

QdrantBase (7-540)

qdrant_client/embed/type_inspector.py (1)

Inspector (12-149)

qdrant_client/qdrant_fastembed.py (2)

qdrant_client/embed/model_embedder.py (2)

embed (344-392)

ModelEmbedder (43-457)

qdrant_client/embed/embedder.py (2)

embed (222-267)

Embedder (29-447)

qdrant_client/async_qdrant_client.py (2)

qdrant_client/async_client_base.py (1)

AsyncQdrantBase (17-501)

qdrant_client/embed/type_inspector.py (1)

Inspector (12-149)

🪛 GitHub Actions: Integration tests

qdrant_client/embed/model_embedder.py

[error] 9-9: ModuleNotFoundError: No module named 'qdrant_client.embed.builtin_embedder'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Redirect rules - poetic-froyo-8baba7
GitHub Check: Header rules - poetic-froyo-8baba7
GitHub Check: Pages changed - poetic-froyo-8baba7

🔇 Additional comments (7)

qdrant_client/async_qdrant_fastembed.py (2)

21-21: Good addition: explicit Embedder import for runtime type checks.

This makes the isinstance guard below possible without circular imports.

43-47: Constructor now wires local-mode into ModelEmbedder — aligns with the new init flow.

Forwarding is_local_mode down to ModelEmbedder is the right place to decide between local FastEmbed and Builtin embedder backends.

qdrant_client/qdrant_fastembed.py (2)

12-12: Explicit Embedder import enables safe runtime checks.

Good preparatory change for the isinstance guards below.

37-41: Constructor now accepts is_local_mode and forwards it — correct direction.

This ensures ModelEmbedder can import FastEmbed eagerly in local mode and choose between Embedder vs BuiltinEmbedder appropriately.

qdrant_client/embed/embedder.py (1)

389-447: Convenience support-check wrappers are a nice addition.

These classmethods provide a stable surface to query support without reaching into FastEmbedMisc directly. No functional risk introduced.

qdrant_client/embed/model_embedder.py (2)

423-453: Validate InferenceObject handling for late-interaction multimodal models.

You explicitly forbid InferenceObject for late-interaction multimodal models. If that’s intentional, OK. If the intent is to accept both text and image inputs for those models, this branch will throw unexpectedly even when value is valid.

Action: Confirm the intended UX. If multimodal should be supported via InferenceObject, consider dispatching based on the actual payload type (e.g., str -> Document, bytes/array -> Image) rather than raising.

397-402: Confirm BuiltinEmbedder implements the support-check API

FastEmbedMisc and Embedder both define the is_supported_* checks, but I could not find a BuiltinEmbedder definition in this repository. model_embedder calls these methods on self.embedder, so please confirm BuiltinEmbedder exposes the same names/signatures (cls/model_name: str -> bool) or guard the calls.

Files/locations to verify:

qdrant_client/fastembed_common.py — FastEmbedMisc: is_supported_text_model (line 175), is_supported_image_model (193), is_supported_late_interaction_text_model (211), is_supported_late_interaction_multimodal_model (231), is_supported_sparse_model (251)

qdrant_client/embed/embedder.py — Embedder: is_supported_text_model (390), is_supported_image_model (402), is_supported_late_interaction_text_model (414), is_supported_late_interaction_multimodal_model (426), is_supported_sparse_model (438)

qdrant_client/embed/model_embedder.py — import BuiltinEmbedder (line 9), factory use (line 58), isinstance(self.embedder, BuiltinEmbedder) (line 111), support-check calls at ~397–402 and ~442–445 (the code snippet you highlighted)

I couldn't locate qdrant_client/embed/builtin_embedder.py in the repo — if BuiltinEmbedder is defined elsewhere (or supplied externally), verify it implements the same API; otherwise add/implement these methods or change the runtime checks to avoid attribute errors.

coderabbitai · 2025-08-14T15:46:12Z

+    def __init__(
+        self,
+        parser: Optional[ModelSchemaParser] = None,
+        is_local_mode: bool = False,
+        **kwargs: Any,
+    ):


🛠️ Refactor suggestion

Initialize embedder based on local/remote mode and avoid hard import; also track builtin embedder via a flag.

The current selection ignores is_local_mode during embedder choice and will still try to import fastembed or instantiate BuiltinEmbedder regardless. It also calls FastEmbedMisc.import_fastembed() without using the result and may raise unnecessarily. Align behavior with the intention:

Local mode: require fastembed and use the local Embedder.

Remote mode: use BuiltinEmbedder; fail fast with a clear error if the module isn’t available yet.

Track whether the chosen embedder is builtin without relying on isinstance() against a potentially missing class.

Apply this refactor:

@@ def __init__( self, parser: Optional[ModelSchemaParser] = None, is_local_mode: bool = False, **kwargs: Any, ): self._batch_accumulator: dict[str, list[INFERENCE_OBJECT_TYPES]] = {} self._embed_storage: dict[str, list[NumericVector]] = {} self._embed_inspector = InspectorEmbed(parser=parser) - if is_local_mode: - FastEmbedMisc.import_fastembed() - self.embedder = ( - Embedder(**kwargs) if FastEmbedMisc.is_installed() else BuiltinEmbedder(**kwargs) - ) + # Track whether we are using the builtin (remote) embedder without depending on its type. + self._is_builtin_embedder: bool = False + + if is_local_mode: + # Local mode should rely on local embedding backend (fastembed). + # Fail fast with a clear message if fastembed is not available. + try: + FastEmbedMisc.import_fastembed() + except ImportError as e: + raise ImportError( + "Local mode requires 'fastembed'. Install with `pip install qdrant-client[fastembed]` " + "or initialize the client with is_local_mode=False to use the builtin (server-side) embedder." + ) from e + self.embedder = Embedder(**kwargs) + self._is_builtin_embedder = False + else: + # Remote mode should use the builtin embedder when available. + try: + from qdrant_client.embed.builtin_embedder import BuiltinEmbedder # type: ignore + except ModuleNotFoundError as e: + raise ModuleNotFoundError( + "qdrant_client.embed.builtin_embedder is missing. " + "Add the module to this PR or fix the import path." + ) from e + self.embedder = BuiltinEmbedder(**kwargs) + self._is_builtin_embedder = True

Benefits:

Unblocks import-time failure.

Makes the local vs remote selection explicit and predictable.

Removes reliance on a missing symbol for isinstance checks later.

Also applies to: 55-59

🤖 Prompt for AI Agents

In qdrant_client/embed/model_embedder.py around lines 46-51 and 55-59, the constructor currently ignores is_local_mode, unconditionally attempts to import/instantiate fastembed and BuiltinEmbedder, and later relies on isinstance() against a class that may not be imported; change it so that when is_local_mode is True you require and use the local fastembed Embedder (import fastembed lazily and raise a clear error if missing), and when is_local_mode is False you instantiate BuiltinEmbedder without importing fastembed; set a boolean flag self._is_builtin_embedder = True/False at construction time rather than using isinstance() later; remove the unused FastEmbedMisc.import_fastembed() call or use its return only when in local mode, and ensure any import errors are raised with explicit messages indicating which mode requires which module.

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (2)

qdrant_client/qdrant_fastembed.py (2)

246-254: Avoid bare asserts; raise actionable error when fastembed backend is required

Asserting the concrete Embedder type will raise an AssertionError with little guidance if the builtin embedder path is active. Prefer a clear exception guiding users to enable local inference or install fastembed.

Apply:

-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise RuntimeError(
+                "This operation requires the fastembed backend. "
+                "Enable local inference or install the fastembed extras: "
+                "`pip install 'qdrant-client[fastembed]'`."
+            )

266-274: Mirror the clearer error for sparse-model init

Same rationale as above for _get_or_init_sparse_model.

-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise RuntimeError(
+                "Sparse model initialization requires the fastembed backend. "
+                "Enable local inference or install the fastembed extras: "
+                "`pip install 'qdrant-client[fastembed]'`."
+            )

♻️ Duplicate comments (1)

qdrant_client/embed/model_embedder.py (1)
55-60: Embedder selection ignores is_local_mode/server_version — selects by fastembed presence only

As written, the chosen embedder depends solely on FastEmbedMisc.is_installed(). This breaks the intended behavior:

Remote + server supports builtin (>= 1.15.3): should choose BuiltinEmbedder even if fastembed is installed.

Local mode: must require fastembed and use Embedder.

Remote + older server: fallback to fastembed if available, otherwise error.

Fix the decision tree to honor mode/server capability:
-        if is_local_mode or not self._builtin_embedder_supported(server_version):
-            FastEmbedMisc().is_installed()
-            FastEmbedMisc.import_fastembed()
-
-        self.embedder = Embedder() if FastEmbedMisc.is_installed() else BuiltinEmbedder()
+        if is_local_mode:
+            # Local mode requires fastembed
+            FastEmbedMisc.import_fastembed()
+            self.embedder = Embedder()
+        else:
+            if self._builtin_embedder_supported(server_version):
+                # Prefer builtin embedder on supported remote servers
+                self.embedder = BuiltinEmbedder()
+            else:
+                # Older remote servers: fall back to fastembed if available
+                FastEmbedMisc.import_fastembed()
+                self.embedder = Embedder()
Optional: track self._is_builtin_embedder: bool for downstream conditionals instead of isinstance(...).

🧹 Nitpick comments (10)

qdrant_client/qdrant_remote.py (1)
209-231: Persisting server_version is good; add type annotation and avoid re-fetching client_version.

Storing self.server_version is useful and aligns with the rest of the PR. Recommend adding a class-level type annotation for clarity and tooling.

Minor nit: you already computed client_version earlier (Line 146). Re-fetching it on Line 212 is redundant.

Apply this diff to drop the duplicate client_version fetch:
-                client_version = importlib.metadata.version("qdrant-client")
Additionally, consider adding this class-level annotation (outside the selected range):
# directly under DEFAULT_GRPC_TIMEOUT
server_version: Optional[str]
qdrant_client/async_qdrant_remote.py (1)
174-197: Expose server_version: looks good; add type annotation and remove duplicate client_version lookup.

Keeping self.server_version on the instance is a good improvement and consistent with the sync client.

You already have client_version (Line 130). Re-fetching it on Line 177 is unnecessary.

Apply this diff to avoid redundant work:
-                client_version = importlib.metadata.version("qdrant-client")
Also consider a class-level type annotation (outside the selected range):
# directly under DEFAULT_GRPC_TIMEOUT
server_version: Optional[str]
qdrant_client/async_qdrant_fastembed.py (2)
246-254: Prefer explicit, informative guard over assert for runtime type check.

Asserts can be disabled with -O and yield a cryptic message. Raise a clear exception if the embedder is not FastEmbed-backed.
-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise RuntimeError(
+                "FastEmbed path invoked but ModelEmbedder is not using Embedder. "
+                "This likely indicates builtin/server-side embedding is selected. "
+                "Avoid calling FastEmbed-specific APIs in this mode."
+            )
266-274: Same here: replace assert with explicit error for sparse model path.
-        assert isinstance(self._model_embedder.embedder, Embedder)
+        if not isinstance(self._model_embedder.embedder, Embedder):
+            raise RuntimeError(
+                "FastEmbed sparse path invoked but ModelEmbedder is not using Embedder. "
+                "This likely indicates builtin/server-side embedding is selected."
+            )
qdrant_client/qdrant_client.py (1)

116-161: Backend-first init and server_version propagation: LGTM

Selecting local/remote before base init, capturing server_version for the remote path, and wiring the inspector/parser into super().init keeps responsibilities well-scoped. The cloud_inference guard for local mode remains correct.

Minor: consider caching is_local = isinstance(self._client, QdrantLocal) once to reuse on Lines 149 and 160.
qdrant_client/embed/builtin_embedder.py (2)
13-35: Return type doesn’t match implementation; make it explicit Documents list

The method returns a list of models.Document, but it’s annotated as NumericVector. This misleads type checkers and readers.
-    ) -> NumericVector:
+    ) -> list[models.Document]:
If you prefer to keep compatibility with the external embedder interface, loosen to Iterable[Any] or define a shared alias encompassing both numeric vectors and Document outputs.

85-95: Case-insensitive membership can avoid per-call list construction

Converting the supported models to lowercase on every call builds a new list each time. Precompute a frozenset for O(1) membership and no allocations.
-class BuiltinEmbedder:
-    _SUPPORTED_MODELS = ("Qdrant/Bm25",)
+class BuiltinEmbedder:
+    _SUPPORTED_MODELS = ("Qdrant/Bm25",)
+    _SUPPORTED_MODELS_CI = frozenset(m.lower() for m in _SUPPORTED_MODELS)
@@
-        return model_name.lower() in [model.lower() for model in cls._SUPPORTED_MODELS]
+        return model_name.lower() in cls._SUPPORTED_MODELS_CI
qdrant_client/embed/model_embedder.py (3)
55-58: Avoid instantiating FastEmbedMisc just to call class methods

Use the class methods directly.
-            FastEmbedMisc().is_installed()
-            FastEmbedMisc.import_fastembed()
+            FastEmbedMisc.is_installed()
+            FastEmbedMisc.import_fastembed()
61-78: Simplify version check logic

Minor readability improvement; return the comparison result directly.
-        try:
-            major, minor, patch = server_version.split(".")
-            patch = patch.split("-")[0]
-
-            if (int(major), int(minor), int(patch)) >= (1, 15, 3):
-                return True
-
-            return False
-        except Exception:
-            return True
+        try:
+            major, minor, patch = server_version.split(".")
+            patch = patch.split("-")[0]
+            return (int(major), int(minor), int(patch)) >= (1, 15, 3)
+        except Exception:
+            return True
129-134: Coupling on BuiltinEmbedder type check; consider a backend flag

This branch avoids multiprocess embedding for the builtin path by checking isinstance(self.embedder, BuiltinEmbedder). It’s workable, but an internal boolean (e.g., self._is_builtin_embedder) set at construction is more explicit and avoids type coupling.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 2e73c08 and ad6cc11.

📒 Files selected for processing (9)

qdrant_client/async_qdrant_client.py (2 hunks)
qdrant_client/async_qdrant_fastembed.py (4 hunks)
qdrant_client/async_qdrant_remote.py (1 hunks)
qdrant_client/embed/builtin_embedder.py (1 hunks)
qdrant_client/embed/embedder.py (2 hunks)
qdrant_client/embed/model_embedder.py (6 hunks)
qdrant_client/qdrant_client.py (3 hunks)
qdrant_client/qdrant_fastembed.py (4 hunks)
qdrant_client/qdrant_remote.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

qdrant_client/embed/embedder.py

🧰 Additional context used

🧬 Code Graph Analysis (8)

qdrant_client/async_qdrant_remote.py (2)

qdrant_client/common/version_check.py (2)

get_server_version (12-28)

is_compatible (43-65)

qdrant_client/common/client_warnings.py (1)

show_warning (7-8)

qdrant_client/qdrant_remote.py (2)

qdrant_client/common/version_check.py (2)

get_server_version (12-28)

is_compatible (43-65)

qdrant_client/common/client_warnings.py (1)

show_warning (7-8)

qdrant_client/embed/builtin_embedder.py (2)

qdrant_client/embed/model_embedder.py (1)

embed (363-411)

qdrant_client/embed/embedder.py (6)

embed (222-267)

is_supported_sparse_model (438-447)

is_supported_text_model (390-399)

is_supported_image_model (402-411)

is_supported_late_interaction_text_model (414-423)

is_supported_late_interaction_multimodal_model (426-435)

qdrant_client/qdrant_client.py (3)

qdrant_client/client_base.py (1)

QdrantBase (7-540)

qdrant_client/embed/type_inspector.py (1)

Inspector (12-149)

qdrant_client/local/qdrant_local.py (1)

QdrantLocal (38-1260)

qdrant_client/embed/model_embedder.py (3)

qdrant_client/embed/builtin_embedder.py (7)

embed (13-34)

BuiltinEmbedder (7-94)

is_supported_text_model (37-46)

is_supported_sparse_model (85-94)

is_supported_late_interaction_text_model (61-70)

is_supported_image_model (49-58)

is_supported_late_interaction_multimodal_model (73-82)

qdrant_client/embed/embedder.py (7)

embed (222-267)

Embedder (29-447)

is_supported_text_model (390-399)

is_supported_sparse_model (438-447)

is_supported_late_interaction_text_model (414-423)

is_supported_image_model (402-411)

is_supported_late_interaction_multimodal_model (426-435)

qdrant_client/fastembed_common.py (8)

FastEmbedMisc (36-266)

is_installed (45-67)

import_fastembed (70-78)

is_supported_text_model (175-190)

is_supported_sparse_model (251-266)

is_supported_late_interaction_text_model (211-228)

is_supported_image_model (193-208)

is_supported_late_interaction_multimodal_model (231-248)

qdrant_client/async_qdrant_fastembed.py (4)

qdrant_client/embed/model_embedder.py (2)

embed (363-411)

ModelEmbedder (43-476)

qdrant_client/embed/embedder.py (2)

embed (222-267)

Embedder (29-447)

qdrant_client/embed/schema_parser.py (1)

ModelSchemaParser (29-305)

qdrant_client/fastembed_common.py (2)

FastEmbedMisc (36-266)

is_installed (45-67)

qdrant_client/async_qdrant_client.py (3)

qdrant_client/async_client_base.py (1)

AsyncQdrantBase (17-501)

qdrant_client/local/async_qdrant_local.py (1)

AsyncQdrantLocal (38-1164)

qdrant_client/embed/type_inspector.py (1)

Inspector (12-149)

qdrant_client/qdrant_fastembed.py (4)

qdrant_client/embed/model_embedder.py (2)

embed (363-411)

ModelEmbedder (43-476)

qdrant_client/embed/builtin_embedder.py (1)

embed (13-34)

qdrant_client/embed/embedder.py (2)

embed (222-267)

Embedder (29-447)

qdrant_client/fastembed_common.py (2)

FastEmbedMisc (36-266)

is_installed (45-67)

🪛 Ruff (0.12.2)

qdrant_client/embed/model_embedder.py

73-76: Return the condition (int(major), int(minor), int(patch)) >= (1, 15, 3) directly

Replace with return (int(major), int(minor), int(patch)) >= (1, 15, 3)

(SIM103)

🪛 GitHub Actions: Integration tests

qdrant_client/async_qdrant_fastembed.py

[error] 46-51: Generated async_qdrant_fastembed.py is not consistent with file in this repository, see diff above.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Redirect rules - poetic-froyo-8baba7
GitHub Check: Header rules - poetic-froyo-8baba7
GitHub Check: Pages changed - poetic-froyo-8baba7

🔇 Additional comments (6)

qdrant_client/async_qdrant_fastembed.py (3)

21-21: Importing Embedder is appropriate here.

The import is used by the new runtime type assertions. No concerns.

43-53: Autogenerated file mismatch & unsafe init chaining — manual verification required

The autogenerated file diverges from the generator and the new init drops **kwargs and does not forward is_local_mode/server_version to the base class — this risks breaking callers and inconsistent state. Move changes into the generator template and re-run codegen; do not keep one-off edits in generated files.

Files / actions:

qdrant_client/async_qdrant_fastembed.py — init (approx. lines 43–53): restore safe init chaining and keep kwargs forwarding.

Update the generator/template used by tools/generate_async_client.sh and re-run codegen, then commit generated output.

Suggested patch to restore safe chaining:
-    def __init__(
-        self, parser: ModelSchemaParser, is_local_mode: bool, server_version: Optional[str]
-    ):
+    def __init__(
+        self,
+        parser: ModelSchemaParser,
+        is_local_mode: bool = False,
+        server_version: Optional[str] = None,
+        **kwargs: Any,
+    ):
         self._embedding_model_name: Optional[str] = None
         self._sparse_embedding_model_name: Optional[str] = None
         self._model_embedder = ModelEmbedder(
             parser=parser, is_local_mode=is_local_mode, server_version=server_version
         )
         self.__class__._FASTEMBED_INSTALLED = FastEmbedMisc.is_installed()
-        super().__init__()
+        # forward to base; keep kwargs for future-compat
+        super().__init__(is_local_mode=is_local_mode, server_version=server_version, **kwargs)
Please verify the base-class constructors accept these forwarded parameters. Run locally and paste results if uncertain:
#!/bin/bash
set -euo pipefail
echo "Find AsyncQdrantBase definition:"
rg -n --hidden -S 'class\s+AsyncQdrantBase\b' || true
rg -n -A2 -B2 -S 'class\s+AsyncQdrantBase\b.*?def\s+__init__\s*\(' || true

echo "Find QdrantBase definition:"
rg -n --hidden -S 'class\s+QdrantBase\b' || true
rg -n -A2 -B2 -S 'class\s+QdrantBase\b.*?def\s+__init__\s*\(' || true
If those defs are not found, or signatures differ, adjust the forwarding accordingly and ensure generator templates are updated.

43-53: No regeneration needed — async_qdrant_fastembed.py is up-to-date

Quick verification:

qdrant_client/async_qdrant_fastembed.py contains the autogenerated header and the init snippet (lines 43–52) matches the generated output.

tools/generate_async_client.sh invokes the fastembed generator and moves async_qdrant_fastembed.py into place.

tools/async_client_generator/fastembed_generator.py writes async_qdrant_fastembed.py.

git status / git diff reported no uncommitted changes.

Conclusion: The CI claim that this file is inconsistent with generated output appears incorrect. No regeneration required; ignore/close the original comment.

Likely an incorrect or invalid review comment.
qdrant_client/async_qdrant_client.py (1)

112-152: Verified: mixin signature and remote server_version present

Confirmed both checks — no changes required.

qdrant_client/async_qdrant_fastembed.py — init(self, parser: ModelSchemaParser, is_local_mode: bool, server_version: Optional[str]) (calls ModelEmbedder with those args)

qdrant_client/async_qdrant_remote.py — assigns self.server_version in init (uses get_server_version)

qdrant_client/async_qdrant_client.py — calls super().init(parser=..., is_local_mode=..., server_version=server_version) and is consistent with the mixin/base

qdrant_client/qdrant_fastembed.py (1)

37-47: Init wiring looks correct; ModelEmbedder constructed with mode/version

Passing parser/is_local_mode/server_version into ModelEmbedder and keeping QdrantBase super init clean is the right direction. No issues here.

qdrant_client/embed/model_embedder.py (1)

416-421: Good: model support checks flow through the selected backend

Using self.embedder.is_supported_* centralizes capability decisions per backend and keeps the logic consistent for both builtin and fastembed paths.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (4)

qdrant_client/embed/model_embedder.py (4)

9-9: Missing module import breaks the build.

The BuiltinEmbedder module is not present in the repository, causing a ModuleNotFoundError at import time. This needs to be fixed before the PR can be merged.

Until the BuiltinEmbedder module is added, defer the import to runtime:

-from qdrant_client.embed.builtin_embedder import BuiltinEmbedder
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from qdrant_client.embed.builtin_embedder import BuiltinEmbedder

Then import it conditionally at runtime where needed (e.g., in __init__):

if not is_local_mode:
    from qdrant_client.embed.builtin_embedder import BuiltinEmbedder
    self.embedder = BuiltinEmbedder(**kwargs)

143-147: Use the _is_builtin_embedder_available flag instead of isinstance check.

The isinstance check will fail if BuiltinEmbedder cannot be imported. Use the flag computed during initialization.

         if (
-            isinstance(self.embedder, BuiltinEmbedder)
+            self._is_builtin_embedder_available and not FastEmbedMisc.is_installed()
             or parallel is None
             or parallel == 1
             or is_small
         ):

439-445: Use _is_builtin_embedder_available flag instead of isinstance check.

The isinstance check will fail if BuiltinEmbedder cannot be imported.

-                if isinstance(self.embedder, BuiltinEmbedder):
+                if self._is_builtin_embedder_available and not FastEmbedMisc.is_installed():
                     raise ValueError(
                         f"{model} is not among supported models. "
                         f"Have you forgotten to set `cloud_inference` or install `fastembed` for local inference?"
                     )
                 else:
                     raise ValueError(f"{model} is not among supported models")

56-61: Embedder selection logic doesn't properly use is_local_mode and builtin availability check.

The current implementation checks FastEmbedMisc.is_installed() instead of using the computed _is_builtin_embedder_available flag and is_local_mode parameter.

-        self._is_builtin_embedder_available = self._check_builtin_embedder_availability(
-            is_local_mode, server_version
-        )
-        self.embedder = (
-            Embedder(**kwargs) if FastEmbedMisc.is_installed() else BuiltinEmbedder(**kwargs)
-        )
+        self._is_builtin_embedder_available = self._check_builtin_embedder_availability(
+            is_local_mode, server_version
+        )
+        
+        if is_local_mode or not self._is_builtin_embedder_available:
+            # Local mode or old server version requires fastembed
+            if not FastEmbedMisc.is_installed():
+                raise ImportError(
+                    "Local mode or server version < 1.15.3 requires 'fastembed'. "
+                    "Install with `pip install qdrant-client[fastembed]`"
+                )
+            self.embedder = Embedder(**kwargs)
+        else:
+            # Remote mode with supported server version uses builtin embedder
+            from qdrant_client.embed.builtin_embedder import BuiltinEmbedder
+            self.embedder = BuiltinEmbedder(**kwargs)

🧹 Nitpick comments (8)

tests/embed_tests/test_builtin_inference.py (4)

13-28: Consider adding error handling and return type annotation.

The prepare_collection function should handle potential errors and have a proper return type annotation for better code maintainability.

 def prepare_collection(
     client: QdrantClient,
     collection_name: str,
     vectors_config: Optional[dict[str, Any]] = None,
     sparse_vectors_config: Optional[dict[str, Any]] = None,
-) -> None:
+) -> None:
     if client.collection_exists(collection_name):
         client.delete_collection(collection_name)
     config = (
         {DEFAULT_VECTOR_NAME: models.SparseVectorParams(modifier=models.Modifier.IDF)}
         if sparse_vectors_config is None
         else sparse_vectors_config
     )
-    client.create_collection(
-        collection_name, vectors_config=vectors_config or {}, sparse_vectors_config=config
-    )
+    try:
+        client.create_collection(
+            collection_name, vectors_config=vectors_config or {}, sparse_vectors_config=config
+        )
+    except Exception as e:
+        raise RuntimeError(f"Failed to create collection '{collection_name}': {e}") from e

78-148: Consider extracting common test setup logic to reduce duplication.

The test function has repetitive patterns for creating clients and preparing collections. Consider extracting a helper method to reduce code duplication.

Add a helper method at the module level:

def create_and_prepare_client(monkeypatch=None, server_version=None, use_memory=False):
    """Helper to create and prepare a client with collection."""
    if monkeypatch and server_version is not None:
        monkeypatch.setattr(
            "qdrant_client.qdrant_remote.get_server_version", 
            lambda *args, **kwargs: server_version
        )
    
    client = QdrantClient(":memory:") if use_memory else QdrantClient()
    prepare_collection(client, COLLECTION_NAME)
    return client

Then simplify the test:

 def test_bm25_inference_server_version(monkeypatch):
     server_version = "1.11.0"
 
-    def patched_get_server_version(*args, **kwargs):
-        return server_version
-
-    monkeypatch.setattr(
-        "qdrant_client.qdrant_remote.get_server_version", patched_get_server_version
-    )
-    remote_client = QdrantClient()
-    prepare_collection(remote_client, COLLECTION_NAME)
+    remote_client = create_and_prepare_client(monkeypatch, server_version)

120-121: Inconsistent lambda function style.

For consistency, use the same style for lambda functions throughout the test.

-    monkeypatch.setattr("qdrant_client.qdrant_remote.get_server_version", lambda: server_version)
+    monkeypatch.setattr("qdrant_client.qdrant_remote.get_server_version", lambda *args, **kwargs: server_version)

136-137: Inconsistent lambda function style.

For consistency, use the same style for lambda functions throughout the test.

-    monkeypatch.setattr("qdrant_client.qdrant_remote.get_server_version", lambda: server_version)
+    monkeypatch.setattr("qdrant_client.qdrant_remote.get_server_version", lambda *args, **kwargs: server_version)

qdrant_client/embed/model_embedder.py (4)

70-74: Simplify conditional check and improve comment clarity.

The condition can be simplified and the comment can be more concise.

-        if (
-            server_version is None
-        ):  # failed to detect server version, it might happen due to security or network
-            # problems even on supported server versions, so we are not blocking usage of BuiltinEmbedder.
+        # If server version detection fails, assume builtin embedder is available
+        if server_version is None:
             return True

80-83: Simplify return statement.

As suggested by static analysis, directly return the condition.

-            if (int(major), int(minor), int(patch)) >= (1, 15, 3):
-                return True
-
-            return False
+            return (int(major), int(minor), int(patch)) >= (1, 15, 3)

138-140: Combine nested if statements for better readability.

As suggested by static analysis, combine the conditions.

-        if isinstance(raw_models, list):
-            if len(raw_models) < batch_size:
-                is_small = True
+        if isinstance(raw_models, list) and len(raw_models) < batch_size:
+            is_small = True

104-106: Consider consolidating repeated fastembed import checks.

The same check and import pattern is repeated in three methods. Consider extracting this to a helper method.

Add a helper method:

def _ensure_fastembed_if_needed(self) -> None:
    """Ensure fastembed is available when not using builtin embedder."""
    if not self._is_builtin_embedder_available:
        FastEmbedMisc.import_fastembed()  # fail fast if fastembed is required

Then use it in the three methods:

-        if not self._is_builtin_embedder_available:
-            FastEmbedMisc.import_fastembed()  # fail fast if fastembed is required
+        self._ensure_fastembed_if_needed()

Also applies to: 133-135, 188-190

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b9b4d5e and 3875225.

📒 Files selected for processing (3)

qdrant_client/embed/embedder.py (1 hunks)
qdrant_client/embed/model_embedder.py (8 hunks)
tests/embed_tests/test_builtin_inference.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

qdrant_client/embed/embedder.py

🧰 Additional context used

🧬 Code Graph Analysis (2)

tests/embed_tests/test_builtin_inference.py (1)

qdrant_client/fastembed_common.py (2)

FastEmbedMisc (36-266)

is_installed (45-67)

qdrant_client/embed/model_embedder.py (5)

qdrant_client/embed/embedder.py (7)

embed (222-267)

Embedder (29-447)

is_supported_text_model (390-399)

is_supported_sparse_model (438-447)

is_supported_late_interaction_text_model (414-423)

is_supported_image_model (402-411)

is_supported_late_interaction_multimodal_model (426-435)

qdrant_client/embed/builtin_embedder.py (7)

embed (13-34)

BuiltinEmbedder (7-94)

is_supported_text_model (37-46)

is_supported_sparse_model (85-94)

is_supported_late_interaction_text_model (61-70)

is_supported_image_model (49-58)

is_supported_late_interaction_multimodal_model (73-82)

qdrant_client/embed/schema_parser.py (1)

ModelSchemaParser (29-305)

qdrant_client/embed/embed_inspector.py (1)

InspectorEmbed (13-176)

qdrant_client/fastembed_common.py (8)

FastEmbedMisc (36-266)

is_installed (45-67)

import_fastembed (70-78)

is_supported_text_model (175-190)

is_supported_sparse_model (251-266)

is_supported_late_interaction_text_model (211-228)

is_supported_image_model (193-208)

is_supported_late_interaction_multimodal_model (231-248)

🪛 Ruff (0.12.2)

qdrant_client/embed/model_embedder.py

80-83: Return the condition (int(major), int(minor), int(patch)) >= (1, 15, 3) directly

Replace with return (int(major), int(minor), int(patch)) >= (1, 15, 3)

(SIM103)

138-139: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Redirect rules - poetic-froyo-8baba7
GitHub Check: Header rules - poetic-froyo-8baba7
GitHub Check: Pages changed - poetic-froyo-8baba7
GitHub Check: Python 3.12.x on ubuntu-latest test
GitHub Check: Python 3.11.x on ubuntu-latest test
GitHub Check: Python 3.13.x on ubuntu-latest test
GitHub Check: Python 3.9.x on ubuntu-latest test
GitHub Check: Python 3.10.x on ubuntu-latest test

🔇 Additional comments (2)

tests/embed_tests/test_builtin_inference.py (2)

49-61: Good practice: Using FastEmbedMisc.IS_INSTALLED directly instead of calling is_installed().

The comment explains the reasoning well - directly checking IS_INSTALLED avoids potential side effects from the is_installed() method that could modify the flag and conceal bugs.

151-178: LGTM! Test correctly validates unsupported model behavior.

The test properly validates that mixing unsupported models (all-minilm) with supported builtin models (bm25) raises a ValueError when builtin inference is not installed.

* wip: builtin bm25 support * remove debug print * new: add builtin embedder, add server version check for inference * fix: regen async * fix: fix fastembed import * fix: regen async * tests: add tests for builtin inference * fix: fix fastembed checks, improve exceptions

wip: builtin bm25 support

599d177

remove debug print

2e73c08

coderabbitai Bot reviewed Aug 14, 2025

View reviewed changes

joein added 3 commits August 15, 2025 14:56

new: add builtin embedder, add server version check for inference

957faf1

fix: regen async

3a587d0

fix: fix fastembed import

ad6cc11

coderabbitai Bot reviewed Aug 15, 2025

View reviewed changes

joein added 3 commits August 15, 2025 15:55

fix: regen async

b9b4d5e

tests: add tests for builtin inference

ff6d74a

fix: fix fastembed checks, improve exceptions

3875225

coderabbitai Bot reviewed Aug 15, 2025

View reviewed changes

joein requested a review from tbung August 19, 2025 10:44

tbung reviewed Aug 21, 2025

View reviewed changes

Comment thread qdrant_client/embed/builtin_embedder.py

Comment thread qdrant_client/embed/model_embedder.py

tbung approved these changes Sep 15, 2025

View reviewed changes

joein merged commit 200020f into dev Sep 15, 2025
14 checks passed

coderabbitai Bot mentioned this pull request Sep 15, 2025

new: remove vectors_count, update http and grpc models #1069

Merged

daradib mentioned this pull request Oct 2, 2025

[Feature Request]: Qdrant BM25 native support run-llama/llama_index#20001

Open

coderabbitai Bot mentioned this pull request Nov 14, 2025

new: expose cluster operations #1114

Merged

coderabbitai Bot mentioned this pull request Dec 2, 2025

Drop python3.9 #1110

Merged

This was referenced Mar 12, 2026

deprecate: completely replace fastembed bm25 with qdrant core bm25 in hosted qdrant #1166

Merged

new: run server version check in a thread, don't check bm25 availabil… #1168

Merged

feat: add retry/resilience middleware with exponential backoff #1171

Open

Conversation

joein commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

netlify Bot commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for poetic-froyo-8baba7 ready!

Uh oh!

coderabbitai Bot commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

joein commented Aug 14, 2025 •

edited

Loading

netlify Bot commented Aug 14, 2025 •

edited

Loading

coderabbitai Bot commented Aug 14, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)