Reduce hf requests by joein · Pull Request #1092 · qdrant/qdrant-client

joein · 2025-10-23T12:51:58Z

Huggingface has recently introduced a rate limit of 1000 requests per minute, and our CI started hitting this limit.

This PR tries to mitigate the issue.
I've removed some redundant tests for query_points with models.Document, since query_points is already covered in inference tests.
Moreover, we could not utilise cached embeddings when we were testing multiprocessing, this PR introduces a common cache_dir used in both single- and multiprocess scenarios.

netlify · 2025-10-23T12:52:03Z

✅ Deploy Preview for poetic-froyo-8baba7 ready!

Name	Link
🔨 Latest commit	`3e097a7`
🔍 Latest deploy log	https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/68fa24f1fd7f280008b22bf3
😎 Deploy Preview	https://deploy-preview-1092--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

coderabbitai · 2025-10-23T12:58:48Z

📝 Walkthrough

Walkthrough

This PR modifies test infrastructure across two files. In tests/embed_tests/test_local_inference.py, it introduces per-model cache handling by adding a temporary session directory (SESSION_TMP_DIR) and helper function to derive per-model cache paths. The cached_embeddings fixture is changed from session scope to module scope, and cached embedding helpers are extended to default cache_dir to per-model paths and set local_files_only based on cache existence. Test setup is reworked to propagate cache options into document and image constructors across numerous test cases. In tests/test_fastembed.py, the test_query_interface function and its associated imports (QdrantBase and compare_client_results) are removed.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve mixed complexity across two files: fixture scope modifications, introduction of new caching helper mechanisms, and propagation of cache options through many test functions in one file (heterogeneous patterns requiring separate reasoning per test function), combined with straightforward code removal in another file. Multiple test functions require verification for correct cache option propagation.

Possibly related PRs

refactor: refactor test local inference #952: Modifies the cached_embeddings fixture in the same test file and introduces foundational cache handling that this PR extends with per-model cache logic.
new: support multimodal embeddings #908: Modifies the same test module and touches multimodal embedding test logic similar to adjustments made in this PR.

Suggested reviewers

generall
hh-space-invader

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Description Check	❓ Inconclusive	The pull request contains no description provided by the author. According to the lenient evaluation criteria, a description should pass if it is related in some way to the changeset. However, an empty description provides no information and therefore cannot be assessed as related to or describing any aspect of the changeset. This absence of content prevents determination of whether relatedness exists.	Consider adding a pull request description that explains the rationale and impact of the changes. Even a brief description such as "Reduce Hugging Face requests by implementing per-model local caches in tests and removing redundant remote request tests" would provide context for reviewers and future maintainers.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The pull request title "Reduce hf requests" is directly related to the changeset. The modifications across both files implement mechanisms to reduce Hugging Face API requests: the first file adds per-model cache handling and sets `local_files_only` when caches exist to avoid remote requests, while the second file removes the `test_query_interface` test that explicitly tested remote behavior. The title is concise, specific, and clearly communicates the primary objective of the changes from a high level.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch reduce-hf-requests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

tests/embed_tests/test_local_inference.py (4)
43-47: Make tempdir cleanup robust.

rmtree can fail if already removed. Ignore errors.
 @pytest.fixture(scope="module", autouse=True)
 def cleanup():
     yield
-    shutil.rmtree(SESSION_TMP_DIR)
+    shutil.rmtree(SESSION_TMP_DIR, ignore_errors=True)
283-287: DRY: factor per‑model options builder.

You repeat cache_dir/local_files_only setup many times. Add a tiny helper and reuse.
def _model_options(name: str) -> dict:
    d = _get_cache_dir_name(name)
    opts = {"cache_dir": str(d)}
    if d.exists():
        opts["local_files_only"] = True
    return opts
# usage:
dense_options = _model_options(DENSE_MODEL_NAME)
1516-1516: Add skip reasons.

Use @pytest.mark.skip(reason="...") for clarity and CI reporting.
-@pytest.mark.skip
+@pytest.mark.skip(reason="Relies on HF network; disabled to reduce external requests.")
 ...
-@pytest.mark.skip
+@pytest.mark.skip(reason="Multimodal path depends on external weights; disabled in CI.")
Also applies to: 1671-1672

1524-1527: Normalize cache_dir to str in skipped tests.

You pass a Path into options here, while other spots cast to str. Keep consistent to avoid downstream type assumptions.
-    download_options["cache_dir"] = _get_cache_dir_name("Qdrant/bm25")
-    if download_options["cache_dir"].exists():
+    cache_dir = _get_cache_dir_name("Qdrant/bm25")
+    download_options["cache_dir"] = str(cache_dir)
+    if cache_dir.exists():
         download_options["local_files_only"] = True
Also applies to: 1592-1596

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9fb492 and 3e097a7.

📒 Files selected for processing (2)

tests/embed_tests/test_local_inference.py (23 hunks)
tests/test_fastembed.py (0 hunks)

💤 Files with no reviewable changes (1)

tests/test_fastembed.py

🧰 Additional context used

🧬 Code graph analysis (1)

tests/embed_tests/test_local_inference.py (2)

qdrant_client/http/models/models.py (3)

Document (697-708)

Image (1068-1079)

PointStruct (1884-1887)

qdrant_client/qdrant_client.py (1)

QdrantClient (27-3173)

🪛 Ruff (0.14.1)

tests/embed_tests/test_local_inference.py

53-53: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

60-60: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

67-67: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

74-74: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Redirect rules - poetic-froyo-8baba7
GitHub Check: Header rules - poetic-froyo-8baba7
GitHub Check: Pages changed - poetic-froyo-8baba7
GitHub Check: Python 3.11.x on ubuntu-latest test
GitHub Check: Python 3.10.x on ubuntu-latest test
GitHub Check: Python 3.13.x on ubuntu-latest test
GitHub Check: Python 3.12.x on ubuntu-latest test
GitHub Check: Python 3.9.x on ubuntu-latest test

coderabbitai · 2025-10-23T12:58:51Z

 @lru_cache
 def _cached_text_embedding(model_name, *args, **kwargs):
+    kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
+    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
    return TextEmbedding(model_name=model_name, *args, **kwargs)


⚠️ Potential issue | 🔴 Critical

Fix KeyError in cache plumbing and avoid star-arg after keyword (B026).

If cache_dir isn’t passed, kwargs["cache_dir"] raises KeyError in all four wrappers. Also, model_name=..., *args triggers Ruff B026. Use .get() and pass model_name positionally.

As per static analysis hints.

@lru_cache def _cached_text_embedding(model_name, *args, **kwargs): - kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name) - kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()) - return TextEmbedding(model_name=model_name, *args, **kwargs) + cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name) + kwargs["cache_dir"] = cache_dir + kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists()) + return TextEmbedding(model_name, *args, **kwargs) @lru_cache def _cached_sparse_text_embedding(model_name, *args, **kwargs): - kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name) - kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()) - return SparseTextEmbedding(model_name=model_name, *args, **kwargs) + cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name) + kwargs["cache_dir"] = cache_dir + kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists()) + return SparseTextEmbedding(model_name, *args, **kwargs) @lru_cache def _cached_late_interaction_text_embedding(model_name, *args, **kwargs): - kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name) - kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()) - return LateInteractionTextEmbedding(model_name=model_name, *args, **kwargs) + cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name) + kwargs["cache_dir"] = cache_dir + kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists()) + return LateInteractionTextEmbedding(model_name, *args, **kwargs) @lru_cache def _cached_image_embedding(model_name, *args, **kwargs): - kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name) - kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()) - return ImageEmbedding(model_name=model_name, *args, **kwargs) + cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name) + kwargs["cache_dir"] = cache_dir + kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists()) + return ImageEmbedding(model_name, *args, **kwargs)

Also applies to: 56-60, 63-67, 70-74

🧰 Tools

🪛 Ruff (0.14.1)

53-53: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

🤖 Prompt for AI Agents

In tests/embed_tests/test_local_inference.py around lines 49-53 (and similarly at 56-60, 63-67, 70-74), the wrappers access kwargs["cache_dir"] which raises KeyError when cache_dir is not provided and they pass model_name as a keyword then use *args afterwards (triggers Ruff B026); change to use kwargs.get("cache_dir") or kwargs.setdefault("cache_dir", _get_cache_dir_name(model_name)) to avoid KeyError, compute local_files_only using kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()), and call TextEmbedding with model_name as a positional argument (i.e., TextEmbedding(model_name, *args, **kwargs)) so no keyword precedes *args; apply the same fixes to all four wrapper functions at the indicated ranges.

* new: queryless score 1 * Reduce hf requests (#1092) * tests: remove redundant test * new: improve caching, reduce number of hf requests * fix: fix possible cache_dir key error

joein added 2 commits October 23, 2025 14:37

tests: remove redundant test

ac8bdb0

new: improve caching, reduce number of hf requests

3e097a7

coderabbitai Bot reviewed Oct 23, 2025

View reviewed changes

joein changed the base branch from dev to queryless-score-1 October 23, 2025 13:03

fix: fix possible cache_dir key error

60cd458

joein requested a review from tbung October 23, 2025 13:21

This was referenced Oct 23, 2025

local mode 1.16 #1086

Closed

Implement passing certificates creds to grpc channels #1077

Merged

coszio mentioned this pull request Oct 23, 2025

new: queryless score 1 #1088

Merged

tbung approved these changes Oct 24, 2025

View reviewed changes

joein merged commit a37f293 into queryless-score-1 Oct 24, 2025
10 checks passed

joein added a commit that referenced this pull request Oct 24, 2025

new: queryless score 1 (#1088)

784b27a

* new: queryless score 1 * Reduce hf requests (#1092) * tests: remove redundant test * new: improve caching, reduce number of hf requests * fix: fix possible cache_dir key error

joein added a commit that referenced this pull request Nov 14, 2025

new: queryless score 1 (#1088)

b3c4cf7

* new: queryless score 1 * Reduce hf requests (#1092) * tests: remove redundant test * new: improve caching, reduce number of hf requests * fix: fix possible cache_dir key error

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce hf requests#1092

Reduce hf requests#1092
joein merged 3 commits into
queryless-score-1from
reduce-hf-requests

joein commented Oct 23, 2025 •

edited

Loading

Uh oh!

netlify Bot commented Oct 23, 2025 •

edited

Loading

Uh oh!

coderabbitai Bot commented Oct 23, 2025

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Oct 23, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

joein commented Oct 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

netlify Bot commented Oct 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for poetic-froyo-8baba7 ready!

Uh oh!

coderabbitai Bot commented Oct 23, 2025

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

joein commented Oct 23, 2025 •

edited

Loading

netlify Bot commented Oct 23, 2025 •

edited

Loading