Skip to content

Reduce hf requests#1092

Merged
joein merged 3 commits into
queryless-score-1from
reduce-hf-requests
Oct 24, 2025
Merged

Reduce hf requests#1092
joein merged 3 commits into
queryless-score-1from
reduce-hf-requests

Conversation

@joein

@joein joein commented Oct 23, 2025

Copy link
Copy Markdown
Member

Huggingface has recently introduced a rate limit of 1000 requests per minute, and our CI started hitting this limit.

This PR tries to mitigate the issue.
I've removed some redundant tests for query_points with models.Document, since query_points is already covered in inference tests.
Moreover, we could not utilise cached embeddings when we were testing multiprocessing, this PR introduces a common cache_dir used in both single- and multiprocess scenarios.

@netlify

netlify Bot commented Oct 23, 2025

Copy link
Copy Markdown

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 3e097a7
🔍 Latest deploy log https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/68fa24f1fd7f280008b22bf3
😎 Deploy Preview https://deploy-preview-1092--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@coderabbitai

coderabbitai Bot commented Oct 23, 2025

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR modifies test infrastructure across two files. In tests/embed_tests/test_local_inference.py, it introduces per-model cache handling by adding a temporary session directory (SESSION_TMP_DIR) and helper function to derive per-model cache paths. The cached_embeddings fixture is changed from session scope to module scope, and cached embedding helpers are extended to default cache_dir to per-model paths and set local_files_only based on cache existence. Test setup is reworked to propagate cache options into document and image constructors across numerous test cases. In tests/test_fastembed.py, the test_query_interface function and its associated imports (QdrantBase and compare_client_results) are removed.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve mixed complexity across two files: fixture scope modifications, introduction of new caching helper mechanisms, and propagation of cache options through many test functions in one file (heterogeneous patterns requiring separate reasoning per test function), combined with straightforward code removal in another file. Multiple test functions require verification for correct cache option propagation.

Possibly related PRs

Suggested reviewers

  • generall
  • hh-space-invader

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description Check ❓ Inconclusive The pull request contains no description provided by the author. According to the lenient evaluation criteria, a description should pass if it is related in some way to the changeset. However, an empty description provides no information and therefore cannot be assessed as related to or describing any aspect of the changeset. This absence of content prevents determination of whether relatedness exists. Consider adding a pull request description that explains the rationale and impact of the changes. Even a brief description such as "Reduce Hugging Face requests by implementing per-model local caches in tests and removing redundant remote request tests" would provide context for reviewers and future maintainers.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "Reduce hf requests" is directly related to the changeset. The modifications across both files implement mechanisms to reduce Hugging Face API requests: the first file adds per-model cache handling and sets local_files_only when caches exist to avoid remote requests, while the second file removes the test_query_interface test that explicitly tested remote behavior. The title is concise, specific, and clearly communicates the primary objective of the changes from a high level.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch reduce-hf-requests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
tests/embed_tests/test_local_inference.py (4)

43-47: Make tempdir cleanup robust.

rmtree can fail if already removed. Ignore errors.

 @pytest.fixture(scope="module", autouse=True)
 def cleanup():
     yield
-    shutil.rmtree(SESSION_TMP_DIR)
+    shutil.rmtree(SESSION_TMP_DIR, ignore_errors=True)

283-287: DRY: factor per‑model options builder.

You repeat cache_dir/local_files_only setup many times. Add a tiny helper and reuse.

def _model_options(name: str) -> dict:
    d = _get_cache_dir_name(name)
    opts = {"cache_dir": str(d)}
    if d.exists():
        opts["local_files_only"] = True
    return opts
# usage:
dense_options = _model_options(DENSE_MODEL_NAME)

1516-1516: Add skip reasons.

Use @pytest.mark.skip(reason="...") for clarity and CI reporting.

-@pytest.mark.skip
+@pytest.mark.skip(reason="Relies on HF network; disabled to reduce external requests.")
 ...
-@pytest.mark.skip
+@pytest.mark.skip(reason="Multimodal path depends on external weights; disabled in CI.")

Also applies to: 1671-1672


1524-1527: Normalize cache_dir to str in skipped tests.

You pass a Path into options here, while other spots cast to str. Keep consistent to avoid downstream type assumptions.

-    download_options["cache_dir"] = _get_cache_dir_name("Qdrant/bm25")
-    if download_options["cache_dir"].exists():
+    cache_dir = _get_cache_dir_name("Qdrant/bm25")
+    download_options["cache_dir"] = str(cache_dir)
+    if cache_dir.exists():
         download_options["local_files_only"] = True

Also applies to: 1592-1596

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9fb492 and 3e097a7.

📒 Files selected for processing (2)
  • tests/embed_tests/test_local_inference.py (23 hunks)
  • tests/test_fastembed.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/test_fastembed.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/embed_tests/test_local_inference.py (2)
qdrant_client/http/models/models.py (3)
  • Document (697-708)
  • Image (1068-1079)
  • PointStruct (1884-1887)
qdrant_client/qdrant_client.py (1)
  • QdrantClient (27-3173)
🪛 Ruff (0.14.1)
tests/embed_tests/test_local_inference.py

53-53: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


60-60: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


67-67: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


74-74: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test

Comment on lines 49 to 53
@lru_cache
def _cached_text_embedding(model_name, *args, **kwargs):
kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
return TextEmbedding(model_name=model_name, *args, **kwargs)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix KeyError in cache plumbing and avoid star-arg after keyword (B026).

If cache_dir isn’t passed, kwargs["cache_dir"] raises KeyError in all four wrappers. Also, model_name=..., *args triggers Ruff B026. Use .get() and pass model_name positionally.

As per static analysis hints.

 @lru_cache
 def _cached_text_embedding(model_name, *args, **kwargs):
-    kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
-    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
-    return TextEmbedding(model_name=model_name, *args, **kwargs)
+    cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name)
+    kwargs["cache_dir"] = cache_dir
+    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists())
+    return TextEmbedding(model_name, *args, **kwargs)
 
 @lru_cache
 def _cached_sparse_text_embedding(model_name, *args, **kwargs):
-    kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
-    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
-    return SparseTextEmbedding(model_name=model_name, *args, **kwargs)
+    cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name)
+    kwargs["cache_dir"] = cache_dir
+    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists())
+    return SparseTextEmbedding(model_name, *args, **kwargs)
 
 @lru_cache
 def _cached_late_interaction_text_embedding(model_name, *args, **kwargs):
-    kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
-    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
-    return LateInteractionTextEmbedding(model_name=model_name, *args, **kwargs)
+    cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name)
+    kwargs["cache_dir"] = cache_dir
+    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists())
+    return LateInteractionTextEmbedding(model_name, *args, **kwargs)
 
 @lru_cache
 def _cached_image_embedding(model_name, *args, **kwargs):
-    kwargs["cache_dir"] = kwargs["cache_dir"] or _get_cache_dir_name(model_name)
-    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists())
-    return ImageEmbedding(model_name=model_name, *args, **kwargs)
+    cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name)
+    kwargs["cache_dir"] = cache_dir
+    kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists())
+    return ImageEmbedding(model_name, *args, **kwargs)

Also applies to: 56-60, 63-67, 70-74

🧰 Tools
🪛 Ruff (0.14.1)

53-53: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

🤖 Prompt for AI Agents
In tests/embed_tests/test_local_inference.py around lines 49-53 (and similarly
at 56-60, 63-67, 70-74), the wrappers access kwargs["cache_dir"] which raises
KeyError when cache_dir is not provided and they pass model_name as a keyword
then use *args afterwards (triggers Ruff B026); change to use
kwargs.get("cache_dir") or kwargs.setdefault("cache_dir",
_get_cache_dir_name(model_name)) to avoid KeyError, compute local_files_only
using kwargs.get("local_files_only", Path(kwargs["cache_dir"]).exists()), and
call TextEmbedding with model_name as a positional argument (i.e.,
TextEmbedding(model_name, *args, **kwargs)) so no keyword precedes *args; apply
the same fixes to all four wrapper functions at the indicated ranges.

@joein joein changed the base branch from dev to queryless-score-1 October 23, 2025 13:03
@joein joein merged commit a37f293 into queryless-score-1 Oct 24, 2025
10 checks passed
joein added a commit that referenced this pull request Oct 24, 2025
* new: queryless score 1

* Reduce hf requests (#1092)

* tests: remove redundant test

* new: improve caching, reduce number of hf requests

* fix: fix possible cache_dir key error
joein added a commit that referenced this pull request Nov 14, 2025
* new: queryless score 1

* Reduce hf requests (#1092)

* tests: remove redundant test

* new: improve caching, reduce number of hf requests

* fix: fix possible cache_dir key error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants