Skip to content

new: queryless score 1#1088

Merged
joein merged 2 commits into
devfrom
queryless-score-1
Oct 24, 2025
Merged

new: queryless score 1#1088
joein merged 2 commits into
devfrom
queryless-score-1

Conversation

@joein

@joein joein commented Oct 17, 2025

Copy link
Copy Markdown
Member

@netlify

netlify Bot commented Oct 17, 2025

Copy link
Copy Markdown

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit a37f293
🔍 Latest deploy log https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/68fb61a0bd6d1200082a557d
😎 Deploy Preview https://deploy-preview-1088--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@coderabbitai

coderabbitai Bot commented Oct 17, 2025

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR makes three sets of changes:

  • In qdrant_client/local/local_collection.py, it updates two internal ScoredPoint constructions to use score=1.0 instead of 0 (in _sample_randomly and record_to_scored_point), with no signature or control-flow changes.
  • In tests/embed_tests/test_local_inference.py, it adds per-model caching and temporary session directory scaffolding, introduces _get_cache_dir_name and a module-level SESSION_TMP_DIR, changes several cached embedding helpers to accept and inject cache_dir and local_files_only, adjusts fixture scope (cached_embeddings from session to module), propagates per-model options into document/image creation across tests, and adds cleanup and conditional skips.
  • In tests/test_fastembed.py, it removes the test_query_interface test and related imports.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Rationale: small, homogeneous constant edits in production code combined with broader, heterogeneous test-suite changes (new helpers, fixture scope changes, propagation of options across many test sites, and test removal) require moderate review to verify test behavior, cleanup, and that caching/local_files_only logic is correctly applied.

Possibly related PRs

Suggested reviewers

  • tbung
  • generall
  • hh-space-invader

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description Check ❓ Inconclusive The PR description consists only of a link to an upstream Qdrant repository PR (qdrant/qdrant#7347) with no additional context or explanation. While the link provides a reference to related upstream work, it does not convey any meaningful information about what the changeset actually does or why it was made. This constitutes a very vague and generic description that relies entirely on the reader visiting an external link to understand the intent. Although the description is not completely off-topic, it fails to provide any substantive information about the changes. Consider updating the PR description to include a brief summary of the changes, such as explaining that this PR implements the queryless score change from the upstream Qdrant PR and includes necessary supporting changes to the test infrastructure. The description should be self-contained rather than relying solely on an external link.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The title "queryless score 1" directly refers to the primary code change in the PR, which updates score values from 0 to 1.0 in the local_collection.py file for queryless operations (_sample_randomly and record_to_scored_point functions). The title is specific and clearly conveys that this change involves setting a queryless score to 1. While the PR also includes test infrastructure changes (caching scaffolding in test_local_inference.py and test removal in test_fastembed.py), these appear to be supporting or related changes to the core score update. The title captures the essence of the main change without being vague or misleading.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch queryless-score-1

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@joein joein requested a review from coszio October 23, 2025 13:25
@joein joein mentioned this pull request Oct 23, 2025
5 tasks

@coszio coszio left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for porting the change 🙌, I see green tests in #1092

* tests: remove redundant test

* new: improve caching, reduce number of hf requests

* fix: fix possible cache_dir key error

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/embed_tests/test_local_inference.py (1)

49-78: Consider refactoring to avoid star-arg unpacking after keyword arguments.

The static analysis tool flags the pattern where *args appears after modifying **kwargs (lines 54, 62, 70, 78). While functional, this pattern is discouraged because it can lead to confusion about argument precedence.

Consider refactoring to pass positional args before setting kwargs, or avoid mixing *args with modified **kwargs:

 @lru_cache
 def _cached_text_embedding(model_name, *args, **kwargs):
     cache_dir = kwargs.get("cache_dir") or _get_cache_dir_name(model_name)
     kwargs["cache_dir"] = cache_dir
     kwargs["local_files_only"] = kwargs.get("local_files_only", Path(cache_dir).exists())
-    return TextEmbedding(model_name=model_name, *args, **kwargs)
+    return TextEmbedding(model_name=model_name, **kwargs)

Apply similar changes to the other cached embedding functions if *args is not actually used.

Based on static analysis hints.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d1a369c and a37f293.

📒 Files selected for processing (2)
  • tests/embed_tests/test_local_inference.py (23 hunks)
  • tests/test_fastembed.py (0 hunks)
💤 Files with no reviewable changes (1)
  • tests/test_fastembed.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/embed_tests/test_local_inference.py (2)
qdrant_client/http/models/models.py (3)
  • Document (697-708)
  • Image (1068-1079)
  • PointStruct (1884-1887)
qdrant_client/qdrant_client.py (1)
  • QdrantClient (27-3173)
🪛 Ruff (0.14.1)
tests/embed_tests/test_local_inference.py

54-54: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


62-62: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


70-70: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)


78-78: Star-arg unpacking after a keyword argument is strongly discouraged

(B026)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
🔇 Additional comments (6)
tests/embed_tests/test_local_inference.py (6)

6-8: LGTM: Clean temporary directory scaffolding.

The temporary directory setup and cleanup fixture are properly implemented. The module-scoped cleanup ensures the temp directory is removed after all tests in the module complete.

Also applies to: 36-47


81-81: Fixture scope change aligns with new caching strategy.

The scope change from session to module is appropriate given the introduction of module-level SESSION_TMP_DIR and per-model cache directories. This ensures proper isolation between test modules.


352-360: Verify the necessity of mutating document options after creation.

The code mutates the options dictionary of previously created documents (lines 352-360), setting local_files_only based on cache existence. While this may ensure the cache is available for subsequent operations, mutating shared document state can be fragile and make tests order-dependent.

Consider whether these mutations are necessary, or if the documents should be recreated with the correct options for each test phase. If the mutation is intentional to reflect that models are now cached after the first upload, consider adding a comment explaining this behavior.

# After first upload, models are cached locally
for text_doc in (dense_doc_1, dense_doc_2, dense_doc_3):
    text_doc.options["local_files_only"] = Path(text_doc.options["cache_dir"]).exists()

426-444: LGTM: Consistent cache options propagation.

The cache directory and options handling in these test functions follows the same correct pattern: compute cache directory, prepare options dict with conditional local_files_only, then create documents with those options.

Also applies to: 609-622, 685-703, 799-817


1520-1520: Verify the reason for skipping these tests.

Three tests are marked with @pytest.mark.skip but don't include explanations for why they're disabled:

  • test_upsert_batch_with_different_options (line 1520)
  • test_batch_size_propagation (line 1584)
  • test_embed_multimodal (line 1675)

These tests were modified to incorporate the new caching infrastructure but then skipped. This suggests they may be temporarily disabled during the transition.

Please clarify:

  1. Why are these tests skipped?
  2. Is there a plan to re-enable them?
  3. Should the skip markers include a reason string?

Consider using @pytest.mark.skip(reason="...") to document why tests are disabled, or create tracking issues for re-enabling them.

Also applies to: 1584-1584, 1675-1675


874-881: LGTM: Comprehensive cache options integration across test suite.

The remaining test functions have been systematically updated to use the new caching infrastructure. The pattern is consistent and correct throughout:

  1. Compute cache directory using _get_cache_dir_name
  2. Create options dict with cache_dir and conditional local_files_only
  3. Pass options when creating Document and Image objects

The repeated local_files_only checks (e.g., lines 1312-1313, 1449-1450) appropriately handle progressive cache population during test execution.

Also applies to: 995-1014, 1270-1398, 1414-1517

@joein joein merged commit 784b27a into dev Oct 24, 2025
14 checks passed
joein added a commit that referenced this pull request Nov 14, 2025
* new: queryless score 1

* Reduce hf requests (#1092)

* tests: remove redundant test

* new: improve caching, reduce number of hf requests

* fix: fix possible cache_dir key error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants