Skip to content

new: add b64 encoding function#1031

Merged
joein merged 6 commits into
devfrom
b64-utils
Jul 15, 2025
Merged

new: add b64 encoding function#1031
joein merged 6 commits into
devfrom
b64-utils

Conversation

@joein

@joein joein commented Jun 24, 2025

Copy link
Copy Markdown
Member

No description provided.

@netlify

netlify Bot commented Jun 24, 2025

Copy link
Copy Markdown

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 846d434
🔍 Latest deploy log https://app.netlify.com/projects/poetic-froyo-8baba7/deploys/687675efe4a3dc00081e9f54
😎 Deploy Preview https://deploy-preview-1031--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@coderabbitai

coderabbitai Bot commented Jun 24, 2025

Copy link
Copy Markdown

Warning

Rate limit exceeded

@joein has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 3 minutes and 27 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 1a6791f and 846d434.

📒 Files selected for processing (1)
  • tests/test_remote_inference.py (1 hunks)
📝 Walkthrough

Walkthrough

A new utility function, read_base64, has been added to qdrant_client/embed/utils.py. This function reads a file from a given path, verifies its existence, and returns its base64-encoded content as a string, raising a FileNotFoundError if the file does not exist. Supporting imports for base64, Path, and Union have been included. Additionally, a new test module, tests/embed_tests/test_utils.py, introduces tests for read_base64, checking correct encoding and error handling. The tests/utils.py file now defines a TESTS_PATH constant to represent the directory path of the test utilities. Furthermore, a new test function test_remove_inference_image was added to tests/test_remove_inference.py, which is marked to be skipped by default and tests remote inference image embedding functionality with the Qdrant client.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
qdrant_client/embed/utils.py (1)

71-79: Consider renaming function to reflect its generic file capability.

The function implementation is correct and handles file operations properly. However, the function name to_base64 with parameter image_path suggests it's specifically for images, but the implementation works for any file type.

Consider renaming the parameter to file_path to better reflect its generic capability:

-def to_base64(image_path: Union[str, Path]) -> str:
-    """Convert a file path to a base64 encoded string."""
-    path = Path(image_path)
+def to_base64(file_path: Union[str, Path]) -> str:
+    """Convert a file path to a base64 encoded string."""
+    path = Path(file_path)

The error handling and base64 encoding logic are implemented correctly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf515ae and 03adb2f.

📒 Files selected for processing (3)
  • qdrant_client/embed/utils.py (2 hunks)
  • tests/embed_tests/test_utils.py (1 hunks)
  • tests/utils.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (4)
tests/utils.py (1)

3-6: LGTM! Clean addition of test path constant.

The addition of the Path import and TESTS_PATH constant follows good practices for test utilities. This provides a reliable way to reference test data files.

qdrant_client/embed/utils.py (1)

1-3: LGTM! Necessary imports added correctly.

The imports are appropriate for the new base64 encoding functionality.

tests/embed_tests/test_utils.py (2)

1-11: LGTM! Well-structured test setup.

The imports and test data path setup are correct and follow good testing practices.


12-27: Verify test data file exists before running tests.

The test logic is comprehensive and correctly validates both successful encoding and error handling. However, the test assumes the existence of tests/embed_tests/misc/image.jpeg without verification.

Consider adding a check to ensure the test data file exists:

def test_image_path_to_b64():
    # Test with a valid image file
    image_path = Path(EMBED_TESTS_DATA / "image.jpeg")
+    assert image_path.exists(), f"Test image file not found: {image_path}"
    original_bytes = image_path.read_bytes()

The test correctly validates that:

  • The function returns a string
  • The base64 encoding is reversible (decoded bytes match original)
  • FileNotFoundError is raised for non-existent files

Please verify that the test image file exists at the expected location:

#!/bin/bash
# Check if the test image file exists
fd -t f "image.jpeg" tests/embed_tests/misc/

Comment thread qdrant_client/embed/utils.py Outdated
return converted_paths


def to_base64(image_path: Union[str, Path]) -> str:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a different name for this function?

Is it agnostic to file type? If yes, it should be image_path -> file_path, and read_base64
If it is specific to iamges, I would use something like base64_image

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is agnostic to file type, though, I don't think we expect it to be used in qdrant-client with anything but image files

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/test_remove_inference.py (3)

64-66: Consider more specific exception handling.

The broad except Exception clause catches all exceptions without being specific about expected error types. For test code, consider either:

  1. Catching specific expected exceptions (e.g., connection errors, API errors)
  2. Removing the try-catch block to let exceptions propagate naturally for better test debugging
-    try:
-        client.upsert(
+    client.upsert(
             collection_name=collection_name,
             points=[
                 models.PointStruct(
                     id=1,
                     vector=models.Image(
                         image=image_url,
                         model=model_name,
                     ),
                 ),
-                # models.PointStruct(
-                #     id=2,
-                #     vector=models.Image(
-                #         image=read_base64(image_path),
-                #         model=model_name,
-                #     ),
-                # ),
-            ]
-        )
-    except Exception as e:
-        print(e)
-        raise e
+            ]
+        )

55-61: Consider enabling the local file comparison functionality.

The commented-out code suggests the test is intended to compare remote URL inference with local file inference using read_base64. The test comment on line 24 reinforces this intention.

If the read_base64 functionality is ready, consider uncommenting this code to complete the test's intended purpose. If it's not ready, consider adding a TODO comment explaining the current status.


16-16: Consider clarifying the test name.

The function name test_remove_inference_image might be confusing as the test appears to be testing image embedding functionality rather than "remove inference." Consider renaming to something more descriptive like test_image_embedding_inference or test_remote_image_inference.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3139eb2 and d5b1d11.

📒 Files selected for processing (1)
  • tests/test_remove_inference.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
tests/test_remove_inference.py

6-6: qdrant_client.embed.utils.read_base64 imported but unused

Remove unused import: qdrant_client.embed.utils.read_base64

(F401)

🪛 Flake8 (7.2.0)
tests/test_remove_inference.py

[error] 6-6: 'qdrant_client.embed.utils.read_base64' imported but unused

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
🔇 Additional comments (2)
tests/test_remove_inference.py (2)

6-6: Clarify the unused import or enable the functionality.

The read_base64 import is flagged as unused by static analysis tools, but the commented-out code on lines 55-61 shows its intended use. Consider either:

  1. Uncommenting the code to test local file functionality alongside remote URL inference
  2. Removing the import temporarily if the feature isn't ready
  3. Adding a comment explaining why the import is present but unused

1-67: Well-structured integration test with room for improvement.

The test follows good practices with:

  • Appropriate skip marker for integration tests requiring external dependencies
  • Environment variable configuration
  • Proper collection cleanup and setup

The main areas for improvement are the cross-platform compatibility issue and completing the intended comparison functionality between remote URL and local file inference.

Comment thread tests/test_remove_inference.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5b1d11 and 1a6791f.

📒 Files selected for processing (1)
  • tests/test_remove_inference.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
🔇 Additional comments (1)
tests/test_remove_inference.py (1)

1-11: LGTM! Clean imports and proper environment variable usage.

The imports are well-organized and include all necessary dependencies for the test functionality. Using environment variables for configuration is a good practice for test flexibility.

Comment on lines +47 to +65
client.upsert(
collection_name=collection_name,
points=[
models.PointStruct(
id=1,
vector=models.Image(
image=image_url,
model=model_name,
),
),
models.PointStruct(
id=2,
vector=models.Image(
image=read_base64(image_path),
model=model_name,
),
),
]
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add test assertions to verify the functionality.

The test function performs setup and data insertion but lacks assertions to verify the expected behavior. It should validate that:

  1. The upsert operations were successful
  2. The embeddings from URL and local file are comparable
  3. The read_base64 utility function works correctly

Add verification logic:

     client.upsert(
         collection_name=collection_name,
         points=[
             models.PointStruct(
                 id=1,
                 vector=models.Image(
                     image=image_url,
                     model=model_name,
                 ),
             ),
             models.PointStruct(
                 id=2,
                 vector=models.Image(
                     image=read_base64(image_path),
                     model=model_name,
                 ),
             ),
         ]
     )
+    
+    # Verify points were inserted successfully
+    collection_info = client.get_collection(collection_name)
+    assert collection_info.points_count == 2
+    
+    # Verify both points can be retrieved
+    points = client.retrieve(collection_name, ids=[1, 2])
+    assert len(points) == 2
+    
+    # Optionally verify embeddings are similar (since they're the same image)
+    search_results = client.search(
+        collection_name=collection_name,
+        query_vector=points[0].vector,
+        limit=2
+    )
+    assert len(search_results) == 2
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
client.upsert(
collection_name=collection_name,
points=[
models.PointStruct(
id=1,
vector=models.Image(
image=image_url,
model=model_name,
),
),
models.PointStruct(
id=2,
vector=models.Image(
image=read_base64(image_path),
model=model_name,
),
),
]
)
client.upsert(
collection_name=collection_name,
points=[
models.PointStruct(
id=1,
vector=models.Image(
image=image_url,
model=model_name,
),
),
models.PointStruct(
id=2,
vector=models.Image(
image=read_base64(image_path),
model=model_name,
),
),
]
)
# Verify points were inserted successfully
collection_info = client.get_collection(collection_name)
assert collection_info.points_count == 2
# Verify both points can be retrieved
points = client.retrieve(collection_name, ids=[1, 2])
assert len(points) == 2
# Optionally verify embeddings are similar (since they're the same image)
search_results = client.search(
collection_name=collection_name,
query_vector=points[0].vector,
limit=2
)
assert len(search_results) == 2
🤖 Prompt for AI Agents
In tests/test_remove_inference.py between lines 47 and 65, the test inserts data
but lacks assertions to verify outcomes. Add assertions to confirm the upsert
operation succeeded by checking the response status or result. Then retrieve the
inserted points and assert that embeddings from the URL and local file are
similar within an acceptable tolerance. Also, add an assertion to verify that
the read_base64 function returns the expected base64 string for the local image
file.

Comment on lines +28 to +35
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
image_path = tmp_file.name

with httpx.Client() as httpx_client:
response = httpx_client.get(image_url)
with open(image_path, "wb") as f:
f.write(response.content)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add cleanup for temporary file and error handling.

While the cross-platform temporary file approach is good, the test is missing:

  1. Cleanup code to delete the temporary file after test completion
  2. Error handling for HTTP requests and file operations

Add cleanup and error handling:

+import os
+
 def test_remove_inference_image():
+    image_path = None
+    try:
         client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY, cloud_inference=True)
         collection_name = "image_embeddings"
         model_name = "Qdrant/clip-ViT-B-32-vision"
 
         image_url = "https://qdrant.tech/example.png"
 
         # Compare inference of image exposed via url and local file
         # So download image to local file and compare results
 
         with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
             image_path = tmp_file.name
 
         with httpx.Client() as httpx_client:
             response = httpx_client.get(image_url)
+            response.raise_for_status()  # Raise exception for bad status codes
             with open(image_path, "wb") as f:
                 f.write(response.content)
+    finally:
+        # Clean up temporary file
+        if image_path and os.path.exists(image_path):
+            os.unlink(image_path)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
image_path = tmp_file.name
with httpx.Client() as httpx_client:
response = httpx_client.get(image_url)
with open(image_path, "wb") as f:
f.write(response.content)
import os
def test_remove_inference_image():
image_path = None
try:
client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY, cloud_inference=True)
collection_name = "image_embeddings"
model_name = "Qdrant/clip-ViT-B-32-vision"
image_url = "https://qdrant.tech/example.png"
# Compare inference of image exposed via url and local file
# So download image to local file and compare results
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp_file:
image_path = tmp_file.name
with httpx.Client() as httpx_client:
response = httpx_client.get(image_url)
response.raise_for_status() # Raise exception for bad status codes
with open(image_path, "wb") as f:
f.write(response.content)
finally:
# Clean up temporary file
if image_path and os.path.exists(image_path):
os.unlink(image_path)
🤖 Prompt for AI Agents
In tests/test_remove_inference.py around lines 28 to 35, the temporary file
created for the image is not deleted after the test, and there is no error
handling for the HTTP request or file writing. To fix this, add a try-finally
block around the HTTP request and file write operations to ensure the temporary
file is deleted in the finally clause. Also, add exception handling for the HTTP
request and file operations to catch and handle potential errors gracefully.

@joein joein merged commit e823402 into dev Jul 15, 2025
14 checks passed
joein added a commit that referenced this pull request Jul 18, 2025
* new: add b64 encoding function

* tests: remove garbage

* refactor base64 function name and params

* manual test for remove iamge inference with base64

* use tempfile

* fix: rename remove_inference->remote_inference

---------

Co-authored-by: generall <[email protected]>
@coderabbitai coderabbitai Bot mentioned this pull request Dec 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants