fix(embedding): support custom processor input preparation #369
Merged
jundot merged 1 commit into jundot:main on Mar 29, 2026
Conversation
Why: Qwen3-VL embedding models can be loaded through mlx-embeddings, but oMLX always used the generic processor(texts, ...) path for embedding requests. For custom processors such as qwen3_vl, that positional call is interpreted as image input, which breaks /v1/embeddings even when the model is explicitly treated as an embedding model. What: Detect processors that expose custom embedding input hooks and route embedding requests through prepare_embedding_inputs/prepare_model_inputs instead of the generic tokenizer path. Keep the existing path for standard text processors, and add regression coverage for both compiled and eager execution.
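The routing described above could look roughly like the following sketch. The hook names `prepare_embedding_inputs` and `prepare_model_inputs` come from the PR description; the surrounding function name and signature are illustrative, not oMLX's actual code.

```python
def build_embedding_inputs(processor, texts):
    """Route embedding requests through custom processor hooks when present.

    Calling processor(texts, ...) positionally on a custom processor such as
    qwen3_vl is interpreted as image input, so the generic tokenizer path is
    only used for standard text processors.
    """
    if hasattr(processor, "prepare_embedding_inputs"):
        prepared = processor.prepare_embedding_inputs(texts)
        if hasattr(processor, "prepare_model_inputs"):
            prepared = processor.prepare_model_inputs(prepared)
        return prepared
    # Standard text processors keep the existing generic path.
    return processor(texts, padding=True)
```

The `hasattr` checks keep the change backward compatible: any processor that does not expose the custom embedding hooks falls through to the original call unchanged.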
Owner
Thanks for the fix @MasakiMu319, the approach looks clean and correct. Routing through prepare_embedding_inputs/prepare_model_inputs for custom processors is the right call. I verified the existing text embedding path is not affected by this change. The tests cover both compiled and eager execution with custom processors, which is what I want to see. Merging this.
AlexTzk pushed a commit to AlexTzk/omlx that referenced this pull request on Mar 29, 2026
Summary
Why
Qwen3-VL embedding models can be loaded through mlx-embeddings, but oMLX always used the generic processor(texts, ...) path for embedding requests. For custom processors such as qwen3_vl, that positional call is interpreted as image input, which breaks /v1/embeddings even when the model is explicitly treated as an embedding model.
Testing
- uv run pytest tests/test_embedding.py -k "custom_processor or compiled_path_fallback_on_failure or is_compiled_false_uses_eager_path"
- Loaded Qwen3-VL-Embedding-2B-mxfp8 forced to embedding and verified /v1/embeddings returns vectors for single and batch text inputs
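The regression coverage for custom processors could be shaped roughly like this; `FakeCustomProcessor` and `build_inputs` are illustrative stand-ins, not oMLX's actual test fixtures, and the real tests additionally exercise the compiled and eager model paths.

```python
class FakeCustomProcessor:
    """Stand-in for a qwen3_vl-style processor exposing embedding hooks."""
    def prepare_embedding_inputs(self, texts):
        return {"texts": list(texts)}
    def prepare_model_inputs(self, prepared):
        # Encode each text as a list of code points for the fake model.
        return {"input_ids": [[ord(c) for c in t] for t in prepared["texts"]]}

def build_inputs(processor, texts):
    # Same routing as the fix: prefer the custom embedding hooks.
    if hasattr(processor, "prepare_embedding_inputs"):
        return processor.prepare_model_inputs(
            processor.prepare_embedding_inputs(texts)
        )
    return processor(texts)

def test_custom_processor_inputs():
    # The routing must behave identically regardless of execution mode.
    for compiled in (True, False):
        inputs = build_inputs(FakeCustomProcessor(), ["hi"])
        assert inputs["input_ids"] == [[104, 105]]

test_custom_processor_inputs()
```

Running the fake processor through both branches of the loop mirrors the "compiled and eager" coverage mentioned in the description without depending on the actual model weights.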