Skip to content

fix: enable Gemma4 tool calling via mlx-vlm parser infrastructure#561

Closed
TipKnuckle wants to merge 1 commit intojundot:mainfrom
TipKnuckle:feat/gemma4-tool-calling
Closed

fix: enable Gemma4 tool calling via mlx-vlm parser infrastructure#561
TipKnuckle wants to merge 1 commit intojundot:mainfrom
TipKnuckle:feat/gemma4-tool-calling

Conversation

@TipKnuckle
Copy link
Copy Markdown
Contributor

Fixes tool calling for Gemma4 VLM models by delegating to mlx-vlm's own parser infrastructure rather than mlx-lm's.

Problem

_inject_tool_calling was importing _infer_tool_parser directly from mlx_lm.tokenizer_utils, which has no knowledge of Gemma4's <|tool_call> template marker. It returned None for Gemma4, so has_tool_calling was never set and raw tool call XML was passed through into the context instead of being parsed.

Change

Switch to mlx-vlm's own _infer_tool_parser and load_tool_module (available since mlx-vlm commit 43b9b20, already pinned by this project):

  • mlx_vlm.tool_parsers._infer_tool_parser calls mlx-lm's version first, then checks VLM-specific extras — including <|tool_call>gemma4. Strictly a superset, no regressions for other models.
  • mlx_vlm.tool_parsers.load_tool_module prefers mlx_vlm.tool_parsers.* over mlx_lm.tool_parsers.*, so Gemma4 gets the correct parser (right tokens, right escape format) rather than the unrelated function_gemma parser from mlx-lm.
  • The existing try/except ImportError guard means this degrades gracefully if run against an older mlx-vlm without the tool_parsers package.

Any future VLM model that mlx-vlm adds tool support for will be picked up automatically with no further changes needed here.

@TipKnuckle TipKnuckle force-pushed the feat/gemma4-tool-calling branch from 62bc4d0 to a2480fc Compare April 3, 2026 20:23
@TipKnuckle
Copy link
Copy Markdown
Contributor Author

Superseded by #565, which includes a complete rewrite of Gemma4 tool calling using a proper message extractor pattern.

The approach here (delegating to mlx-vlm's tool parser infrastructure) turned out to be insufficient — the root cause was that the Gemma4 chat template has no handling for role=tool messages at all. Tool results must be passed as tool_responses on a model-role turn. That transformation, plus the required fixes to _merge_consecutive_roles and _drop_void_assistant_messages, are all in #565.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant