Gemma 4 tool parser breaks on nested arguments (arrays/objects) #914

@TipKnuckle

Bug: Gemma 4 tool parser breaks on nested arguments (arrays/objects)

The Gemma 4 tool call parser in mlx_vlm/tool_parsers/gemma4.py has two bugs that cause it to produce malformed output whenever a tool's arguments contain nested structures (arrays of objects, nested objects, etc.). This makes tools with complex schemas unusable.

Root Cause 1: Non-greedy regex truncates arguments

The regex used to extract tool call arguments:

_tool_call_regex = re.compile(r"call:(\w+)\{(.*?)\}", re.DOTALL)

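(.*?) is non-greedy, so the match ends at the first } after the opening brace. When the arguments contain a nested object (e.g. [{key:val}]), the inner } terminates the match early and everything after it is silently discarded. Making the quantifier greedy would not help either: with several tool calls in one output, .* would run from the first { to the last }.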

Example — given this model output:

call:edit{path:<|"|>test.txt<|"|>,edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}]}

The regex captures only:

path:<|"|>test.txt<|"|>,edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>

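The inner } is consumed as the end of the match, so the captured arguments are missing it, and the trailing ]} (closing the array and the outer object) falls outside the match entirely.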
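The truncation can be checked with nothing but the standard library. This is a minimal sketch that copies the pattern quoted above and does not import mlx_vlm; the variable names are only for illustration.

```python
import re

# Same pattern as quoted above from gemma4.py.
tool_call_regex = re.compile(r"call:(\w+)\{(.*?)\}", re.DOTALL)

text = (
    'call:edit{path:<|"|>test.txt<|"|>,'
    'edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}]}'
)

match = tool_call_regex.search(text)
name, raw_args = match.group(1), match.group(2)
leftover = text[match.end():]  # whatever the regex did not consume

print(name)      # edit
print(raw_args)  # ends at apple<|"|>, just before the inner '}'
print(leftover)  # ]}
```

The non-empty leftover shows that the match ended at the inner brace rather than at the brace that closes the call.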

Root Cause 2: Flat key-value parser can't handle nested structures

Even if the regex captured the full string, the parser splits on : to find keys and on , to find value boundaries. It has no concept of nesting depth, so an array-of-objects value like [{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}] gets split on the inner comma, producing two separate top-level keys instead of one nested value.
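To make the second failure concrete, here is a deliberately naive splitter. It is a hypothetical stand-in (the name naive_flat_parse is made up here), not the code in gemma4.py, but it splits on , and : the way the paragraph above describes and shows the same symptom.

```python
def naive_flat_parse(raw: str) -> dict:
    """Hypothetical flat parser: split on ',' and then on the first ':'.

    It has no notion of nesting depth, so commas inside [...] or {...}
    are treated as top-level separators.
    """
    result = {}
    for part in raw.split(","):
        key, _, value = part.partition(":")
        result[key.strip()] = value.strip()
    return result


raw = 'edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}]'
parsed = naive_flat_parse(raw)

print(list(parsed))     # ['edits', 'oldText']
print(parsed["edits"])  # [{newText:<|"|>orange<|"|>
```

One nested value turns into two top-level keys, and edits is left holding an unparsed fragment.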

Observed behavior

For a tool call like:

call:edit{edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}],path:<|"|>test.txt<|"|>}

The parser produces:

{
  "edits": "[{newText:<|\"|>orange<|\"|>",
  "oldText": "apple"
}
  • edits gets a garbage string fragment instead of an array
  • oldText is promoted to a top-level key (it should be nested inside edits[0])
  • path is lost entirely (truncated by the regex)

Expected behavior

{
  "edits": [{"newText": "orange", "oldText": "apple"}],
  "path": "test.txt"
}

Reproduction

from mlx_vlm.tool_parsers.gemma4 import parse_tool_call

# Any tool call with nested objects/arrays will fail.
# This is the input from "Observed behavior" (edits first, then path).
text = 'call:edit{edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}],path:<|"|>test.txt<|"|>}'
result = parse_tool_call(text)
print(result)
# Produces: {'name': 'edit', 'arguments': {'edits': '[{newText:<|"|>orange<|"|>', 'oldText': 'apple'}}
# Missing: path; edits is a string fragment instead of an array

Fix needed

The parser needs to be replaced with a recursive descent parser that:

  1. Tracks brace/bracket nesting depth to find the correct outer } (or uses balanced-group matching, which Python's built-in re module does not support)
  2. Recursively parses nested {...} objects and [...] arrays instead of flat comma-splitting
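One possible shape for such a parser is sketched below. It is not a patch against gemma4.py. The function names are hypothetical, and the grammar is an assumption inferred from the examples in this issue: bare keys, <|"|> as the string delimiter, no whitespace between tokens, and bare scalars decoded as JSON when possible. Nesting depth is tracked implicitly by the recursion, so no separate brace counter is needed.

```python
import json

QUOTE = '<|"|>'  # string delimiter seen in the examples above


def parse_value(s: str, i: int):
    """Parse one value starting at s[i]; return (value, next_index)."""
    if s.startswith(QUOTE, i):
        end = s.find(QUOTE, i + len(QUOTE))
        if end == -1:
            raise ValueError("unterminated string")
        return s[i + len(QUOTE):end], end + len(QUOTE)
    if s[i] == "{":
        return parse_object(s, i)
    if s[i] == "[":
        return parse_array(s, i)
    # Bare scalar: read up to the next separator or closing bracket.
    j = i
    while j < len(s) and s[j] not in ",}]":
        j += 1
    token = s[i:j].strip()
    try:
        return json.loads(token), j  # numbers, true, false, null
    except json.JSONDecodeError:
        return token, j  # assumption: keep anything else as a string


def parse_object(s: str, i: int):
    """Parse '{key:value,...}' where s[i] == '{'."""
    obj = {}
    i += 1  # skip '{'
    while s[i] != "}":
        colon = s.index(":", i)
        key = s[i:colon].strip()
        value, i = parse_value(s, colon + 1)
        obj[key] = value
        if s[i] == ",":
            i += 1
        elif s[i] != "}":
            raise ValueError(f"unexpected character at index {i}")
    return obj, i + 1  # skip '}'


def parse_array(s: str, i: int):
    """Parse '[value,...]' where s[i] == '['."""
    items = []
    i += 1  # skip '['
    while s[i] != "]":
        value, i = parse_value(s, i)
        items.append(value)
        if s[i] == ",":
            i += 1
        elif s[i] != "]":
            raise ValueError(f"unexpected character at index {i}")
    return items, i + 1  # skip ']'


def parse_call_sketch(text: str) -> dict:
    """Parse a single 'call:name{...}' occurrence (hypothetical helper)."""
    start = text.index("call:") + len("call:")
    brace = text.index("{", start)
    arguments, _ = parse_object(text, brace)
    return {"name": text[start:brace], "arguments": arguments}


text = (
    'call:edit{edits:[{newText:<|"|>orange<|"|>,oldText:<|"|>apple<|"|>}],'
    'path:<|"|>test.txt<|"|>}'
)
print(parse_call_sketch(text))
```

On the input from "Observed behavior" this returns edits as a one-element list of objects and keeps path, matching the expected output above. Because strings are consumed by their delimiter, a } or , inside a quoted value does not confuse the parser. Truncated input raises IndexError or ValueError here; a real fix would need to decide how to report that.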

Environment

  • mlx-vlm version: 0.4.4 (commit 0ea9088)
  • Parser file: mlx_vlm/tool_parsers/gemma4.py (introduced in commit 43b9b20)
  • Affected: any tool with array or object parameters (very common in agentic/MCP use cases)
