fix(nanogpt): implement XML tool call parsing for NanoGPT transformer#1211
Conversation
Code Review
This pull request introduces several significant updates across the backend, frontend, and documentation. Key additions include support for the Fireworks AI provider, the implementation of a "Compact" response endpoint for the OpenAI Responses API, and enhanced dashboard statistics with time-period filtering. The prompt system was expanded to support API Key-based activation conditions, and the model registry was updated with numerous new models and reasoning capability metadata. Additionally, the build system now supports FreeBSD, and a new axonhub-cli skill was added for GraphQL-based management. I have no feedback to provide as there were no review comments.
/gemini review
Code Review
This pull request introduces the xmltools package to parse XML-based tool calls from LLM responses, supporting both standard <use_tool> tags and an alternative format where tool names are used as tags. The NanoGPT model implementation is updated to utilize this parser during message conversion, and extensive integration tests are added. Feedback suggests improving the robustness of the XML parser to prevent data loss on malformed input and adding fallbacks for JSON marshaling failures in the alternative format parser.
if err != nil {
	break
}
The current error handling for decoder.Token() can lead to silent data loss. If an XML syntax error occurs, the loop breaks, and any content following the error is discarded. This could result in parts of the model's response being silently ignored, which is a significant issue.
To make the parser more robust against malformed XML from the LLM, you should handle syntax errors by treating the remainder of the content as plain text instead of dropping it. You can achieve this by recording the decoder's offset before attempting to read a token, and if an error occurs, append the rest of the input from that offset to the textBuilder.
You'll need to add offset := decoder.InputOffset() at the beginning of the for loop for this suggestion to work.
if err != nil {
	if err != io.EOF {
		// Malformed XML, treat the rest of the content as text.
		textBuilder.WriteString(content[offset:])
	}
	break
}

if jsonBytes, err := json.Marshal(v); err == nil {
	tool.Arguments[k] = string(jsonBytes)
}
In the default case of the switch statement, if json.Marshal(v) fails, the argument is silently dropped. While this is an edge case, it could lead to tool calls with missing arguments without any warning, making debugging difficult. For a robust parser, it's better to have a fallback mechanism or at least log the failure.
if jsonBytes, err := json.Marshal(v); err == nil {
	tool.Arguments[k] = string(jsonBytes)
} else {
	// Fallback for types that cannot be marshaled to JSON.
	tool.Arguments[k] = fmt.Sprintf("%v", v)
}

- Document error handling behavior in ToOpenAIMessage() docstring
- Fix race condition in tool call ID generation using content hash
- Support any valid JSON (not just objects) in alternative format
- Add fallback for unknown tool names with uppercase-starting tags
- Document content clearing logic behavior

All review issues from PR looplj#1211 have been addressed.
- Fix streaming aggregator index assignment bug (MEDIUM):
  - Check if tool call exists before incrementing nextToolIndex
  - Prevents non-sequential indices when mixing ID/no-ID chunks
- Document XML parser leniency (LOW):
  - Add comment explaining uppercase letter heuristic
  - Trade-off between compatibility and false positives
- Return errors from Parse() instead of swallowing (LOW):
  - Now returns error when standard parsing fails
  - Callers can make informed decisions about parsing results
- Increase hash length to reduce collision risk (Very Low):
  - Changed from 8 to 16 hex characters (32 to 64 bits)
  - Better safety margin for high-volume scenarios
- Add streaming documentation to model.go (LOW):
  - Document deterministic ID generation strategy
  - Explain streaming behavior and idempotency
…matching validation
…, JSON in content
…>Y</content></Write>
…rPattern single quotes
9ed9002 to c79e685
Had to re-work this multiple times to find a solution that largely works to my liking. Basically, NanoGPT proxies to multiple providers in a round-robin fashion. Each provider may use different quants, so model quality (and tool calling) can vary significantly, with many GLM 5 models using XML tool calling. The PR description notes that sometimes there are fully malformed tool calls, which would be a cat-and-mouse game to fix in axonhub, so I kept the implementation focused on just converting XML tool calls to native tool calls.
Thanks for the great idea.
Summary
This PR adds XML tool call parsing support for the NanoGPT transformer to handle multiple malformed XML formats that AI models output, converting them to OpenAI-compatible tool_calls.
Problem
NanoGPT models output tool calls in various inconsistent XML formats instead of the standard OpenAI format:
These formats were not being parsed, causing test failures and missing tool calls.
Solution
Implemented a regex-based XML parser in llm/transformer/nanogpt/xml_parser.go that:

- Pre-checks content for XML tool call indicators (MaybeHasXMLToolCalls)
- Normalizes common malformations (normalizeXML):
  - <Write attr="..."\n → <Write attr="...">
  - \n</Write_file> → </Write>
  - </use_use> → </use_tool>
  - <Write>content</use_tool> → <Write>content</Write>
- Parses multiple XML formats:
  - <Write file_path="X" content="Y"/>
  - <Write file_path="X">content</Write>
  - <Write>{"file_path":"X"}</use_tool>
  - <Write><file_path>X</file_path><content>Y</content></Write>
  - <Write_File>{...}</Write_File>
- Extracts tool name and arguments, converting to OpenAI format
Changes

- llm/transformer/nanogpt/model.go (+17 lines): Add XML parsing call in ToOpenAIMessage()
- llm/transformer/nanogpt/xml_parser.go (+356 lines): New regex-based XML parser

Key Features

- Bounded regex patterns ([^<] instead of [\s\S]*? where possible)
This PR intentionally does NOT handle the following issues, as they are considered model/provider bugs rather than parsing issues:

- Tool name concatenation: When the model generates malformed tag names like <globglobglobls> or <Write_file_path=...>, these are model output errors. The parser correctly rejects these as unknown tool names.
- Missing closing tags in content: If the model generates <Write>content without any closing tag, this is unrecoverable malformed XML. The parser handles some common cases (like unclosed opening tags) but cannot guess missing closing tags.
- Nested tool calls: The parser does not support tool calls nested inside other tool calls (e.g., <Write><use_tool>...</use_tool></Write>). This pattern has not been observed in production.
- CDATA sections: XML CDATA sections (<![CDATA[...]]>) are not supported, as they have not been observed in model outputs.
- XML namespaces: Namespaced tags like <tools:Write> are not supported, as they have not been observed.

The parser is designed to be lenient toward common malformations but does not attempt to fix fundamentally broken model output. Errors in model generation should be reported upstream to the model provider.