
LLM tool calls fail with JSON parsing errors due to unescaped control characters in arguments #2892

@bjulian5

Description

Describe the bug

LLM providers, particularly Claude models served through Amazon Bedrock, occasionally emit raw control characters (unescaped newlines, tabs, etc.) inside JSON string values in tool call arguments. The result is invalid JSON: parsing fails, the tool is never executed, and the user sees an error.

To Reproduce
Steps to reproduce the behavior:

  1. Configure an Amazon Bedrock Claude model and prompt it to generate a tool call with multi-line content
  2. The model emits tool call arguments containing literal newlines or other control characters
  3. Convert the Bedrock response to the OpenAI specification (common practice for API standardization); the raw control characters are preserved
  4. Process the converted response through Goose's OpenAI format handler
  5. JSON parsing fails with the error: "Could not interpret tool use parameters"

Expected behavior
Tool calls should execute successfully even when LLMs output control characters in JSON string values. The system should automatically escape these characters to produce valid JSON.

Example of problematic LLM output after format conversion:

{
  "command": "echo 'Hello
World'"
}

Expected processed output:

{
  "command": "echo 'Hello\nWorld'"
}

Screenshots
N/A - This is a JSON parsing error that manifests in logs/error messages.

Please provide the following information:

  • OS & Arch: Multiple (affects all platforms)
  • Interface: Both UI/CLI
  • Version: Current development version
  • Extensions enabled: Any extension that uses tool calls
  • Provider & Model: Primarily Amazon Bedrock Claude models, but affects any LLM provider processed through format conversion

Root Cause Analysis

  • Claude models served through Amazon Bedrock are particularly prone to generating unescaped control characters in tool call JSON arguments
  • When LLM responses are converted to OpenAI specification format for consistency, raw control characters from the original provider response are preserved
  • Goose's response parsing logic processes these converted responses without preprocessing to handle malformed control characters
  • Control characters like \n, \r, \t appear literally in the JSON instead of being properly escaped
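The last point can be verified mechanically. As a rough sketch (not Goose code; the function name is hypothetical), a scanner can walk the text and flag any raw control character inside a string literal, which RFC 8259 requires to be escaped:

```rust
/// Hypothetical helper (not from the Goose codebase): returns true if `raw`
/// contains a raw control character (U+0000..U+001F) inside a JSON string
/// literal. RFC 8259 requires such characters to be escaped.
fn has_unescaped_control_char(raw: &str) -> bool {
    let mut in_string = false; // are we inside a "..." literal?
    let mut escaped = false;   // was the previous char an unconsumed backslash?
    for c in raw.chars() {
        if in_string {
            if escaped {
                escaped = false; // char after '\' is part of an escape sequence
            } else {
                match c {
                    '\\' => escaped = true,
                    '"' => in_string = false,
                    c if (c as u32) < 0x20 => return true, // literal control char
                    _ => {}
                }
            }
        } else if c == '"' {
            in_string = true;
        }
    }
    false
}

fn main() {
    // Literal newline inside the string value -> invalid JSON.
    assert!(has_unescaped_control_char("{\"command\": \"echo 'Hello\nWorld'\"}"));
    // Properly escaped form -> valid.
    assert!(!has_unescaped_control_char("{\"command\": \"echo 'Hello\\nWorld'\"}"));
    println!("ok");
}
```

The same string-tracking logic (toggle on unescaped quotes, skip the character after a backslash) is what any preprocessing fix would need, so structural quotes and braces are never confused with string content.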

Impact

  • Tool calls fail unexpectedly, breaking user workflows
  • Error messages are unclear about the root cause
  • Affects reliability across multiple LLM providers due to format conversion processes
  • Particularly impacts users of Amazon Bedrock Claude models

Technical Context
The issue occurs in the response_to_message() functions that process OpenAI-formatted responses:

  • crates/goose-llm/src/providers/formats/openai.rs
  • crates/goose-llm/src/providers/formats/databricks.rs
  • crates/goose/src/providers/formats/openai.rs
  • crates/goose/src/providers/formats/databricks.rs

These functions extract the arguments string from tool calls and attempt to parse it as JSON without preprocessing to handle malformed control characters that may have been preserved during format conversion.

Proposed Solution
Implement a preprocessing step that:

  1. Escapes literal control characters in JSON argument strings before parsing
  2. Preserves structural JSON elements (quotes, braces, etc.)
  3. Provides better error messages for debugging
  4. Applies consistently across format handlers to handle responses from various providers
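A minimal sketch of step 1, using only the standard library (the name `escape_control_chars_in_strings` is made up; Goose's actual fix may differ): walk the argument text, track whether the cursor is inside a string literal, and rewrite raw control characters into their JSON escape sequences while leaving quotes and braces untouched:

```rust
/// Hypothetical preprocessing sketch (not Goose's actual implementation):
/// escape raw control characters that appear inside JSON string literals,
/// leaving structural JSON (quotes, braces, commas) untouched.
fn escape_control_chars_in_strings(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut in_string = false; // inside a "..." literal?
    let mut escaped = false;   // previous char was an unconsumed backslash?
    for c in raw.chars() {
        if in_string {
            if escaped {
                out.push(c); // keep existing escape sequences as-is
                escaped = false;
                continue;
            }
            match c {
                '\\' => { out.push(c); escaped = true; }
                '"' => { out.push(c); in_string = false; }
                '\n' => out.push_str("\\n"),
                '\r' => out.push_str("\\r"),
                '\t' => out.push_str("\\t"),
                c if (c as u32) < 0x20 => {
                    // Other control characters get the generic \uXXXX form.
                    out.push_str(&format!("\\u{:04x}", c as u32));
                }
                c => out.push(c),
            }
        } else {
            if c == '"' { in_string = true; }
            out.push(c);
        }
    }
    out
}

fn main() {
    let raw = "{\"command\": \"echo 'Hello\nWorld'\"}";
    let fixed = escape_control_chars_in_strings(raw);
    assert_eq!(fixed, "{\"command\": \"echo 'Hello\\nWorld'\"}");
    println!("{}", fixed);
}
```

Running the arguments string through a helper like this before the existing parse call would keep the change localized to the format handlers, and because already-escaped sequences are passed through untouched, well-formed responses are unaffected.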

Additional context
This bug affects systems that standardize LLM provider responses to OpenAI format. The fix needs to be applied across format handlers to ensure consistent behavior regardless of the underlying LLM provider that generated the original response.

Labels: p1 (Priority 1 - High, supports roadmap)