Skip to content

Comments

fix: sanitize tool arguments in session history#3647

Open
nhangen wants to merge 3 commits intoopenclaw:mainfrom
nhangen:fix/session-sanitization
Open

fix: sanitize tool arguments in session history#3647
nhangen wants to merge 3 commits intoopenclaw:mainfrom
nhangen:fix/session-sanitization

Conversation

@nhangen
Copy link

@nhangen nhangen commented Jan 29, 2026

Pull Request: fix: sanitize tool arguments in session history

📋 Summary

This PR implements robust sanitization for tool arguments in session history messages. It specifically targets potential corruption where tool input arguments contain invalid JSON. The fix detects these malformed inputs and replaces them with an empty object {} during message loading, preventing the entire session from crashing due to InternalError.Algo.InvalidParameter.

🎯 Related Issues

Closes # (Session Crash Bugs)

🚀 What's New

Core Changes

1. Session Transcript Repair

Purpose: To make the agent resilient to malformed session data ("dirty memory") caused by previous model errors or hallucinations.

Implementation:

  • Added sanitizeToolUseArgs function in src/agents/session-transcript-repair.ts.
  • Integrated this sanitization step into the sanitizeToolUseResultPairing pipeline.
  • It parses every toolUse input block; if JSON.parse fails, it catches the error and repairs the block.

Key Code:

// src/agents/session-transcript-repair.ts
if (typeof (block as any).input === "string") {
  try {
    JSON.parse((block as any).input);
    nextContent.push(block);
  } catch {
    // Invalid JSON found in tool args.
    // Replace with empty object to prevent downstream crashes.
    nextContent.push({
      ...block,
      input: {}, // Fixed
      _sanitized: true,
      _originalInput: (block as any).input,
    });
    msgChanged = true;
  }
}

📊 Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 📝 Documentation update
  • 🔧 Configuration change
  • ♻️ Code refactoring (no functional changes)
  • ⚡ Performance improvement
  • 🎨 UI/UX change
  • 🧪 Test coverage improvement
  • 🔒 Security fix

🧪 Testing

Automated Tests

  • Unit tests added/updated
  • Integration tests added/updated
  • All existing tests pass

Test Results:

Verified with a reproduction script (verify-sanitization.js) that constructed a message with a malformed input string "{ bad: json }".

  • Before fix: Crash.
  • After fix: Input replaced with {} and script succeeded.

Manual Testing

Testing Checklist:

  • Tested in development environment
  • Tested with real data/production-like scenarios
  • Tested error scenarios
  • Verified no console errors/warnings

Environments Tested:

  • Development

🚀 Deployment Strategy

Deployment Steps

  1. Merge PR.
  2. Rebuild agent (npm run build).
  3. Restart Moltbot services.

Configuration Changes

None.

🔍 Code Quality

  • ESLint passed (auto-checked by pre-commit hooks)
  • Prettier formatting applied (auto-checked by pre-commit hooks)
  • Commit messages follow conventional commits
  • Code reviewed by AI agent or peer
  • Error handling implemented for edge cases

🔐 Security Considerations

  • Input validation added for user input (Sanitization protects against crashes)

🚦 Status

  • 🔴 Draft - Work in progress
  • 🟡 Ready for Review - Code complete, needs review
  • 🟢 Approved - Ready to merge
  • 🔵 Merged - Deployed to staging
  • ✅ Complete - Deployed to production

Greptile Overview

Greptile Summary

This PR adds a transcript-repair step (sanitizeToolUseArgs) that scans assistant tool-call blocks in session history and normalizes tool arguments: valid JSON strings in input/arguments are parsed into objects, and malformed JSON strings are replaced with {} while tagging the block as sanitized. The sanitization is then run as part of sanitizeToolUseResultPairing before repairing toolResult ordering/duplication.

Overall this strengthens resilience against “dirty” session files that would otherwise crash or be rejected by strict providers when tool-call arguments are not valid JSON.

Confidence Score: 4/5

  • This PR is mostly safe to merge; main behavior change is limited to session-history repair logic.
  • The change is localized and has added tests, but the current parsing accepts non-object JSON (e.g., null/arrays) which can still violate downstream tool schema expectations, and the new warning log may inadvertently leak sensitive tool inputs.
  • src/agents/session-transcript-repair.ts

(4/5) You can add custom instructions or style guidelines for the agent here!

Copilot AI review requested due to automatic review settings January 29, 2026 00:03
@openclaw-barnacle openclaw-barnacle bot added the agents Agent runtime and tooling label Jan 29, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements sanitization for tool arguments in session history to prevent crashes caused by malformed JSON in tool input fields. When invalid JSON is detected in tool use blocks, it is replaced with an empty object {} to prevent InternalError.Algo.InvalidParameter errors that would otherwise crash the entire session.

Changes:

  • Added sanitizeToolUseArgs function to detect and repair malformed JSON in tool input arguments
  • Integrated sanitization into the sanitizeToolUseResultPairing pipeline to ensure it runs before tool result pairing repair
  • Preserved original malformed input in _originalInput metadata field for debugging

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nhangen
Copy link
Author

nhangen commented Jan 29, 2026

Working on addressing copilot feedback and passing checks.

Addresses review feedback: parse valid JSON strings, add type guards, add logging, handle arguments alias.
@nhangen nhangen force-pushed the fix/session-sanitization branch from f1a3e18 to ccf00e1 Compare January 29, 2026 15:06
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 3, 2026

Additional Comments (2)

src/agents/session-transcript-repair.ts
[P1] sanitizeToolUseArgs will parse any JSON string (including primitives/arrays) and store it back into input/arguments, even though downstream code/tools typically expect an object.

If toolBlock[inputField] is something like "null", "[]", or "\"str\"", JSON.parse succeeds and you’ll set input to null/array/string, which can still trigger InvalidParameter-style schema/type errors later.

Consider only accepting parsed values where parsed is a non-null plain object; otherwise treat it as malformed and replace with {} (or wrap as { value: parsed } if you need to preserve it).

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.ts
Line: 99:103

Comment:
[P1] `sanitizeToolUseArgs` will parse any JSON string (including primitives/arrays) and store it back into `input`/`arguments`, even though downstream code/tools typically expect an object.

If `toolBlock[inputField]` is something like `"null"`, `"[]"`, or `"\"str\""`, `JSON.parse` succeeds and you’ll set `input` to `null`/array/string, which can still trigger `InvalidParameter`-style schema/type errors later.

Consider only accepting parsed values where `parsed` is a non-null plain object; otherwise treat it as malformed and replace with `{}` (or wrap as `{ value: parsed }` if you need to preserve it).

How can I resolve this? If you propose a fix, please make it concise.

src/agents/session-transcript-repair.ts
[P2] The warning log includes a snippet of the original tool input (Original: ${sample}), which can leak sensitive data from session history (paths, tokens, prompts, etc.) into logs.

If these logs are shipped/retained, consider logging only metadata (tool name + length + error type) or truncating more aggressively/redacting common secret patterns.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/session-transcript-repair.ts
Line: 111:114

Comment:
[P2] The warning log includes a snippet of the original tool input (`Original: ${sample}`), which can leak sensitive data from session history (paths, tokens, prompts, etc.) into logs.

If these logs are shipped/retained, consider logging only metadata (tool name + length + error type) or truncating more aggressively/redacting common secret patterns.

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant