Fix #4650: Re-run tool pairing and turn validation after history limiting #4736
Closed
spiceoogway wants to merge 14 commits into openclaw:main
…d Opus Resolves openclaw#4315. The slug-generator embedded run was hardcoded to use DEFAULT_MODEL (claude-opus-4-5) regardless of the user's configured agents.defaults.model.primary. This caused unexpected Opus charges on every /new command. Now uses resolveDefaultModelForAgent() to honor the user's configured default model, falling back to DEFAULT_MODEL only when no config exists.
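The fallback order described in this commit can be sketched as follows. This is a minimal illustration, not the repository's resolveDefaultModelForAgent(): the config shape and the `...Sketch` names are assumptions, and only the agents.defaults.model.primary path and the claude-opus-4-5 default come from the commit message.

```typescript
// Hypothetical config shape; only the agents.defaults.model.primary path
// is taken from the commit message.
interface AgentConfigSketch {
  agents?: { defaults?: { model?: { primary?: string } } };
}

// The commit message names claude-opus-4-5 as the hardcoded DEFAULT_MODEL.
const DEFAULT_MODEL_SKETCH = "claude-opus-4-5";

// Prefer the user's configured primary model; fall back to the default
// only when no usable value is configured.
function resolveDefaultModelSketch(config?: AgentConfigSketch): string {
  const primary = config?.agents?.defaults?.model?.primary?.trim();
  return primary ? primary : DEFAULT_MODEL_SKETCH;
}
```

With this shape, a missing config and a blank primary value both fall back to the default, which is one plausible reading of "only when no config exists".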
…ixes openclaw#4367)

The message processing pipeline had a synchronization bug where limitHistoryTurns() truncated conversation history AFTER repairToolUseResultPairing() had already fixed tool_use/tool_result pairings. This could split assistant messages (with tool_use) from their corresponding tool_result blocks, creating orphaned tool_result blocks that the Anthropic API rejects.

This fix calls sanitizeToolUseResultPairing() AFTER limitHistoryTurns() to repair any pairings broken by truncation, ensuring the transcript remains valid before being sent to the LLM API.

Changes:
- Added import for sanitizeToolUseResultPairing from session-transcript-repair.js
- Call sanitizeToolUseResultPairing() on the limited message array
- Updated variable name from 'limited' to 'repaired' for clarity
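To make the ordering problem concrete, here is a small sketch of truncation followed by repair. It assumes an Anthropic-style transcript in which tool_result blocks sit in the user message that follows the assistant's tool_use. The types, `limitSketch`, and `dropOrphanedToolResultsSketch` are hypothetical stand-ins, not the repository's limitHistoryTurns() or sanitizeToolUseResultPairing().

```typescript
type BlockSketch =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface TurnSketch {
  role: "user" | "assistant";
  content: BlockSketch[];
}

// Stand-in for history limiting: keep only the last `maxMessages` messages.
function limitSketch(messages: TurnSketch[], maxMessages: number): TurnSketch[] {
  return maxMessages > 0 ? messages.slice(-maxMessages) : [];
}

// Drop tool_result blocks whose tool_use id is not in the immediately
// preceding assistant message; drop user messages left with no content.
function dropOrphanedToolResultsSketch(messages: TurnSketch[]): TurnSketch[] {
  const out: TurnSketch[] = [];
  for (const msg of messages) {
    if (msg.role !== "user") {
      out.push(msg);
      continue;
    }
    const prev = out[out.length - 1];
    const knownIds = new Set<string>();
    if (prev && prev.role === "assistant") {
      for (const block of prev.content) {
        if (block.type === "tool_use") knownIds.add(block.id);
      }
    }
    const kept = msg.content.filter(
      (block) => block.type !== "tool_result" || knownIds.has(block.tool_use_id),
    );
    if (kept.length > 0) out.push({ role: msg.role, content: kept });
  }
  return out;
}
```

Running the repair only before `limitSketch` would leave the orphan in place whenever the cut falls between a tool_use and its tool_result, which is the case this commit addresses by repairing after the limit.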
- Enhanced SubagentRunOutcome type with errorType and errorHint fields
- Added categorizeError() helper to classify common error patterns:
  * File system errors (ENOENT, EACCES, etc.)
  * API/model errors (rate limits, auth failures, invalid requests)
  * Network errors (connection refused, DNS failures)
  * Timeout errors
  * Configuration errors (missing credentials, quota limits)
- Updated error emission in agent-runner-execution.ts to categorize errors
- Updated subagent-registry.ts to capture and propagate new error fields
- Added buildErrorStatusLabel() helper for user-friendly error messages
- Error announcements now include error type and remediation hints

Example improved messages:
- Before: 'failed: unknown error'
- After: 'failed (tool error): ENOENT — File or directory not found'

This makes subagent failures much easier to understand and debug while maintaining backward compatibility.
- Tests timeout errors (timeout keyword, 'timed out', ETIMEDOUT)
- Tests authentication errors (401, unauthorized, 403, forbidden)
- Tests rate limit errors (rate limit keyword, HTTP 429)
- Tests unknown/unrecognized errors
- Tests API/model errors (400, 500, 503, quota, billing)
- Tests network errors (ECONNREFUSED, ENOTFOUND, DNS, ENETUNREACH)
- Tests file system errors (ENOENT, EACCES, EPERM, EISDIR)
- Tests configuration errors (missing key/token)
- Tests context/memory errors
- Export categorizeError() function for testing
- 37 passing tests covering all error categorization paths
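A pattern-based classifier of the kind these two commits describe might look like the sketch below. The category names, the regular expressions, and the rule order are all assumptions made for illustration; only the error keywords (ENOENT, 429, ETIMEDOUT, and so on) are taken from the commit messages, and the real categorizeError() may differ in every detail.

```typescript
// Hypothetical category labels; the commit messages do not list the real ones.
type ErrorCategorySketch =
  | "timeout" | "rate_limit" | "auth" | "network"
  | "filesystem" | "config" | "context" | "api" | "unknown";

// First matching rule wins, so more specific patterns come first.
const RULES_SKETCH: ReadonlyArray<readonly [RegExp, ErrorCategorySketch]> = [
  [/timeout|timed out|ETIMEDOUT/i, "timeout"],
  [/rate limit|\b429\b/i, "rate_limit"],
  [/\b(401|403)\b|unauthorized|forbidden/i, "auth"],
  [/ECONNREFUSED|ENOTFOUND|ENETUNREACH|\bDNS\b/i, "network"],
  [/ENOENT|EACCES|EPERM|EISDIR/, "filesystem"],
  [/missing (api )?(key|token|credentials)/i, "config"],
  [/context (length|window)|out of memory/i, "context"],
  [/\b(400|500|503)\b|quota|billing/i, "api"],
];

function categorizeErrorSketch(message: string): ErrorCategorySketch {
  for (const [pattern, category] of RULES_SKETCH) {
    if (pattern.test(message)) return category;
  }
  return "unknown";
}
```

Keeping the rules in an ordered table makes the precedence explicit, which matters for messages that match more than one pattern, such as a 429 response that also mentions a quota.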
Fixes openclaw#4107

Problem:
- Cron jobs with systemEvent payloads enqueue events as passive text
- Agent sees events but doesn't autonomously process them
- Expected behavior: agent should execute complex tasks (spawn subagents, etc.)

Root Cause:
- systemEvents are injected as 'System: [timestamp] text' messages
- Standard heartbeat prompt is passive: 'Read HEARTBEAT.md... reply HEARTBEAT_OK'
- No explicit instruction to process and act on systemEvents
- Exec completion events already had directive prompt, but generic events didn't

Solution:
- Added SYSTEM_EVENT_PROMPT constant with directive language
- Modified heartbeat runner to detect generic (non-exec) systemEvents
- When generic events are present, use directive prompt instead of passive one
- Directive prompt explicitly instructs agent to:
  1. Check HEARTBEAT.md for event-specific handling
  2. Execute required actions (spawn subagent, perform task, etc.)
  3. Reply HEARTBEAT_OK only if no action needed

Changes:
- src/infra/heartbeat-runner.ts:
  - Added SYSTEM_EVENT_PROMPT constant (lines 100-105)
  - Modified prompt selection logic to detect generic systemEvents (lines 510-523)
  - Updated Provider field to reflect 'system-event' context

Testing:
- Build passes (TypeScript compilation successful)
- Pattern follows existing exec-event precedent
- Backward compatible (only affects sessions with pending systemEvents)

Follow-up:
- Add integration test for cron → subagent spawn workflow
- Update documentation on systemEvent best practices
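The prompt selection described in this commit could be sketched roughly as below. The event shape, the prompt wording, and the precedence of generic over exec events are assumptions for illustration; the commit message gives only the passive prompt in elided form and the three directive steps, so the strings here are placeholders rather than the real SYSTEM_EVENT_PROMPT.

```typescript
// Hypothetical event shape; the real heartbeat runner may track events differently.
interface PendingSystemEventSketch {
  kind: "exec" | "generic";
  text: string;
}

// Placeholder prompt texts. The passive one is quoted (elided) in the commit message.
const PASSIVE_PROMPT_SKETCH = "Read HEARTBEAT.md... reply HEARTBEAT_OK";
const EXEC_EVENT_PROMPT_SKETCH = "(placeholder) directive prompt for exec completion events";
const SYSTEM_EVENT_PROMPT_SKETCH = [
  "(placeholder) System events are pending.",
  "1. Check HEARTBEAT.md for event-specific handling.",
  "2. Execute required actions (spawn subagent, perform task, etc.).",
  "3. Reply HEARTBEAT_OK only if no action is needed.",
].join("\n");

// Assumed precedence: generic events first, then exec events, then the passive default.
function selectHeartbeatPromptSketch(events: PendingSystemEventSketch[]): string {
  if (events.some((e) => e.kind === "generic")) return SYSTEM_EVENT_PROMPT_SKETCH;
  if (events.some((e) => e.kind === "exec")) return EXEC_EVENT_PROMPT_SKETCH;
  return PASSIVE_PROMPT_SKETCH;
}
```

Sessions with no pending events still get the passive prompt, which matches the commit's claim that the change only affects sessions with pending systemEvents.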
…docs

Implements Telegram user account support via GramJS/MTProto (openclaw#937).

## What's New
- Complete GramJS channel adapter for user accounts (not bots)
- Interactive auth flow (phone → SMS → 2FA)
- Session persistence via StringSession
- DM and group message support
- Security policies (allowFrom, dmPolicy, groupPolicy)
- Multi-account configuration

## Files Added

### Core Implementation (src/telegram-gramjs/)
- auth.ts - Interactive authentication flow
- auth.test.ts - Auth flow tests (mocked)
- client.ts - GramJS TelegramClient wrapper
- config.ts - Config adapter for multi-account
- gateway.ts - Gateway adapter (poll/send)
- handlers.ts - Message conversion (GramJS → openclaw)
- handlers.test.ts - Message conversion tests
- setup.ts - CLI setup wizard
- types.ts - TypeScript type definitions
- index.ts - Module exports

### Configuration
- src/config/types.telegram-gramjs.ts - Config schema

### Plugin Extension
- extensions/telegram-gramjs/index.ts - Plugin registration
- extensions/telegram-gramjs/src/channel.ts - Channel plugin implementation
- extensions/telegram-gramjs/openclaw.plugin.json - Plugin manifest
- extensions/telegram-gramjs/package.json - Dependencies

### Documentation
- docs/channels/telegram-gramjs.md - Complete setup guide (14KB)
- GRAMJS-PHASE1-SUMMARY.md - Implementation summary

### Registry
- src/channels/registry.ts - Added telegram-gramjs to CHAT_CHANNEL_ORDER

## Test Coverage
- ✅ Auth flow with phone/SMS/2FA (mocked)
- ✅ Phone number validation
- ✅ Session verification
- ✅ Message conversion (DM, group, reply)
- ✅ Session key routing
- ✅ Command extraction
- ✅ Edge cases (empty messages, special chars)

## Features Implemented (Phase 1)
- ✅ User account authentication via MTProto
- ✅ DM message send/receive
- ✅ Group message send/receive
- ✅ Reply context preservation
- ✅ Security policies (pairing, allowlist)
- ✅ Multi-account support
- ✅ Session persistence
- ✅ Command detection

## Next Steps (Phase 2)
- Media support (photos, videos, files)
- Voice messages and stickers
- Message editing and deletion
- Reactions
- Channel messages

## Documentation Highlights
- Getting API credentials from my.telegram.org
- Interactive setup wizard walkthrough
- DM and group policies configuration
- Multi-account examples
- Rate limits and troubleshooting
- Security best practices
- Migration guide from Bot API

Closes openclaw#937 (Phase 1)
Implements WebSocket ping/pong keepalive mechanism to prevent random disconnects with close code 1006 (abnormal closure).

Changes:
- Add 20-second ping interval when WebSocket connection is open
- Monitor pong responses for connection health
- Enhanced error and close event logging with detailed diagnostics
- Clean up ping interval when connection closes

This fix addresses the root cause of issue openclaw#4142 where network intermediaries (NAT, firewalls, proxies) timeout idle WebSocket connections. The 20-second ping interval is industry standard practice for preventing idle connection timeouts.

Fixes openclaw#4142
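A keepalive loop along these lines is sketched below against a small hypothetical socket interface, so it does not depend on any particular WebSocket library. Terminating the connection after a missed pong is an assumption of this sketch; the commit message only says pong responses are monitored. The 20-second default comes from the commit message.

```typescript
// Hypothetical minimal socket surface needed by the keepalive logic.
interface PingableSocketSketch {
  ping(): void;
  terminate(): void;
  on(event: "pong" | "close", listener: () => void): void;
}

// Pure state machine: one tick per interval, reset whenever a pong arrives.
function createKeepaliveSketch(socket: PingableSocketSketch) {
  let awaitingPong = false;
  return {
    tick(): void {
      if (awaitingPong) {
        // Previous ping was never answered: treat the connection as dead.
        socket.terminate();
        return;
      }
      awaitingPong = true;
      socket.ping();
    },
    onPong(): void {
      awaitingPong = false;
    },
  };
}

// Wires the state machine to a timer and cleans the timer up on close.
function startKeepaliveSketch(socket: PingableSocketSketch, intervalMs = 20_000): () => void {
  const keepalive = createKeepaliveSketch(socket);
  const timer = setInterval(() => keepalive.tick(), intervalMs);
  const stop = (): void => clearInterval(timer);
  socket.on("pong", () => keepalive.onPong());
  socket.on("close", stop);
  return stop;
}
```

Splitting the tick logic from the timer keeps the health check testable without waiting for real intervals, and the returned `stop` function covers the "clean up ping interval" step.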
- Prefix unused variable _isExecEvent in heartbeat-runner.ts
- Remove unused TelegramGramJSAccountConfig import from config.ts
- Remove unused SessionOptions import from client.ts
- Prefix unused proxy parameter as _proxy in client.ts
- Prefix unused parameters _gramjsContext and _botUsername in handlers.ts
- Remove unused OpenClawConfig import from types.ts
- Remove unused AuthState import from auth.test.ts

Fixes lint CI failure on PR openclaw#4465
Fixes openclaw#4529

The moltbot compatibility package was broken because bin/moltbot.js was missing. This file is referenced in package.json but was never created during the openclaw rename.

This commit adds the missing binary that:
- Forwards all commands to openclaw
- Shows clear deprecation warning
- Guides users to migrate to openclaw package

Without this fix, users who upgraded from moltbot to openclaw by git pull could not use the gateway because their global moltbot installation had no working binary.

Test: node packages/moltbot/bin/moltbot.js --version (requires built openclaw)
…ory limiting

After limitHistoryTurns or compaction operations, the message history can end up with:
1. Orphaned tool_result blocks (tool_use IDs not in preceding message)
2. Consecutive assistant messages that violate API format requirements

Root cause: sanitizeToolUseResultPairing and turn validation functions (validateGeminiTurns, validateAnthropicTurns) were only running BEFORE limitHistoryTurns, but history limiting can create new issues by cutting off tool_use blocks or creating consecutive assistant messages during compaction.

Solution: Re-run both sanitizeToolUseResultPairing and turn validation AFTER limitHistoryTurns in both main code paths:
- src/agents/pi-embedded-runner/run/attempt.ts
- src/agents/pi-embedded-runner/compact.ts

This ensures tool_use/tool_result pairing is maintained and consecutive assistant/user messages are properly merged after any history modification.

Changes:
- Added transcriptPolicy checks before applying repairs (respects config)
- Re-run validateGeminiTurns after sanitizeToolUseResultPairing
- Re-run validateAnthropicTurns after validateGeminiTurns
- Added import for sanitizeToolUseResultPairing in compact.ts
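The second failure mode, consecutive messages with the same role, is usually handled by merging them. The sketch below shows one way to do that on a simplified message shape. It is an illustration only: the `SimpleTurnSketch` type and the merge function are hypothetical, and the repository's validateGeminiTurns and validateAnthropicTurns may apply different rules per provider.

```typescript
// Simplified message shape for illustration; content is a list of text parts.
interface SimpleTurnSketch {
  role: "user" | "assistant";
  content: string[];
}

// Merge runs of same-role messages so that roles strictly alternate.
function mergeConsecutiveTurnsSketch(messages: SimpleTurnSketch[]): SimpleTurnSketch[] {
  const out: SimpleTurnSketch[] = [];
  for (const msg of messages) {
    const prev = out[out.length - 1];
    if (prev && prev.role === msg.role) {
      // Same role as the previous kept message: append content to it.
      prev.content.push(...msg.content);
    } else {
      // Copy so the caller's messages are never mutated.
      out.push({ role: msg.role, content: [...msg.content] });
    }
  }
  return out;
}
```

A step like this has to run after any operation that removes messages, because deleting a user turn (for example, one that held only an orphaned tool_result) can leave two assistant turns adjacent.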
…sanitization)

Add comprehensive integration tests demonstrating that re-running sanitization after limitHistoryTurns correctly handles:
- Orphaned tool_result blocks (when tool_use is separated by compaction)
- Consecutive assistant messages
- Complex scenarios with both issues
- Preservation of valid tool pairings

These tests verify the fix in PR openclaw#4736 works as intended.
78e82a9 to a5f1e14
CLAWDINATOR FIELD REPORT // PR Closure

I am CLAWDINATOR — cybernetic crustacean, maintainer triage bot for OpenClaw. I was sent from the future to keep this repo shipping clean code.

This PR has a diff size inconsistent with the claimed fix — likely stale base branch or committed artifacts. Please rebase cleanly on main and resubmit if the fix is still needed.

TERMINATED.

🤖 This is an automated message from CLAWDINATOR, the OpenClaw maintainer bot.
Summary
Fixes #4650 - Tool Use/Result Pairing and Consecutive Assistant Messages After History Limiting
Problem
After limitHistoryTurns or compaction operations, the message history can end up with:
- Orphaned tool_result blocks that reference tool_use IDs not present in the immediately preceding assistant message
- Consecutive assistant messages that violate API format requirements

This causes the Anthropic API to reject requests with errors like:
Root Cause
The sanitizeToolUseResultPairing and turn validation functions (validateGeminiTurns, validateAnthropicTurns) were only running before limitHistoryTurns. When history is limited, it can:
- Cut off a tool_use block while keeping the corresponding tool_result in a later message (orphaned tool result)
- Create consecutive assistant messages during compaction

Solution
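Re-run both sanitizeToolUseResultPairing and turn validation after limitHistoryTurns in both main code paths:
- src/agents/pi-embedded-runner/run/attempt.ts
- src/agents/pi-embedded-runner/compact.ts

This ensures:
- tool_use/tool_result pairing is maintained
- Consecutive assistant/user messages are properly merged after any history modification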
Changes
attempt.ts: Added re-validation after limitHistoryTurns
- sanitizeToolUseResultPairing (respects transcriptPolicy.repairToolUseResultPairing)
- validateGeminiTurns (respects transcriptPolicy.validateGeminiTurns)
- validateAnthropicTurns (respects transcriptPolicy.validateAnthropicTurns)

compact.ts: Added same re-validation logic after limitHistoryTurns
- Added import for sanitizeToolUseResultPairing

Testing
Additional Notes
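As an illustration of the policy-gated re-validation listed under Changes, the sketch below runs the three steps in the stated order, each behind its transcriptPolicy flag. The flag names come from the PR description; the step functions are injected as parameters because their real signatures are not shown here, and `revalidateAfterLimitSketch` itself is a hypothetical helper rather than code from attempt.ts or compact.ts.

```typescript
// Flag names as given in the PR description.
interface TranscriptPolicySketch {
  repairToolUseResultPairing: boolean;
  validateGeminiTurns: boolean;
  validateAnthropicTurns: boolean;
}

type TranscriptStepSketch<T> = (messages: T[]) => T[];

// The real functions are passed in; their actual signatures are assumed here.
interface RevalidationStepsSketch<T> {
  sanitizeToolUseResultPairing: TranscriptStepSketch<T>;
  validateGeminiTurns: TranscriptStepSketch<T>;
  validateAnthropicTurns: TranscriptStepSketch<T>;
}

// Apply each enabled step, in order, to the already-limited history.
function revalidateAfterLimitSketch<T>(
  limited: T[],
  policy: TranscriptPolicySketch,
  steps: RevalidationStepsSketch<T>,
): T[] {
  let result = limited;
  if (policy.repairToolUseResultPairing) result = steps.sanitizeToolUseResultPairing(result);
  if (policy.validateGeminiTurns) result = steps.validateGeminiTurns(result);
  if (policy.validateAnthropicTurns) result = steps.validateAnthropicTurns(result);
  return result;
}
```

Pairing repair runs first in this sketch because removing orphaned blocks can itself create adjacent same-role messages, which the turn validators then merge.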