Tool-pair summarization creates hidden messages that are silently deleted on next client round-trip #7413
Description
Summary
When tool-pair summarization fires, it creates server-side hidden messages (userVisible: false, agentVisible: true) and mutates prior message metadata without emitting a HistoryReplaced event. The UI never learns about these changes. On the next user turn, the UI sends conversation_so_far containing its stale copy of the conversation, and the server calls replace_conversation with it — permanently deleting the hidden summaries and potentially reverting metadata changes the server just made.
This is in contrast to full compaction (compact_messages), which correctly emits HistoryReplaced and keeps the UI in sync.
Affected Code
- crates/goose/src/agents/agent.rs, tool-pair summarization block (~L1580–1608): calls update_message_metadata and add_message but never emits AgentEvent::HistoryReplaced
- crates/goose-server/src/routes/reply.rs (~L288–305): unconditionally calls replace_conversation with client-supplied conversation_so_far
- crates/goose/src/agents/agent.rs (~L951, ~L1034, ~L1493): the three places where HistoryReplaced IS emitted; tool-pair summarization is the missing fourth
Reproduction
Setup: Set GOOSE_TOOL_CALL_CUTOFF: 3 in ~/.config/goose/config.yaml to make summarization trigger quickly.
- Start a fresh session.
- Send a message that forces chained tool calls:
  "List the files in /tmp, read the 3 most recently modified ones, and tell me what you find"
- After the response, query the session DB:
  SELECT message_id, role, metadata_json FROM messages WHERE session_id = (SELECT id FROM sessions ORDER BY updated_at DESC LIMIT 1) ORDER BY created_timestamp;
  Observe: a new userVisible: false, agentVisible: true message (the hidden summary) and one or two messages now marked agentVisible: false.
- Send any follow-up message ("thanks").
- Query the DB again.
Expected: hidden summary persists; agentVisible: false metadata preserved.
Actual: hidden summary is deleted and the server has to re-run summarization from scratch. Other messages are also dropped (msg_6df69722 in the capture below).
Evidence
Captured from a live test session:
After turn 1:
msg_6df69722 assistant agentVisible:false ← summarized, hidden from AI
msg_7e3e673e user agentVisible:false ← summarized, hidden from AI
msg_803f5a4f user userVisible:false, agentVisible:true ← hidden summary
After sending "thanks":
msg_011qCnLc assistant agentVisible:false ← different message ID, msg_6df69722 gone
msg_7e3e673e user agentVisible:false ← preserved
msg_cc2f0950 user userVisible:false, agentVisible:true ← brand new summary
msg_803f5a4f is completely gone. The server recreated the work it just did.
Impact
- Tool-pair summarization silently undoes itself every turn
- Agent context is not compressed as intended — tool pairs the system tried to retire may re-enter the context window
- In long sessions where stale tool results (e.g. failed extension checks) were being compressed away, those results can persist in context longer than expected, contributing to stale reasoning loops
- Severity: medium. No data loss is visible to the user, and the system partially self-heals by re-running summarization. But the intended token economy and context hygiene of tool-pair summarization are effectively broken.
Root Cause
maybe_summarize_tool_pair in agent.rs calls update_message_metadata and session_manager.add_message but does not emit AgentEvent::HistoryReplaced. The UI's local state therefore never reflects the server's metadata mutations. On the next /reply request, conversation_so_far carries the stale pre-mutation state, and reply.rs calls replace_conversation unconditionally, overwriting the server's changes.
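The overwrite sequence can be reproduced in a small standalone model. This is a sketch only: `Message`, `Server`, `summarize_tool_pair`, and `simulate_stale_round_trip` are hypothetical stand-ins, not goose's real types, and the model reverts all metadata on replace, whereas the captured session shows some metadata surviving.

```rust
// Simplified, hypothetical model of the session store; not goose's real types.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

impl Message {
    fn new(id: &str, user_visible: bool, agent_visible: bool) -> Self {
        Self { id: id.to_string(), user_visible, agent_visible }
    }
}

struct Server {
    conversation: Vec<Message>,
}

impl Server {
    /// Models tool-pair summarization: hide the summarized pair from the
    /// agent and append a summary that is hidden from the user.
    fn summarize_tool_pair(&mut self, pair_ids: &[&str], summary_id: &str) {
        for msg in self.conversation.iter_mut() {
            if pair_ids.contains(&msg.id.as_str()) {
                msg.agent_visible = false;
            }
        }
        self.conversation.push(Message::new(summary_id, false, true));
    }

    /// Models the unconditional replace performed on each reply request.
    fn replace_conversation(&mut self, conversation_so_far: Vec<Message>) {
        self.conversation = conversation_so_far;
    }
}

/// Runs the failing sequence and returns the server's final conversation.
fn simulate_stale_round_trip() -> Vec<Message> {
    let mut server = Server {
        conversation: vec![
            Message::new("tool_request", true, true),
            Message::new("tool_response", true, true),
        ],
    };
    // The client snapshots history before summarization runs.
    let mut client_copy = server.conversation.clone();

    // Summarization mutates server state; no event reaches the client.
    server.summarize_tool_pair(&["tool_request", "tool_response"], "summary");

    // Next turn: the client appends its new message and sends the stale copy.
    client_copy.push(Message::new("thanks", true, true));
    server.replace_conversation(client_copy);
    server.conversation
}
```

In this model the returned conversation contains only the two tool messages and "thanks": the summary is gone and both tool messages are agent-visible again.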
Proposed Fix
Immediate (low risk): After applying tool-pair summarization changes in agent.rs, emit AgentEvent::HistoryReplaced(conversation.clone()). This is identical to how full compaction notifies the UI. The UI will refresh its local state and subsequent conversation_so_far payloads will be accurate.
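A minimal sketch of the immediate fix, under stated assumptions: the event channel is modelled with `std::sync::mpsc`, `HistoryReplaced` carries a plain `Vec<Message>` rather than goose's conversation type, and `summarize_and_notify`, `apply_event`, and `round_trip_with_event` are hypothetical names. The real change would live in the summarization block in agent.rs.

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical stand-ins; goose's real Message and AgentEvent types differ.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

fn message(id: &str, user_visible: bool, agent_visible: bool) -> Message {
    Message { id: id.to_string(), user_visible, agent_visible }
}

#[derive(Debug)]
enum AgentEvent {
    HistoryReplaced(Vec<Message>),
}

/// Applies tool-pair summarization, then publishes the resulting history.
fn summarize_and_notify(
    conversation: &mut Vec<Message>,
    pair_ids: &[&str],
    summary_id: &str,
    events: &Sender<AgentEvent>,
) {
    for msg in conversation.iter_mut() {
        if pair_ids.contains(&msg.id.as_str()) {
            msg.agent_visible = false;
        }
    }
    conversation.push(message(summary_id, false, true));

    // The proposed fix: tell the client that history changed.
    // A send error only means the receiver is gone, so it is ignored here.
    let _ = events.send(AgentEvent::HistoryReplaced(conversation.clone()));
}

/// Client side: replace the local copy when the server says history changed.
fn apply_event(local: &mut Vec<Message>, event: AgentEvent) {
    match event {
        AgentEvent::HistoryReplaced(history) => *local = history,
    }
}

/// Runs one summarization and returns (server history, client history).
fn round_trip_with_event() -> (Vec<Message>, Vec<Message>) {
    let (tx, rx) = channel();
    let mut server = vec![
        message("tool_request", true, true),
        message("tool_response", true, true),
    ];
    let mut client = server.clone();

    summarize_and_notify(&mut server, &["tool_request", "tool_response"], "summary", &tx);

    // Drain whatever events were emitted during the turn.
    for event in rx.try_iter() {
        apply_event(&mut client, event);
    }
    (server, client)
}
```

With the event applied, the client's copy matches the server's, so the next conversation_so_far payload would carry the hidden summary and the updated visibility flags instead of erasing them.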
Follow-up (tracked separately): The conversation_so_far ownership model is inherently fragile — the client should not be the source of truth for server-managed state. Consider a server-authoritative reply contract where the client can only append new messages, not replace existing ones.
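One possible shape for such an append-only contract is sketched below. Everything here is an assumption for illustration: `ReplyRequest`, `Session::append`, and `ReplyError` are hypothetical names and do not describe the current /reply route.

```rust
// Hypothetical types illustrating an append-only reply contract.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

impl Message {
    fn new(id: &str, user_visible: bool, agent_visible: bool) -> Self {
        Self { id: id.to_string(), user_visible, agent_visible }
    }
}

/// The client sends only the messages it wants to add, never full history.
struct ReplyRequest {
    new_messages: Vec<Message>,
}

#[derive(Debug, PartialEq)]
enum ReplyError {
    DuplicateId(String),
}

/// Server-owned history; the only mutation exposed to clients is append.
struct Session {
    conversation: Vec<Message>,
}

impl Session {
    fn append(&mut self, request: ReplyRequest) -> Result<(), ReplyError> {
        // Reject ids that already exist on the server, so a client cannot
        // overwrite a stored message by resending it with different metadata.
        for msg in &request.new_messages {
            if self.conversation.iter().any(|m| m.id == msg.id) {
                return Err(ReplyError::DuplicateId(msg.id.clone()));
            }
        }
        self.conversation.extend(request.new_messages);
        Ok(())
    }
}
```

Under this contract a stale client cannot delete a hidden summary, because nothing in the request can remove or rewrite messages the server already holds. The trade-off is that edits and deletions initiated by the user would need their own explicit operations.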
References
- Full compaction emits correctly: agent.rs ~L1034
- Recovery compaction emits correctly: agent.rs ~L1493
- Slash command compaction emits correctly: agent.rs ~L951
- Tool-pair summarization: agent.rs ~L1580–1608 (missing emission)
- AgentEvent::HistoryReplaced handling in UI: useChatStream.ts ~L294