Tool-pair summarization creates hidden messages that are silently deleted on next client round-trip #7413

## Summary

When tool-pair summarization fires, it creates server-side hidden messages (`userVisible: false, agentVisible: true`) and mutates prior message metadata without emitting a `HistoryReplaced` event. The UI never learns about these changes. On the next user turn, the UI sends `conversation_so_far` containing its stale copy of the conversation, and the server calls `replace_conversation` with it — permanently deleting the hidden summaries and potentially reverting metadata changes the server just made.

This is in contrast to full compaction (`compact_messages`), which correctly emits `HistoryReplaced` and keeps the UI in sync.

## Affected Code

- `crates/goose/src/agents/agent.rs` — tool-pair summarization block (~L1580–1608): calls `update_message_metadata` and `add_message` but never emits `AgentEvent::HistoryReplaced`
- `crates/goose-server/src/routes/reply.rs` (~L288–305): unconditionally calls `replace_conversation` with client-supplied `conversation_so_far`
- `crates/goose/src/agents/agent.rs` (~L951, ~L1034, ~L1493): the three places where `HistoryReplaced` IS emitted — tool-pair summarization is the missing fourth

## Reproduction

**Setup:** Set `GOOSE_TOOL_CALL_CUTOFF: 3` in `~/.config/goose/config.yaml` to make summarization trigger quickly.

1. Start a fresh session
2. Send a message that forces chained tool calls:
   > "List the files in /tmp, read the 3 most recently modified ones, and tell me what you find"
3. After the response, query the session DB:
   ```sql
   SELECT message_id, role, metadata_json
   FROM messages
   WHERE session_id = (SELECT id FROM sessions ORDER BY updated_at DESC LIMIT 1)
   ORDER BY created_timestamp;
   ```
   Observe: a new `userVisible: false, agentVisible: true` message (the hidden summary) and one or two messages now marked `agentVisible: false`.

4. Send any follow-up message ("thanks")
5. Query the DB again

**Expected:** hidden summary persists; `agentVisible: false` metadata preserved.

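**Actual:** the hidden summary is deleted and the server re-runs summarization from scratch, producing a new summary under a new ID. At least one of the summarized messages also reappears under a different message ID (see Evidence).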

## Evidence

Captured from a live test session:

**After turn 1:**
```
msg_6df69722  assistant  agentVisible:false   ← summarized, hidden from AI
msg_7e3e673e  user       agentVisible:false   ← summarized, hidden from AI
msg_803f5a4f  user       userVisible:false, agentVisible:true  ← hidden summary
```

**After sending "thanks":**
```
msg_011qCnLc  assistant  agentVisible:false   ← different message ID, msg_6df69722 gone
msg_7e3e673e  user       agentVisible:false   ← preserved
msg_cc2f0950  user       userVisible:false, agentVisible:true  ← brand new summary
```

`msg_803f5a4f` is completely gone. The server recreated the work it just did.
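
To make this comparison repeatable, the two snapshots can be diffed by message ID. The following is a minimal sketch, not code from the repository: `dropped_ids` and `evidence_dropped` are hypothetical helper names, and the IDs are the ones captured above.

```rust
use std::collections::HashSet;

/// Returns the IDs present in `before` but absent from `after`,
/// preserving the order in which they appear in `before`.
/// Hypothetical helper for illustration only.
fn dropped_ids<'a>(before: &[&'a str], after: &[&str]) -> Vec<&'a str> {
    let kept: HashSet<&str> = after.iter().copied().collect();
    before
        .iter()
        .copied()
        .filter(|id| !kept.contains(id))
        .collect()
}

/// Applies the diff to the two snapshots listed in this section.
fn evidence_dropped() -> Vec<&'static str> {
    let after_turn_1 = ["msg_6df69722", "msg_7e3e673e", "msg_803f5a4f"];
    let after_thanks = ["msg_011qCnLc", "msg_7e3e673e", "msg_cc2f0950"];
    dropped_ids(&after_turn_1, &after_thanks)
}
```

Running `evidence_dropped()` on the captured IDs reports both the original hidden summary and the replaced assistant message as missing.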

## Impact

- Tool-pair summarization silently undoes itself every turn
- Agent context is not compressed as intended — tool pairs the system tried to retire may re-enter the context window
- In long sessions where stale tool results (e.g. failed extension checks) were being compressed away, those results can persist in context longer than expected, contributing to stale reasoning loops
- Severity: medium. No data loss visible to the user; the system partially self-heals by re-running summarization. But the intended token economy and context hygiene of tool-pair summarization are effectively broken.

## Root Cause

`maybe_summarize_tool_pair` in `agent.rs` calls `update_message_metadata` and `session_manager.add_message` but does not emit `AgentEvent::HistoryReplaced`. The UI's local state therefore never reflects the server's metadata mutations. On the next `/reply` request, `conversation_so_far` carries the stale pre-mutation state, and `reply.rs` calls `replace_conversation` unconditionally, overwriting the server's changes.
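
The sequence can be modeled in a few lines. This is a simplified sketch under stated assumptions: `Message`, `Server`, and their fields are stand-ins invented for illustration, not the real goose types, and the model reverts every metadata change, whereas the captured session shows some metadata surviving.

```rust
// Simplified stand-in for a stored message (assumed shape, not the real type).
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

impl Message {
    fn new(id: &str, user_visible: bool, agent_visible: bool) -> Self {
        Message { id: id.to_string(), user_visible, agent_visible }
    }
}

#[derive(Default)]
struct Server {
    messages: Vec<Message>,
}

impl Server {
    // Models tool-pair summarization: hide the pair from the agent and
    // append a hidden summary. Note that nothing is reported to the client.
    fn summarize_tool_pair(&mut self, pair_ids: &[&str], summary_id: &str) {
        for m in self.messages.iter_mut() {
            if pair_ids.contains(&m.id.as_str()) {
                m.agent_visible = false;
            }
        }
        self.messages.push(Message::new(summary_id, false, true));
    }

    // Models the unconditional overwrite performed on the next reply request.
    fn replace_conversation(&mut self, conversation_so_far: Vec<Message>) {
        self.messages = conversation_so_far;
    }
}

// Walks through one stale round-trip and returns the server's final state.
fn simulate_stale_round_trip() -> Vec<Message> {
    let mut server = Server::default();
    server.messages = vec![
        Message::new("tool_request", true, true),
        Message::new("tool_response", true, true),
    ];
    // The client's copy is taken before summarization and never refreshed.
    let mut client_copy = server.messages.clone();
    server.summarize_tool_pair(&["tool_request", "tool_response"], "summary");
    // Next turn: the client appends its new message and sends everything back.
    client_copy.push(Message::new("thanks", true, true));
    server.replace_conversation(client_copy);
    server.messages
}
```

In this model the final server state contains no `summary` message and both tool messages are visible to the agent again, which matches the shape of the failure described above.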

## Proposed Fix

**Immediate (low risk):** After applying tool-pair summarization changes in `agent.rs`, emit `AgentEvent::HistoryReplaced(conversation.clone())`. This is identical to how full compaction notifies the UI. The UI will refresh its local state and subsequent `conversation_so_far` payloads will be accurate.
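
A rough sketch of the intended behavior, assuming a channel as the event transport: the real agent code may deliver events differently, and `Message`, `summarize_and_notify`, `apply_event`, and `demo_sync` are hypothetical names. Only the `AgentEvent::HistoryReplaced` variant name comes from this issue, and its payload here is a plain `Vec<Message>` rather than the real conversation type.

```rust
use std::sync::mpsc::{channel, Sender};

// Simplified stand-in for a stored message (assumed shape).
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

fn msg(id: &str, user_visible: bool, agent_visible: bool) -> Message {
    Message { id: id.to_string(), user_visible, agent_visible }
}

// Stand-in event enum; the payload type is an assumption.
#[derive(Debug)]
enum AgentEvent {
    HistoryReplaced(Vec<Message>),
}

// Applies the summarization changes, then publishes the full updated history.
fn summarize_and_notify(
    conversation: &mut Vec<Message>,
    pair_ids: &[&str],
    summary_id: &str,
    events: &Sender<AgentEvent>,
) {
    for m in conversation.iter_mut() {
        if pair_ids.contains(&m.id.as_str()) {
            m.agent_visible = false;
        }
    }
    conversation.push(msg(summary_id, false, true));
    // The step this issue reports as missing: tell the client what changed.
    // A send error only means no receiver is listening, so it is ignored here.
    let _ = events.send(AgentEvent::HistoryReplaced(conversation.clone()));
}

// Hypothetical client-side handler: adopt the server's snapshot wholesale.
fn apply_event(local: &mut Vec<Message>, event: AgentEvent) {
    match event {
        AgentEvent::HistoryReplaced(snapshot) => *local = snapshot,
    }
}

// Returns (server state, client state) after one summarization pass.
fn demo_sync() -> (Vec<Message>, Vec<Message>) {
    let (tx, rx) = channel();
    let mut server = vec![
        msg("tool_request", true, true),
        msg("tool_response", true, true),
    ];
    let mut client = server.clone();
    summarize_and_notify(&mut server, &["tool_request", "tool_response"], "summary", &tx);
    for event in rx.try_iter() {
        apply_event(&mut client, event);
    }
    (server, client)
}
```

With the event emitted, the client's copy matches the server's, so the next `conversation_so_far` payload would already carry the hidden summary and the updated visibility flags.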

**Follow-up (tracked separately):** The `conversation_so_far` ownership model is inherently fragile — the client should not be the source of truth for server-managed state. Consider a server-authoritative reply contract where the client can only append new messages, not replace existing ones.
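
One possible shape for such a contract, sketched under assumptions: the server keeps its own history and accepts from the client only messages whose IDs it has not seen. `Message` and `append_new_messages` are hypothetical, and a real design would also need to decide how client-side edits or deletions are handled.

```rust
use std::collections::HashSet;

// Simplified stand-in for a stored message (assumed shape).
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: String,
    user_visible: bool,
    agent_visible: bool,
}

// Hypothetical append-only merge. Messages the server already holds are left
// untouched, so server-side metadata and hidden summaries survive a stale
// client payload. Returns the number of messages appended.
fn append_new_messages(server: &mut Vec<Message>, from_client: &[Message]) -> usize {
    let mut known: HashSet<String> = server.iter().map(|m| m.id.clone()).collect();
    let mut appended = 0;
    for m in from_client {
        // `insert` returns true only for IDs not seen before, which also
        // guards against duplicates inside the client payload itself.
        if known.insert(m.id.clone()) {
            server.push(m.clone());
            appended += 1;
        }
    }
    appended
}
```

The trade-off is that the client can no longer correct or remove history through this path, which is the point of making the server authoritative.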

## References

- Full compaction emits correctly: `agent.rs` ~L1034
- Recovery compaction emits correctly: `agent.rs` ~L1493
- Slash command compaction emits correctly: `agent.rs` ~L951
- Tool-pair summarization: `agent.rs` ~L1580–1608 (missing emission)
- `AgentEvent::HistoryReplaced` handling in UI: `useChatStream.ts` ~L294
