research(memory): just-in-time tool result retrieval — store large outputs, inject references (Manus/Anthropic pattern) #1740

## Background

Two independent sources from 2025, Manus and Anthropic's context engineering guide, recommend the same pattern for managing large tool outputs:

1. **Manus**: Tool results are written to the filesystem and retrieved on demand via `glob`/`grep`. The LLM context contains only file paths (references), not the content itself, which is fetched "just-in-time" when needed.

2. **Anthropic**: "Use 'just-in-time' strategies where agents maintain lightweight identifiers and dynamically retrieve information via tools — mirroring human cognition's reliance on external organization systems."

Sources:
- [Context Engineering in Manus](https://rlancemartin.github.io/2025/10/15/manus/)
- [Effective Context Engineering for AI Agents (Anthropic, 2025)](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)
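
The pattern both sources describe can be sketched roughly as follows. This is a minimal Python illustration, not Manus's or Zeph's actual code: `store_tool_output`, `grep_stored_output`, the reference string format, and the one-file-per-call layout are all hypothetical names and choices made for the example.

```python
import tempfile
from pathlib import Path


def store_tool_output(store_dir: Path, call_id: str, output: str) -> str:
    """Write the full output to disk and return a short reference for the context."""
    path = store_dir / f"{call_id}.txt"  # hypothetical layout: one file per tool call
    path.write_text(output, encoding="utf-8")
    return f"[tool output stored at {path}; {len(output)} chars]"


def grep_stored_output(path: Path, needle: str) -> list[str]:
    """Fetch only the matching lines on demand instead of loading the whole file."""
    with path.open(encoding="utf-8") as fh:
        return [line.rstrip("\n") for line in fh if needle in line]


# Usage: the context keeps only `reference`; content is pulled in when needed.
store_dir = Path(tempfile.mkdtemp())
big_output = "\n".join(f"line {i}: status=ok" for i in range(10_000))
big_output += "\nline 10000: status=error"

reference = store_tool_output(store_dir, "call_001", big_output)
matches = grep_stored_output(store_dir / "call_001.txt", "status=error")
```

Here the context carries a reference of a few dozen characters in place of roughly ten thousand lines, and a later targeted lookup returns only the single line the agent cares about.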

## Problem

Zeph already offloads very large tool outputs to disk (see `tools.overflow`). However, this is a _reactive_ mechanism: it triggers only when an output exceeds `overflow.threshold` (default 50 KB).

The Manus pattern is _proactive_: as the context fills, tool results are replaced with references, and the agent re-fetches content as needed. Zeph's soft compaction partially implements this by replacing stale tool outputs with `[pruned]`, but two gaps remain:

1. The pruned content is gone from the context, and the agent has no way to re-fetch it.
2. No reference or path is stored to point the agent back to the original content.

## Proposal

When soft compaction prunes a tool output and `tool_output_file` is set (that is, an overflow path is already stored for that output), replace the content with a reference message:

```
[tool output pruned; full content at {path}]
```

This allows the agent to re-read the file when it needs the content, rather than blindly repeating the tool call.
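
To make the proposed behavior concrete, here is a minimal sketch of the pruning step in Python. It is an illustration under stated assumptions, not Zeph's implementation: the `ToolMessage` type, its `stale` flag, and the function signature are hypothetical, and only the names `prune_tool_outputs`, `tool_output_file`, and the two marker strings come from this issue.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

PRUNED_MARKER = "[pruned]"  # current soft-compaction placeholder


@dataclass
class ToolMessage:  # hypothetical message type for this sketch
    content: str
    tool_output_file: Optional[Path] = None  # overflow path, if one was stored
    stale: bool = False  # assumed flag: compaction has selected this output


def prune_tool_outputs(messages: list[ToolMessage]) -> int:
    """Prune stale tool outputs in place and return how many were pruned."""
    pruned = 0
    for msg in messages:
        if not msg.stale:
            continue
        path = msg.tool_output_file
        if path is not None and path.exists():
            # Proposed: leave a pointer the agent can follow to re-read the output.
            msg.content = f"[tool output pruned; full content at {path}]"
        else:
            # Existing behavior: no overflow file, so nothing to point at.
            msg.content = PRUNED_MARKER
        pruned += 1
    return pruned


# Usage: one output with an overflow file, one without, one still fresh.
overflow = Path(tempfile.mkdtemp()) / "call_001.txt"
overflow.write_text("full tool output", encoding="utf-8")
history = [
    ToolMessage("full tool output", overflow, stale=True),
    ToolMessage("small output", None, stale=True),
    ToolMessage("recent output", None, stale=False),
]
count = prune_tool_outputs(history)
```

Checking `path.exists()` before emitting the reference is a design choice in this sketch: if the overflow file has been cleaned up, the message falls back to plain `[pruned]` rather than pointing the agent at a missing file.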

## Applicability

- **Impact**: Medium. Reduces redundant tool calls after compaction and improves agent efficiency in long sessions.
- **Complexity**: Low. Modify `prune_tool_outputs()` to emit a reference when an overflow file exists.
- **Risk**: Low. The change is additive; agents that do not need to re-read pruned output are unaffected.

## References

- Manus: [Context Engineering in Manus](https://rlancemartin.github.io/2025/10/15/manus/)
- Anthropic: [Effective Context Engineering for AI Agents (2025)](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)

