research(memory): track trajectory elongation from hard compaction (summarization cost metric) #1739
Description
Background
JetBrains Research (Dec 2025) studied context management in production AI coding agents and found:
- LLM summarization caused agents to run 13-15% longer on average, because summaries may "smooth over" failure signals that normally trigger stopping conditions, ironically increasing total costs.
- Observation masking (replacing old tool results with placeholders) cut costs by >50% vs. unmanaged context while outperforming summarization on both task completion and total token usage.
Source: Efficient Context Management for AI Agents (JetBrains Research, Dec 2025)
Problem
Zeph has no instrumentation to measure whether hard compaction (LLM summarization) causes trajectory elongation — i.e., does the agent take more turns / make more tool calls after a hard compaction event than it would have otherwise?
Proposal
Add metrics to track per-session:
- `compaction_hard_count`: number of hard compaction events in the session
- `turns_after_hard_compaction`: tool iterations used after each hard compaction
- Log these at session end via `tracing::info!` in `agent_shutdown`
This would allow detecting the trajectory elongation pattern in real sessions (visible in .local/testing/debug/session.log).
Secondary: if elongation is confirmed, consider increasing soft compaction frequency (lower soft_compaction_threshold) to reduce reliance on hard compaction.
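The secondary tuning can be sketched as a tiered decision, assuming the thresholds from issue #1338 (soft at 70%, hard at 90% of the context budget); lowering the soft threshold makes soft compaction fire earlier, so sessions reach the hard tier less often:

```rust
// Sketch of a tiered compaction decision. Threshold values follow
// issue #1338; the function and enum names are hypothetical.
#[derive(Debug, PartialEq)]
enum CompactionAction {
    None,
    Soft,
    Hard,
}

fn compaction_action(used_tokens: usize, budget: usize, soft: f64, hard: f64) -> CompactionAction {
    let ratio = used_tokens as f64 / budget as f64;
    if ratio >= hard {
        CompactionAction::Hard
    } else if ratio >= soft {
        CompactionAction::Soft
    } else {
        CompactionAction::None
    }
}

fn main() {
    // With the #1338 defaults, 75% usage triggers soft compaction only.
    println!("{:?}", compaction_action(75_000, 100_000, 0.70, 0.90));
    // Lowering soft_compaction_threshold to 0.60 makes 65% usage
    // trigger soft compaction that the default would have skipped.
    println!("{:?}", compaction_action(65_000, 100_000, 0.60, 0.90));
}
```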
Applicability
- Impact: Medium — enables data-driven tuning of compaction thresholds
- Complexity: Low — counter fields in `AgentMetrics`, logged at shutdown
- Risk: None — metrics-only change
References
- JetBrains: https://blog.jetbrains.com/research/2025/12/efficient-context-management/
- Related: Zeph #1338 (tiered context compaction: soft at 70%, hard at 90%), #1708 (compaction loop fix: context compaction loop when budget too tight — each turn compacts, 400 on oversized context)