research(context): task-aware self-adaptive context pruning (SWE-Pruner) #1851
Description
Research Finding
Sources:
- SWE-Pruner — arXiv 2601.16746 (Jan 2026): task-aware self-adaptive context pruning, 40% API cost reduction with no accuracy regression on SWE-bench
- COMI — arXiv 2602.01719 (Feb 2026): coarse-to-fine compression via Marginal Information Gain
Problem
Zeph's compaction currently prunes by recency + token count (oldest-first). Both papers propose principled scoring functions to decide which context chunks to retain — they are complementary approaches to the same problem and can be evaluated/combined.
Algorithm A: Task-Goal Guided Pruning (SWE-Pruner)
Before each hard compaction, issue a lightweight LLM call (~50 tokens): "Given the current task goal, summarize in one sentence what context is most important to preserve."
Use the extracted goal string as a relevance signal:
- Score each tool-output block against it via cosine similarity (embedding already available) or keyword overlap
- Prune lowest-scoring blocks first, rather than oldest-first
Config: [memory.compression] strategy = "task_aware" (default remains "reactive").
Complexity: MEDIUM — one extra LLM call per compaction event; embedding scoring reuses existing infrastructure.
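A minimal sketch of Algorithm A's scoring-and-pruning step, using the keyword-overlap variant (the embedding path would swap in cosine similarity). `Block`, `keyword_overlap`, and `prune_task_aware` are illustrative names, not existing Zeph APIs:

```rust
// Sketch: score context blocks against an extracted task-goal string by
// keyword overlap, then keep highest-scoring blocks within the token budget.
use std::collections::HashSet;

#[derive(Debug, Clone)]
struct Block {
    text: String,
    tokens: usize,
}

/// Fraction of goal keywords that appear in the block (case-insensitive).
fn keyword_overlap(goal: &str, block: &str) -> f64 {
    let goal_words: HashSet<String> =
        goal.split_whitespace().map(|w| w.to_lowercase()).collect();
    if goal_words.is_empty() {
        return 0.0;
    }
    let block_words: HashSet<String> =
        block.split_whitespace().map(|w| w.to_lowercase()).collect();
    goal_words.intersection(&block_words).count() as f64 / goal_words.len() as f64
}

/// Keep highest-scoring blocks until the token budget is exhausted
/// (lowest-scoring blocks are pruned first, instead of oldest-first).
fn prune_task_aware(mut blocks: Vec<Block>, goal: &str, budget: usize) -> Vec<Block> {
    blocks.sort_by(|a, b| {
        keyword_overlap(goal, &b.text)
            .partial_cmp(&keyword_overlap(goal, &a.text))
            .unwrap()
    });
    let mut kept = Vec::new();
    let mut used = 0;
    for b in blocks {
        if used + b.tokens <= budget {
            used += b.tokens;
            kept.push(b);
        }
    }
    kept
}

fn main() {
    let goal = "fix the failing parser test";
    let blocks = vec![
        Block { text: "cargo test output: parser test failed".into(), tokens: 40 },
        Block { text: "ls -la listing of unrelated directory".into(), tokens: 40 },
    ];
    let kept = prune_task_aware(blocks, goal, 40);
    assert_eq!(kept.len(), 1);
    assert!(kept[0].text.contains("parser"));
}
```

In practice the score would come from the Qdrant embeddings already computed per turn; the overlap heuristic is only a zero-dependency fallback.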
Algorithm B: Marginal Information Gain (COMI)
Two-stage scoring replacing oldest-first:
MIG = relevance(unit, query) − redundancy(unit, already-selected-units)
- Coarse: partition context into groups (by type: system/user/assistant/tool, or temporal window) → compute inter-group MIG → allocate compaction budget proportionally (low-relevance groups compressed more)
- Fine: within each group, greedily select units by per-unit MIG until budget exhausted
Particularly effective when recent messages are redundant (repeated tool calls).
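The fine stage above can be sketched as greedy selection by per-unit MIG. Token-set Jaccard similarity stands in for embedding similarity here; `select_by_mig` and the unit representation are illustrative, not Zeph APIs:

```rust
// Sketch of COMI's fine stage: greedily select units maximizing
// MIG = relevance(unit, query) − redundancy(unit, already-selected-units).
use std::collections::HashSet;

fn token_set(s: &str) -> HashSet<String> {
    s.split_whitespace().map(|w| w.to_lowercase()).collect()
}

fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let inter = a.intersection(b).count() as f64;
    let union = a.union(b).count() as f64;
    if union == 0.0 { 0.0 } else { inter / union }
}

/// Greedily pick up to `budget` unit indices by marginal information gain.
fn select_by_mig(units: &[&str], query: &str, budget: usize) -> Vec<usize> {
    let q = token_set(query);
    let sets: Vec<_> = units.iter().map(|u| token_set(u)).collect();
    let mut selected: Vec<usize> = Vec::new();
    while selected.len() < budget && selected.len() < units.len() {
        let (best, _) = sets
            .iter()
            .enumerate()
            .filter(|(i, _)| !selected.contains(i))
            .map(|(i, s)| {
                let relevance = jaccard(s, &q);
                // Redundancy: max similarity to anything already selected,
                // so an exact duplicate of a kept unit scores poorly.
                let redundancy = selected
                    .iter()
                    .map(|&j| jaccard(s, &sets[j]))
                    .fold(0.0, f64::max);
                (i, relevance - redundancy)
            })
            .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
            .unwrap();
        selected.push(best);
    }
    selected
}

fn main() {
    let units = [
        "cargo test failed in parser module", // relevant
        "cargo test failed in parser module", // exact duplicate: redundant
        "git diff shows a one-line change",   // distinct, non-redundant
    ];
    let picked = select_by_mig(&units, "parser test failure", 2);
    // One copy of the duplicate is kept; the second pick avoids the other copy.
    assert!(picked.contains(&0) || picked.contains(&1));
    assert!(picked.contains(&2));
}
```

This is exactly the repeated-tool-call case: the duplicate unit's redundancy term cancels its relevance, so the greedy pass skips it.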
Synergy with existing work:
- Complements research(routing): KVzip importance scoring as principled heuristic for compaction pruning #1824 (KVzip importance scoring) — KVzip at attention-key level, COMI at token-group level
- Complements Algorithm A — COMI provides structural redundancy signal, SWE-Pruner provides task-goal relevance signal; they can be combined (MIG = task_relevance − redundancy)
Cost: requires embedding calls for relevance scoring; can reuse Qdrant embeddings already computed per turn.
Config: [memory.compression] strategy = "mig".
Complexity: MEDIUM-HIGH — group partitioning + MIG scoring + budget allocation loop.
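The coarse stage's budget allocation loop reduces to splitting the total compaction budget in proportion to per-group scores, so low-relevance groups are compressed more aggressively. A minimal sketch (`allocate_budget` is an illustrative name; rounding remainder is not redistributed here):

```rust
/// Allocate `total_budget` tokens across groups proportionally to their
/// relevance scores. Degenerate all-zero scores fall back to an even split.
fn allocate_budget(group_scores: &[f64], total_budget: usize) -> Vec<usize> {
    let sum: f64 = group_scores.iter().sum();
    if sum == 0.0 {
        let n = group_scores.len().max(1);
        return vec![total_budget / n; group_scores.len()];
    }
    group_scores
        .iter()
        .map(|s| ((s / sum) * total_budget as f64).round() as usize)
        .collect()
}

fn main() {
    // Groups: system, user, assistant, tool — tool output scored most relevant.
    let budgets = allocate_budget(&[0.1, 0.2, 0.2, 0.5], 1000);
    assert_eq!(budgets, vec![100, 200, 200, 500]);
}
```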
Implementation Plan
- Define CompactionStrategy enum: Reactive | TaskAware | Mig | TaskAwareMig (combined)
- Start with Algorithm A (simpler, single LLM call) — validate quality improvement vs baseline
- Add Algorithm B as an optional upgrade, or combine both: strategy = "task_aware_mig"
- Emit the pruning goal and scores at DEBUG level for observability
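The strategy values proposed above, consolidated as a config sketch (assuming the existing [memory.compression] table; only "reactive" is currently implemented):

```toml
[memory.compression]
strategy = "reactive"          # current default: recency + token count
# strategy = "task_aware"      # Algorithm A only (SWE-Pruner)
# strategy = "mig"             # Algorithm B only (COMI)
# strategy = "task_aware_mig"  # combined: MIG = task_relevance - redundancy
```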
Integration Points
- zeph-core: ContextManager::apply_soft_compaction(), HardCompactor
- zeph-memory: SemanticMemory::compress()
- Config: [memory.compression] strategy
- Debug dump: include pruning goal, per-block scores, and strategy used
See Also
- research: structured anchored summarization for context compression #1607 (structured anchored summarization — output quality after pruning)
- research(routing): KVzip importance scoring as principled heuristic for compaction pruning #1824 (KVzip importance scoring — complementary importance signal)
- research(context): EDU-based structured context compression (LingoEDU) #1863 (LingoEDU EDU decomposition — pre-processing step before scoring)
- research(context): active context compression via agent-controlled focus primitives (Focus Agent) #1850 (Focus Agent — agent-controlled proactive compression)
References
- arXiv 2601.16746 (SWE-Pruner)
- arXiv 2602.01719 (COMI — Jiwei Tang et al., Tsinghua/Alibaba, Feb 2026)