feat(llm): Claude server-side compaction API (compact-2026-01-12 beta) #1696
Merged
Conversation
- Add `ContextManagement` request field and `compact-2026-01-12` beta header to all Claude request bodies when `server_compaction = true`
- Make SSE parser stateful with `ClaudeSseState` (via async-stream) to accumulate multi-event `compaction` content blocks
- Add `StreamChunk::Compaction` and `MessagePart::Compaction` variants for round-trip fidelity; `take_compaction_summary()` added to `LlmProvider` trait
- C2: agent loop (native + legacy streaming) prunes old messages and inserts a synthetic `MessagePart::Compaction` assistant turn on a compaction event
- S1: `maybe_compact` and `maybe_proactive_compress` early-return when `server_compaction_active` to avoid duplicate compaction
- Add `server_compaction_events` metric, `--server-compaction` CLI flag, `--init` wizard prompt, and `# server_compaction = false` in default.toml
- Extend `debug_dump`, `token_counter`, and `MetricsSnapshot` for the new variant

Closes #1626
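The C2 item above (prune old messages, then insert a synthetic compaction turn) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Role`, `Message`, and `MessagePart` definitions are simplified stand-ins, and the `apply_server_compaction` function and its `keep_recent` parameter are hypothetical names introduced here.

```rust
// Simplified, hypothetical stand-ins for the message types named in the PR.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum MessagePart {
    Text(String),
    Compaction { summary: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    parts: Vec<MessagePart>,
}

/// Drop everything except the last `keep_recent` messages and put one
/// synthetic assistant turn carrying the server summary at the front.
/// `keep_recent` is an assumption of this sketch, not a documented setting.
fn apply_server_compaction(history: &mut Vec<Message>, summary: String, keep_recent: usize) {
    // saturating_sub keeps the cut index at 0 when the history is short.
    let cut = history.len().saturating_sub(keep_recent);
    history.drain(..cut);
    history.insert(
        0,
        Message {
            role: Role::Assistant,
            parts: vec![MessagePart::Compaction { summary }],
        },
    );
}
```

Keeping the summary as its own `MessagePart` variant, rather than plain text, is what would let it be serialized back to the API unchanged, which matches the round-trip fidelity goal stated above.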
…eview
- CRIT-1: add 95% safety fallback in `maybe_compact`/`maybe_proactive_compress`; client-side compaction is no longer unconditionally skipped when server compaction is active and fires only when tokens exceed 95% of budget
- SEC-COMPACT-01: sanitize compaction summary via `ContentSanitizer` (`McpResponse`/`ExternalUntrusted`) before inserting into context in both native and legacy tool execution paths
- SEC-COMPACT-02: cap SSE compaction accumulation buffer at 32 KiB; excess bytes discarded with a warning log
- IMP-3: wire `server_compaction` through ACP path: add `server_compaction` field to `SharedAgentDeps`, initialize from config, pass to agent builder
- IMP-4: TUI integration: `[SC: N]` status bar indicator when `server_compaction_events > 0`; `Compacting context (server-side)...` spinner in native and legacy paths; `/compaction:status` command palette entry with `ServerCompactionStatus` `TuiCommand`
- IMP-5 (already fixed in prior commit): `context_window * 80 / 100` integer arithmetic preserved
- Add ~25 unit tests across sse.rs, claude.rs, provider.rs, config/tests.rs covering compaction sequence, 32 KiB cap, serialization, beta header, `take_compaction_summary`, and TOML config parsing
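The two guards described in CRIT-1 and SEC-COMPACT-02 might look something like the sketch below. The 95% threshold and the 32 KiB cap come from the commit message; everything else (the `CompactionBuffer` type, `exceeds_safety_threshold`, returning the dropped byte count instead of logging) is an assumption made for illustration.

```rust
/// Cap taken from the commit message: 32 KiB.
const MAX_COMPACTION_BYTES: usize = 32 * 1024;

/// Hypothetical accumulator for compaction text that arrives across
/// several SSE events.
#[derive(Default)]
struct CompactionBuffer {
    text: String,
    truncated: bool,
}

impl CompactionBuffer {
    /// Append a delta, discarding anything past the cap.
    /// Returns the number of bytes dropped; the real code is described
    /// as emitting a warning log instead.
    fn push(&mut self, delta: &str) -> usize {
        let room = MAX_COMPACTION_BYTES.saturating_sub(self.text.len());
        if delta.len() <= room {
            self.text.push_str(delta);
            return 0;
        }
        // Back off to a UTF-8 character boundary so the slice stays valid.
        let mut end = room;
        while !delta.is_char_boundary(end) {
            end -= 1;
        }
        self.text.push_str(&delta[..end]);
        self.truncated = true;
        delta.len() - end
    }
}

/// Client-side fallback gate: true only when usage is above 95% of the
/// budget. Integer arithmetic avoids floating-point rounding, in the same
/// spirit as the `context_window * 80 / 100` expression in IMP-5.
fn exceeds_safety_threshold(used_tokens: u64, budget_tokens: u64) -> bool {
    used_tokens.saturating_mul(100) > budget_tokens.saturating_mul(95)
}
```

Under this reading, server-side compaction handles the normal case and the client-side path only fires as a last resort, for example if the server never emits a compaction event before the budget is nearly exhausted.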
This was referenced Mar 13, 2026
test(llm): add missing compaction unit tests (parse_tool_response + split_messages round-trip) #1697 (Closed)
…ture Merges origin/main (Claude 1M context, #1649) into the server-side compaction branch. Both features touch CloudLlmConfig, beta_header(), and the init wizard; conflicts were resolved by keeping both sets of changes.
4 tasks
bug-ops added a commit that referenced this pull request Mar 13, 2026
…on-w Resolve conflict in crates/zeph-llm/src/claude.rs after PR #1696 (Claude server-side compaction) was merged into main. Our graceful-degradation additions (server_compaction_rejected Arc<AtomicBool>, is_compact_beta_rejection, is_server_compaction_rejected, detection in all request paths, SEC-COMPACT-03 tests) are preserved on top of main's state.
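A shared rejection flag of the kind mentioned above (`server_compaction_rejected` as an `Arc<AtomicBool>`) could be structured roughly as below. This is a sketch under assumptions: the `CompactionGate` type and the `mark_rejected` and `should_send_context_management` methods are hypothetical, and the criteria `is_compact_beta_rejection` uses to detect a rejection are not shown in this conversation, so they are not reproduced here.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper around the shared rejection flag.
#[derive(Clone)]
struct CompactionGate {
    /// Mirrors the `server_compaction` config setting.
    enabled: bool,
    /// Set once the API rejects the compaction beta; shared across clones.
    rejected: Arc<AtomicBool>,
}

impl CompactionGate {
    fn new(enabled: bool) -> Self {
        Self {
            enabled,
            rejected: Arc::new(AtomicBool::new(false)),
        }
    }

    /// Called from a request path after it decides the beta was rejected.
    fn mark_rejected(&self) {
        self.rejected.store(true, Ordering::Relaxed);
    }

    fn is_server_compaction_rejected(&self) -> bool {
        self.rejected.load(Ordering::Relaxed)
    }

    /// Only attach the beta header and request field while the feature is
    /// enabled and has not been rejected.
    fn should_send_context_management(&self) -> bool {
        self.enabled && !self.is_server_compaction_rejected()
    }
}
```

Because clones share the same `Arc`, a rejection observed on one request path is visible to all others, so later requests can fall back to client-side compaction without retrying the beta.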
Closes #1626
Summary
Integrates Anthropic's server-side Compaction API (`compact-2026-01-12` beta) into Zeph's Claude provider. When enabled, the API automatically summarizes conversation history as the token count approaches a threshold, eliminating the need for client-side summarization LLM calls for Claude sessions.
- `context_management` field added to all Claude request bodies with configurable `trigger_tokens`
- `compaction` content blocks → `StreamChunk::Compaction`
- `compact_context()` always runs client-side regardless of `server_compaction_active`
- `server_compaction` wired through the ACP path via `SharedAgentDeps`
- `[SC: N]` status bar indicator, `Compacting context (server-side)...` spinner, `/compaction:status` command
- `[llm.cloud] server_compaction = false` (default off, beta API)
- `--server-compaction` flag; `--init` wizard prompt

Test plan
- `cargo +nightly fmt --check`: clean
- `cargo clippy --workspace --features full -- -D warnings`: 0 warnings
- `cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins`: 5297 passed
- `cargo run --features full -- --server-compaction --config .local/config/testing.toml` and a long conversation to verify that the compaction event is received and context is pruned