
feat(llm): Claude server-side compaction API (compact-2026-01-12 beta)#1696

Merged
bug-ops merged 5 commits into main from research-llm-claude-server-side on Mar 13, 2026

bug-ops commented on Mar 13, 2026

Closes #1626

Summary

Integrates Anthropic's server-side Compaction API (compact-2026-01-12 beta) into Zeph's Claude provider. When enabled, the API summarizes conversation history automatically once the token count approaches the configured threshold, so Claude sessions normally no longer need client-side summarization LLM calls.

  • context_management field added to all Claude request bodies with configurable trigger_tokens
  • Stateful SSE parser accumulates multi-event compaction content blocks → StreamChunk::Compaction
  • Agent loop prunes old messages and inserts sanitized compaction summary on compaction event
  • Safety fallback: client-side compaction still runs at 95% token threshold if server compaction does not fire
  • Emergency path: compact_context() always runs client-side regardless of server_compaction_active
  • ACP/daemon paths wired through SharedAgentDeps
  • TUI: [SC: N] status bar indicator, Compacting context (server-side)... spinner, /compaction:status command
  • Config: [llm.cloud] server_compaction = false (default off — beta API)
  • CLI: --server-compaction flag; --init wizard prompt
  • 25 new unit tests covering SSE state machine, request/response serialization, config parsing, agent loop
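
The interaction between the safety fallback and the emergency path can be sketched as a single decision function. This is a minimal illustration, not the code from this PR: `CompactionPath`, `should_compact_client_side`, and the `routine_threshold_pct` parameter are hypothetical names, and only the 95% figure and the "emergency always runs" rule come from the description above.

```rust
/// Which client-side compaction path is asking to run (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactionPath {
    /// Routine check, in the spirit of `maybe_compact`.
    Routine,
    /// Emergency path, in the spirit of `compact_context()`.
    Emergency,
}

/// Hypothetical helper: decide whether client-side compaction should run.
///
/// `routine_threshold_pct` is the threshold used when server compaction is
/// off; its value is an assumption of this sketch.
pub fn should_compact_client_side(
    path: CompactionPath,
    server_compaction_active: bool,
    used_tokens: u64,
    token_budget: u64,
    routine_threshold_pct: u64,
) -> bool {
    match path {
        // The emergency path ignores `server_compaction_active` entirely.
        CompactionPath::Emergency => true,
        CompactionPath::Routine => {
            // With server compaction active, only the 95% safety net applies.
            let pct = if server_compaction_active {
                95
            } else {
                routine_threshold_pct
            };
            // Integer arithmetic only; saturating to avoid overflow.
            used_tokens > token_budget.saturating_mul(pct) / 100
        }
    }
}
```

With a 200,000-token budget and server compaction active, the routine path in this sketch stays quiet at exactly 190,000 tokens and fires at 190,001.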

Test plan

  • cargo +nightly fmt --check — clean
  • cargo clippy --workspace --features full -- -D warnings — 0 warnings
  • cargo nextest run --config-file .github/nextest.toml --workspace --features full --lib --bins — 5297 passed
  • Test with cargo run --features full -- --server-compaction --config .local/config/testing.toml and a long conversation to verify compaction event is received and context is pruned

bug-ops added 2 commits March 13, 2026 21:58
- Add `ContextManagement` request field and `compact-2026-01-12` beta header
  to all Claude request bodies when `server_compaction = true`
- Make SSE parser stateful with `ClaudeSseState` (via async-stream) to
  accumulate multi-event `compaction` content blocks
- Add `StreamChunk::Compaction` and `MessagePart::Compaction` variants for
  round-trip fidelity; `take_compaction_summary()` added to `LlmProvider` trait
- C2: agent loop (native + legacy streaming) prunes old messages and inserts
  synthetic `MessagePart::Compaction` assistant turn on compaction event
- S1: `maybe_compact` and `maybe_proactive_compress` early-return when
  `server_compaction_active` to avoid duplicate compaction
- Add `server_compaction_events` metric, `--server-compaction` CLI flag,
  `--init` wizard prompt, and `# server_compaction = false` in default.toml
- Extend `debug_dump`, `token_counter`, and `MetricsSnapshot` for new variant
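
As a rough illustration of the pruning step described in C2, the sketch below replaces the older part of the history with one synthetic assistant turn that carries the summary. The types are simplified stand-ins, and `apply_server_compaction`, `text_message`, and the `keep_recent` parameter are hypothetical; the commit message does not say how many recent messages are retained.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// Simplified stand-in for the real message part type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MessagePart {
    Text(String),
    Compaction(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Message {
    pub role: Role,
    pub parts: Vec<MessagePart>,
}

/// Convenience constructor for a plain text message (hypothetical).
pub fn text_message(role: Role, text: &str) -> Message {
    Message {
        role,
        parts: vec![MessagePart::Text(text.to_string())],
    }
}

/// Prune everything between the leading system messages and the last
/// `keep_recent` messages, then insert a synthetic assistant turn holding
/// the compaction summary. Returns the number of pruned messages.
pub fn apply_server_compaction(
    history: &mut Vec<Message>,
    summary: String,
    keep_recent: usize,
) -> usize {
    // Leading system messages are preserved (an assumption of this sketch).
    let head = history
        .iter()
        .take_while(|m| m.role == Role::System)
        .count();
    let tail_start = history.len().saturating_sub(keep_recent).max(head);
    let pruned = tail_start - head;

    history.drain(head..tail_start);
    history.insert(
        head,
        Message {
            role: Role::Assistant,
            parts: vec![MessagePart::Compaction(summary)],
        },
    );
    pruned
}
```

Keeping the summary as its own `MessagePart::Compaction` variant, rather than plain text, is what lets it round-trip back to the API on later requests.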

Closes #1626
…eview

- CRIT-1: add 95% safety fallback in maybe_compact/maybe_proactive_compress;
  client-side compaction no longer unconditionally skipped when server
  compaction is active — fires only when tokens exceed 95% of budget
- SEC-COMPACT-01: sanitize compaction summary via ContentSanitizer
  (McpResponse/ExternalUntrusted) before inserting into context in
  both native and legacy tool execution paths
- SEC-COMPACT-02: cap SSE compaction accumulation buffer at 32 KiB;
  excess bytes discarded with a warning log
- IMP-3: wire server_compaction through ACP path — add server_compaction
  field to SharedAgentDeps, initialize from config, pass to agent builder
- IMP-4: TUI integration — [SC: N] status bar indicator when
  server_compaction_events > 0; Compacting context (server-side)...
  spinner in native and legacy paths; /compaction:status command palette
  entry with ServerCompactionStatus TuiCommand
- IMP-5 (already fixed in prior commit): context_window * 80 / 100
  integer arithmetic preserved
- Add ~25 unit tests across sse.rs, claude.rs, provider.rs, config/tests.rs
  covering compaction sequence, 32 KiB cap, serialization, beta header,
  take_compaction_summary, and TOML config parsing
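
The stateful accumulation and the 32 KiB cap from SEC-COMPACT-02 might look roughly like the sketch below. The `SseEvent` shape is a simplified, hypothetical stand-in for parsed stream events, and the real `ClaudeSseState` in sse.rs (built on async-stream) is certainly richer; only the names `ClaudeSseState` and `StreamChunk::Compaction` and the cap come from the commit message.

```rust
/// Cap taken from SEC-COMPACT-02 in the commit message.
pub const MAX_COMPACTION_BYTES: usize = 32 * 1024;

/// Simplified, hypothetical stand-ins for parsed SSE events.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SseEvent {
    BlockStart { kind: String },
    BlockDelta { text: String },
    BlockStop,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StreamChunk {
    Text(String),
    Compaction(String),
}

#[derive(Debug, Default)]
pub struct ClaudeSseState {
    /// `Some` while the parser is inside a compaction content block.
    compaction_buf: Option<String>,
    /// Set once excess bytes have been discarded for the current block.
    truncated: bool,
}

impl ClaudeSseState {
    /// Feed one event; returns a chunk when one is ready to emit.
    pub fn on_event(&mut self, event: SseEvent) -> Option<StreamChunk> {
        match event {
            SseEvent::BlockStart { kind } => {
                self.truncated = false;
                self.compaction_buf = (kind == "compaction").then(String::new);
                None
            }
            SseEvent::BlockDelta { text } => match self.compaction_buf.as_mut() {
                Some(buf) => {
                    let room = MAX_COMPACTION_BYTES.saturating_sub(buf.len());
                    // Cut on a char boundary so the buffer stays valid UTF-8.
                    let mut take = text.len().min(room);
                    while !text.is_char_boundary(take) {
                        take -= 1;
                    }
                    buf.push_str(&text[..take]);
                    if take < text.len() && !self.truncated {
                        self.truncated = true;
                        eprintln!(
                            "compaction summary exceeded {} bytes; discarding excess",
                            MAX_COMPACTION_BYTES
                        );
                    }
                    None
                }
                // Outside a compaction block, deltas pass straight through.
                None => Some(StreamChunk::Text(text)),
            },
            // Emits the accumulated summary only if a compaction block was open.
            SseEvent::BlockStop => self.compaction_buf.take().map(StreamChunk::Compaction),
        }
    }
}
```

Emitting the summary only on block stop means the agent loop sees one complete `StreamChunk::Compaction` rather than fragments, which keeps the pruning step atomic.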
github-actions bot added the enhancement, documentation, llm, memory, rust, core, dependencies, config, and size/XL labels and removed the enhancement label on Mar 13, 2026
…ture

Merges origin/main (Claude 1M context, #1649) into the server-side
compaction branch. Both features add fields to CloudLlmConfig,
beta_header(), and init wizard — resolved by keeping both.
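
One way "keeping both" could work in `beta_header()` is to collect every enabled beta into a single comma-separated header value. The sketch below is an assumption about the shape of the merge, not the merged code: the `enable_extended_context` field name is hypothetical, and `CONTEXT_1M_BETA` is a placeholder because this PR does not state the identifier used by #1649.

```rust
/// Beta identifier named in this PR.
pub const COMPACT_BETA: &str = "compact-2026-01-12";
/// Placeholder only: the real 1M-context beta identifier is not given here.
pub const CONTEXT_1M_BETA: &str = "context-1m-PLACEHOLDER";

/// Reduced stand-in for the real config struct; field names are assumptions.
#[derive(Debug, Default, Clone)]
pub struct CloudLlmConfig {
    pub server_compaction: bool,
    pub enable_extended_context: bool,
}

/// Build the value of the beta header, or `None` when no beta is enabled.
pub fn beta_header(cfg: &CloudLlmConfig) -> Option<String> {
    let mut betas = Vec::new();
    if cfg.enable_extended_context {
        betas.push(CONTEXT_1M_BETA);
    }
    if cfg.server_compaction {
        betas.push(COMPACT_BETA);
    }
    if betas.is_empty() {
        None
    } else {
        // Multiple betas are sent as one comma-separated value.
        Some(betas.join(","))
    }
}
```

Returning `None` when nothing is enabled lets the caller omit the header entirely, so default configurations send the same requests as before.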
github-actions bot added the enhancement (New feature or request) label on Mar 13, 2026
bug-ops merged commit 0fc6c75 into main on Mar 13, 2026
15 checks passed
bug-ops deleted the research-llm-claude-server-side branch on March 13, 2026 22:13
bug-ops added a commit that referenced this pull request Mar 13, 2026
…on-w

Resolve conflict in crates/zeph-llm/src/claude.rs after PR #1696
(Claude server-side compaction) was merged into main.

Our graceful-degradation additions (server_compaction_rejected Arc<AtomicBool>,
is_compact_beta_rejection, is_server_compaction_rejected, detection in all
request paths, SEC-COMPACT-03 tests) are preserved on top of main's state.
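
The graceful-degradation pieces named above could fit together roughly as follows. This is a sketch under stated assumptions: the detection heuristic inside `is_compact_beta_rejection` (HTTP 400 plus a body that mentions the beta or `context_management`) is a guess, and `ClaudeProvider`, `should_send_compaction`, and `observe_response` are hypothetical names; only `server_compaction_rejected`, `is_compact_beta_rejection`, and `is_server_compaction_rejected` appear in the commit message.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

pub const COMPACT_BETA: &str = "compact-2026-01-12";

/// Assumed heuristic: a 400 response whose body names the beta or the
/// `context_management` field is treated as a rejection of the feature.
pub fn is_compact_beta_rejection(status: u16, body: &str) -> bool {
    status == 400 && (body.contains(COMPACT_BETA) || body.contains("context_management"))
}

/// Reduced stand-in for the provider; clones share the rejection flag.
#[derive(Debug, Clone, Default)]
pub struct ClaudeProvider {
    server_compaction: bool,
    server_compaction_rejected: Arc<AtomicBool>,
}

impl ClaudeProvider {
    pub fn new(server_compaction: bool) -> Self {
        Self {
            server_compaction,
            server_compaction_rejected: Arc::new(AtomicBool::new(false)),
        }
    }

    pub fn is_server_compaction_rejected(&self) -> bool {
        self.server_compaction_rejected.load(Ordering::Relaxed)
    }

    /// Whether the next request should still carry the compaction fields.
    pub fn should_send_compaction(&self) -> bool {
        self.server_compaction && !self.is_server_compaction_rejected()
    }

    /// Call from every request path after a response arrives.
    pub fn observe_response(&self, status: u16, body: &str) {
        if self.server_compaction && is_compact_beta_rejection(status, body) {
            self.server_compaction_rejected.store(true, Ordering::Relaxed);
        }
    }
}
```

Sharing the flag through an `Arc<AtomicBool>` means a rejection seen on one cloned provider handle disables the feature for all of them without locking.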

Labels

  • config: Configuration file changes
  • core: zeph-core crate
  • dependencies: Dependency updates
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • llm: zeph-llm crate (Ollama, Claude)
  • memory: zeph-memory crate (SQLite)
  • rust: Rust code changes
  • size/XL: Extra large PR (500+ lines)

Development

Successfully merging this pull request may close these issues.

research(llm): Claude server-side Compaction API (compact-2026-01-12 beta)

1 participant