fix(context-compression): extract_task_goal blocks Soft tier for 5s on every compaction — task_aware/mig strategies effectively broken for cloud LLMs #1909
Closed
Labels
bug: Something isn't working
llm: zeph-llm crate (Ollama, Claude)
Description
Summary
When pruning_strategy = "task_aware" or "mig" is configured, maybe_refresh_task_goal() is called synchronously during every Soft tier compaction event. It calls extract_task_goal(), which makes a blocking LLM call with a hardcoded 5-second timeout.
Observed behavior (Session 113, 2026-03-16, v0.15.2)
Config: context_budget_tokens=12000, soft_compaction_threshold=0.55, pruning_strategy="task_aware", provider=openai/gpt-5-mini.
Log evidence:
extract_task_goal: timed out ← fires on EVERY Soft tier event (2/2 in test)
soft compaction complete cached_tokens=6755 > soft_threshold=6600 (still above threshold)
- Each timeout adds exactly 5s of user-visible latency
- With goal=None, falls back to prune_tool_outputs_oldest_first(); task_aware scoring never runs
- Result: pruning_strategy = "task_aware" is functionally identical to "reactive" when using cloud APIs
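The fallback described above can be sketched as a small dispatch function. This is only an illustration of the reported behavior, not the actual zeph code: `Pruner` and `select_pruner` are hypothetical names, and the sketch assumes that the "reactive" strategy corresponds to oldest-first pruning.

```rust
/// Hypothetical enum naming the two pruning paths discussed in this issue.
#[derive(Debug, PartialEq)]
enum Pruner {
    /// Goal-aware scoring (task_aware / mig / task_aware_mig).
    TaskAwareScoring,
    /// Stand-in for prune_tool_outputs_oldest_first().
    OldestFirst,
}

/// Assumed selection logic: goal-aware scoring needs a goal, so a missing
/// goal (for example after a timed-out extraction) degrades to oldest-first.
fn select_pruner(strategy: &str, goal: Option<&str>) -> Pruner {
    match (strategy, goal) {
        ("task_aware" | "mig" | "task_aware_mig", Some(_)) => Pruner::TaskAwareScoring,
        _ => Pruner::OldestFirst,
    }
}
```

Under these assumptions, `select_pruner("task_aware", None)` and `select_pruner("reactive", None)` pick the same path, which is why a goal extraction that always times out makes the two strategies indistinguishable.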
Root cause
extract_task_goal() at summarization.rs:1954:
- Hardcoded 5s timeout (std::time::Duration::from_secs(5))
- Calls summary_or_primary_provider().chat(), the same cloud LLM used for main inference
- Typical OpenAI gpt-5-mini latency exceeds 5s during concurrent inference sessions
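To illustrate why the caller pays the full timeout, here is a minimal sketch of a blocking call bounded by a timeout, written with std threads rather than the project's async runtime. `extract_with_timeout` and the `call_llm` closure are hypothetical stand-ins for extract_task_goal() and the provider call; the real implementation may differ.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the blocking extraction: runs `call_llm` on a
/// worker thread and waits at most `timeout` for its answer.
fn extract_with_timeout<F>(call_llm: F, timeout: Duration) -> Option<String>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the caller already gave up.
        let _ = tx.send(call_llm());
    });
    // The caller is blocked here until a reply arrives or the timeout expires.
    rx.recv_timeout(timeout).ok()
}
```

When the simulated provider is slower than the timeout, the caller waits the whole timeout and still gets `None`, which matches the "timed out" log line followed by the oldest-first fallback.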
Impact
- User-visible: 5s latency spike on every Soft tier compaction event
- Functional: task_aware / mig / task_aware_mig strategies are non-functional for cloud providers
- Only fast local models (Ollama), where latency stays under 5s, would work
Proposed fix
Make goal extraction fire-and-forget: spawn a background task to update current_task_goal between turns and use the previous turn's goal immediately (no blocking wait). This eliminates latency spikes entirely.
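A minimal sketch of the fire-and-forget idea, assuming a shared slot that holds the last extracted goal. `GoalCache`, `current_goal`, and `refresh_in_background` are hypothetical names, and std threads are used here only to keep the example self-contained; the actual fix would presumably spawn a task on the project's async runtime.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical holder for the most recently extracted task goal.
#[derive(Clone, Default)]
struct GoalCache {
    current: Arc<Mutex<Option<String>>>,
}

impl GoalCache {
    /// Returns the cached goal immediately, without waiting on any LLM call.
    fn current_goal(&self) -> Option<String> {
        self.current.lock().unwrap().clone()
    }

    /// Starts a background refresh. The caller does not wait for the result;
    /// the handle is returned only so tests or shutdown code can join it.
    fn refresh_in_background<F>(&self, extract: F) -> thread::JoinHandle<()>
    where
        F: FnOnce() -> Option<String> + Send + 'static,
    {
        let slot = Arc::clone(&self.current);
        thread::spawn(move || {
            // Keep the previous goal if extraction fails or times out.
            if let Some(goal) = extract() {
                *slot.lock().unwrap() = Some(goal);
            }
        })
    }
}
```

With this shape, Soft tier compaction reads `current_goal()` and proceeds with the previous turn's goal, while a slow extraction only delays when the next goal becomes visible. One trade-off is that the very first compaction still sees no goal and would use the fallback path.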