fix(context-compression): extract_task_goal blocks Soft tier for 5s on every compaction — task_aware/mig strategies effectively broken for cloud LLMs #1909

@bug-ops

Summary

When pruning_strategy = "task_aware" or "mig" is configured, maybe_refresh_task_goal() is called synchronously during every Soft tier compaction event. It calls extract_task_goal() which makes a blocking LLM call with a hardcoded 5-second timeout.

Observed behavior (Session 113, 2026-03-16, v0.15.2)

Config: context_budget_tokens=12000, soft_compaction_threshold=0.55, pruning_strategy="task_aware", provider=openai/gpt-5-mini.
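For reference, the Soft tier threshold implied by this config is 12000 × 0.55 = 6600 tokens. The sketch below reproduces that arithmetic; the helper names and the rounding are assumptions for illustration, not the crate's actual code.

```rust
/// Soft-tier threshold in tokens for a given budget and ratio.
/// Hypothetical helper: rounding to the nearest token is an assumption.
fn soft_threshold_tokens(context_budget_tokens: usize, soft_compaction_threshold: f64) -> usize {
    (context_budget_tokens as f64 * soft_compaction_threshold).round() as usize
}

/// True when the context is still over the Soft threshold after compaction.
fn still_above_soft_threshold(cached_tokens: usize, threshold: usize) -> bool {
    cached_tokens > threshold
}
```

With the values from this session, `soft_threshold_tokens(12_000, 0.55)` gives 6600, and a context of 6755 cached tokens is still above it.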

Log evidence:

extract_task_goal: timed out  ← fires on EVERY Soft tier event (2/2 in test)
soft compaction complete cached_tokens=6755 > soft_threshold=6600 (still above threshold)
  • Each timeout adds exactly 5s of user-visible latency
  • With goal=None, compaction falls back to prune_tool_outputs_oldest_first(), so task_aware scoring never runs
  • Result: pruning_strategy = "task_aware" is functionally identical to "reactive" when using cloud APIs
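The fallback described above can be sketched as a small selection function. This is a simplified model, not the actual dispatch code: `PruneMode` and `select_prune_mode` are hypothetical names, and only the strategy strings and the goal=None fallback come from this report.

```rust
/// Which pruning path a Soft tier compaction ends up taking (hypothetical type).
#[derive(Debug, PartialEq)]
enum PruneMode {
    /// Goal-aware scoring, as intended by task_aware / mig.
    GoalScored,
    /// Stand-in for prune_tool_outputs_oldest_first().
    OldestFirst,
}

/// Goal-aware strategies only take effect when a goal was actually extracted.
fn select_prune_mode(pruning_strategy: &str, goal: Option<&str>) -> PruneMode {
    match (pruning_strategy, goal) {
        ("task_aware" | "mig" | "task_aware_mig", Some(_)) => PruneMode::GoalScored,
        // No goal (for example after a timeout) or a non-goal strategy.
        _ => PruneMode::OldestFirst,
    }
}
```

Under this model, `select_prune_mode("task_aware", None)` and `select_prune_mode("reactive", None)` return the same mode, which is the equivalence observed in the logs.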

Root cause

extract_task_goal() at summarization.rs:1954:

  • Hardcoded 5s timeout (std::time::Duration::from_secs(5))
  • Calls summary_or_primary_provider().chat() — same cloud LLM used for main inference
  • Typical OpenAI gpt-5-mini latency exceeds 5s during concurrent inference sessions
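A dependency-free simulation of this pattern, using std threads and a channel in place of the real async provider call. The function names, the parameterized timeout, and the simulated latency are all assumptions made so the behavior can be reproduced quickly; the real code hardcodes 5s and calls the LLM provider.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Stand-in for the blocking goal extraction: waits up to `timeout` for a
/// simulated provider that needs `provider_latency` to answer.
fn extract_task_goal_blocking(provider_latency: Duration, timeout: Duration) -> Option<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(provider_latency); // simulated LLM round trip
        let _ = tx.send(String::from("goal from provider"));
    });
    // The caller is parked here for up to `timeout`; on expiry the goal is None.
    rx.recv_timeout(timeout).ok()
}

/// Runs one simulated compaction and reports (goal, time spent waiting).
fn timed_compaction(provider_latency: Duration, timeout: Duration) -> (Option<String>, Duration) {
    let start = Instant::now();
    let goal = extract_task_goal_blocking(provider_latency, timeout);
    (goal, start.elapsed())
}
```

When the simulated provider is slower than the timeout, the caller pays the full timeout and still gets no goal, which matches the two symptoms reported here: the latency spike and goal=None.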

Impact

  • User-visible: 5s latency spike on every Soft tier compaction event
  • Functional: task_aware / mig / task_aware_mig strategies are non-functional for cloud providers
  • Only fast local models (e.g. Ollama) that respond in under 5s would work

Proposed fix

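Make goal extraction fire-and-forget: spawn a background task to update current_task_goal between turns and use the previous turn's goal immediately (no blocking wait). This removes the blocking wait from the compaction path, so the 5s latency spike goes away; the trade-off is that the goal used for scoring may be one turn old.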
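A minimal sketch of the fire-and-forget idea, using std threads so it has no dependencies; in the actual async code this would more likely be a spawned async task. `GoalCache`, `goal_and_refresh`, and `slow_extract` are hypothetical names, and only current_task_goal comes from this report.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical holder for the most recently extracted goal.
#[derive(Clone, Default)]
struct GoalCache {
    current_task_goal: Arc<Mutex<Option<String>>>,
}

impl GoalCache {
    /// Returns the goal from the previous refresh immediately and starts a
    /// background refresh. The JoinHandle is returned only so callers that
    /// care (such as tests) can wait for the refresh to finish.
    fn goal_and_refresh<F>(&self, extract: F) -> (Option<String>, JoinHandle<()>)
    where
        F: FnOnce() -> Option<String> + Send + 'static,
    {
        let previous = self.current_task_goal.lock().unwrap().clone();
        let slot = Arc::clone(&self.current_task_goal);
        let handle = thread::spawn(move || {
            // A failed or timed-out extraction keeps the old goal in place.
            if let Some(goal) = extract() {
                *slot.lock().unwrap() = Some(goal);
            }
        });
        (previous, handle)
    }
}

/// Simulated slow extraction standing in for the LLM call.
fn slow_extract(latency: Duration, goal: &'static str) -> Option<String> {
    thread::sleep(latency);
    Some(goal.to_string())
}
```

In this sketch the first compaction of a session still sees no goal and would take the oldest-first fallback; later compactions reuse the last successfully extracted goal even if a refresh fails.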

Metadata

    Labels

    bug: Something isn't working
    llm: zeph-llm crate (Ollama, Claude)
