research(orchestration): AgentRM OS-inspired scheduler — MLFQ lanes, zombie reaping, context hibernation, 86% P95 latency reduction (arXiv:2603.13110) #2334

@bug-ops

Source

arXiv:2603.13110 — AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems (March 2026)

Technique

Applies classic OS scheduling theory to LLM agent systems in two components:

1. MLFQ Task Scheduler

  • Multi-level feedback queue with priority lanes (interactive / batch / background)
  • Rate-limit-aware admission control: tasks arriving at a high rate are demoted to avoid API throttle cascades
  • Zombie reaping: detects and reaps stalled sub-agents to free scheduler slots
  • Reported to reduce P95 latency by 86% in multi-agent benchmarks
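
To make the lane and admission ideas above concrete, here is a minimal sketch in Rust. It is not taken from the paper or from Zeph: `Lane`, `Task`, `Mlfq`, and `max_per_window` are hypothetical names, and the demotion rule (drop one lane once a per-window admission budget is spent) is an assumed reading of "rate-limit-aware admission control". Time-slice feedback and zombie reaping are left out.

```rust
use std::collections::VecDeque;

/// Priority lanes, highest first (names follow the lanes listed above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lane {
    Interactive = 0,
    Batch = 1,
    Background = 2,
}

impl Lane {
    /// Next lower lane; Background is the floor.
    fn demoted(self) -> Lane {
        match self {
            Lane::Interactive => Lane::Batch,
            Lane::Batch | Lane::Background => Lane::Background,
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Task {
    pub id: u64,
    pub lane: Lane,
}

/// Hypothetical scheduler: one FIFO queue per lane plus an admission budget
/// per rate window.
pub struct Mlfq {
    queues: [VecDeque<Task>; 3],
    admitted_in_window: u32,
    max_per_window: u32, // assumed config knob, not an existing Zeph setting
}

impl Mlfq {
    pub fn new(max_per_window: u32) -> Self {
        Self {
            queues: [VecDeque::new(), VecDeque::new(), VecDeque::new()],
            admitted_in_window: 0,
            max_per_window,
        }
    }

    /// Admit a task and return the lane it landed in. Once the window budget
    /// is spent, the task is demoted one lane instead of being rejected.
    pub fn admit(&mut self, mut task: Task) -> Lane {
        if self.admitted_in_window >= self.max_per_window {
            task.lane = task.lane.demoted();
        }
        self.admitted_in_window += 1;
        let lane = task.lane;
        self.queues[lane as usize].push_back(task);
        lane
    }

    /// Start a new rate window (in practice this would be timer-driven).
    pub fn reset_window(&mut self) {
        self.admitted_in_window = 0;
    }

    /// Pop from the highest-priority non-empty lane.
    pub fn pop_next(&mut self) -> Option<Task> {
        self.queues.iter_mut().find_map(|q| q.pop_front())
    }
}
```

With `Mlfq::new(2)`, the third task admitted in a window lands one lane below the lane it asked for, and `pop_next` still drains interactive work before batch and background. Demoting rather than rejecting keeps bursty producers from starving the interactive lane without dropping their work.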

2. Three-Tier Context Lifecycle Manager (CLM)

  • Tier 0 (Hot): active working memory — full fidelity, fast access
  • Tier 1 (Warm): recent context — adaptive compaction applied
  • Tier 2 (Cold): hibernated context — serialized to storage, recalled on demand
  • Reported to achieve 100% key-information retention vs. 65% for sliding-window baselines
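
The tier transitions above can be sketched as a small state machine. This is an illustration under stated assumptions, not the paper's design: `ContextStore`, `warm_after`, `cold_after`, and the caller-supplied `compact` function are hypothetical, time is a logical tick counter, and an in-memory map stands in for real storage.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Hot,
    Warm,
    Cold,
}

struct Entry {
    tier: Tier,
    text: String,
    last_used: u64,
}

/// Hypothetical three-tier store. `cold` stands in for serialized storage.
pub struct ContextStore {
    live: HashMap<u64, Entry>,  // Hot and Warm entries
    cold: HashMap<u64, String>, // hibernated entries
    warm_after: u64,            // idle ticks before Hot -> Warm (assumed knob)
    cold_after: u64,            // idle ticks before hibernation (assumed knob)
}

impl ContextStore {
    pub fn new(warm_after: u64, cold_after: u64) -> Self {
        Self { live: HashMap::new(), cold: HashMap::new(), warm_after, cold_after }
    }

    pub fn insert(&mut self, id: u64, text: &str, now: u64) {
        let entry = Entry { tier: Tier::Hot, text: text.to_string(), last_used: now };
        self.live.insert(id, entry);
    }

    pub fn tier_of(&self, id: u64) -> Option<Tier> {
        if self.cold.contains_key(&id) {
            return Some(Tier::Cold);
        }
        self.live.get(&id).map(|e| e.tier)
    }

    /// Age entries: compact Hot entries that have idled long enough, and
    /// hibernate anything idle for `cold_after` ticks or more.
    pub fn sweep(&mut self, now: u64, compact: impl Fn(&str) -> String) {
        let mut to_cold = Vec::new();
        for (id, e) in self.live.iter_mut() {
            let idle = now.saturating_sub(e.last_used);
            if idle >= self.cold_after {
                to_cold.push(*id);
            } else if idle >= self.warm_after && e.tier == Tier::Hot {
                e.text = compact(&e.text);
                e.tier = Tier::Warm;
            }
        }
        for id in to_cold {
            if let Some(e) = self.live.remove(&id) {
                self.cold.insert(id, e.text);
            }
        }
    }

    /// Recall on demand: promotes the entry back to Hot. Text that was
    /// compacted earlier stays compacted; promotion does not restore it.
    pub fn recall(&mut self, id: u64, now: u64) -> Option<&str> {
        if let Some(text) = self.cold.remove(&id) {
            self.live.insert(id, Entry { tier: Tier::Hot, text, last_used: now });
        }
        let e = self.live.get_mut(&id)?;
        e.tier = Tier::Hot;
        e.last_used = now;
        Some(e.text.as_str())
    }
}
```

Passing `compact` in as a closure keeps the sketch independent of any particular compaction strategy; a real implementation would need to decide what survives compaction in order to approach the retention figure the paper reports.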

Applicability to Zeph

High. Two independent components, both directly mappable:

  • DagScheduler + zeph-scheduler would benefit from MLFQ priority lanes: /plan tasks as interactive, background compaction as batch, sweep timers as background. Zombie reaping maps to task_timeout_secs + explicit task cancellation.
  • Zeph's context builder already has soft/hard compaction tiers; adding a hibernation tier (Tier 2) for very old sessions aligns with the CLM model. The deferred_apply_threshold mechanism is the admission gate for Tier 1→0 promotion.

Differs from #2320 (AdaptOrch), which focuses on topology selection (parallel/sequential/hierarchical DAG shapes); AgentRM focuses on resource management within a fixed topology.

Implementation sketch

  1. Add priority field to TaskKind or PlanTask in zeph-orchestration
  2. Implement admission control in DagScheduler::tick() based on current rate vs. configured limit
  3. Add zombie detection: tasks exceeding task_timeout_secs get canceled and logged
  4. Context hibernation: add hibernated tier below cold in memory tier config with SQLite serialization gate
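
As a rough illustration of step 3, zombie detection could be a pass over running tasks on each scheduler tick. The sketch below is standalone and does not use Zeph's actual types: `RunningTask`, `TaskState`, and `reap_zombies` are hypothetical, and `timeout` stands in for the configured `task_timeout_secs`. Elapsed time since start is used as a proxy for "stalled".

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Running,
    Done,
    Canceled,
}

/// Hypothetical bookkeeping record for a scheduled task.
pub struct RunningTask {
    pub id: u64,
    pub started: Instant,
    pub state: TaskState,
}

/// Cancel every Running task that has exceeded `timeout` and return the ids
/// that were reaped, so the caller can free their scheduler slots.
pub fn reap_zombies(tasks: &mut [RunningTask], now: Instant, timeout: Duration) -> Vec<u64> {
    let mut reaped = Vec::new();
    for t in tasks.iter_mut() {
        // saturating_duration_since yields zero if `started` is after `now`.
        let elapsed = now.saturating_duration_since(t.started);
        if t.state == TaskState::Running && elapsed > timeout {
            t.state = TaskState::Canceled;
            eprintln!("reaped stalled task {} after {:?}", t.id, elapsed);
            reaped.push(t.id);
        }
    }
    reaped
}
```

Taking `now` as a parameter rather than calling `Instant::now()` inside the function keeps the pass deterministic under test. A real scheduler would also need to signal the task's future or sub-agent to stop, which this sketch does not model.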

P2 — research, evaluate MLFQ fit with current DAG scheduler before implementation.

Metadata

Labels

P2: High value, medium complexity
research: Research-driven improvement
