feat(orchestration): Phase 3 — DAG Scheduler + SubAgentManager integration #1238
Closed
Labels
enhancement, orchestration
Description
Parent: #1235
Blocked by: #1236
Summary
Implement the DAG scheduler that drives parallel task execution through existing SubAgentManager infrastructure.
Branch: feat/m33/orchestration-scheduler
Deliverables
New files
- `crates/zeph-core/src/orchestration/scheduler.rs`: `DagScheduler`, `SchedulerAction`, `TaskEvent`, `TaskOutcome`
- `crates/zeph-core/src/orchestration/router.rs`: `AgentRouter` trait + `RuleBasedRouter` impl
Modified files
- `crates/zeph-core/src/subagent/manager.rs`: add `completion_tx: Option<mpsc::Sender<TaskEvent>>` to `SubAgentHandle`, fired on loop termination. Add a `spawn_for_task()` method.
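A minimal sketch of the `completion_tx` hook, to show the intended "fire once on loop termination" behaviour. Everything except the field name and its `Option<mpsc::Sender<TaskEvent>>` type is an assumption: the shapes of `TaskEvent` and `TaskOutcome`, the `task_id` field, and the `notify_terminated` helper are hypothetical, and `std::sync::mpsc` stands in for whatever channel the crate actually uses (likely an async one).

```rust
use std::sync::mpsc;

/// Hypothetical outcome shape; the real `TaskOutcome` may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum TaskOutcome {
    Completed { output: String },
    Failed { error: String },
}

/// Hypothetical event shape; the real `TaskEvent` may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct TaskEvent {
    pub task_id: u32,
    pub outcome: TaskOutcome,
}

/// Stand-in for the handle in `subagent/manager.rs`; only the new field
/// and an assumed `task_id` are shown.
pub struct SubAgentHandle {
    pub task_id: Option<u32>,
    pub completion_tx: Option<mpsc::Sender<TaskEvent>>,
}

impl SubAgentHandle {
    /// Hypothetical helper, called when the agent loop terminates.
    /// Returns true if an event was delivered to the scheduler.
    /// Agents spawned outside the scheduler have no sender and stay silent.
    pub fn notify_terminated(&mut self, outcome: TaskOutcome) -> bool {
        // `take()` ensures the event fires at most once per handle.
        match (self.completion_tx.take(), self.task_id) {
            (Some(tx), Some(task_id)) => tx.send(TaskEvent { task_id, outcome }).is_ok(),
            _ => false,
        }
    }
}
```

Keeping the sender optional means existing, non-orchestrated spawns are unaffected; only a `spawn_for_task()`-style path would need to populate it.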
Key design decisions
- ADR-026: Command pattern. The scheduler produces `SchedulerAction` values and the caller executes them against the manager. No `&mut SubAgentManager` is held across await points.
- ADR-027: Single `mpsc::Sender<TaskEvent>` channel for all agent completions (not per-agent watch receivers).
- Cross-task context: `<completed-dependencies>` block injected into the task prompt with `dependency_context_budget` (16384 chars total, divided equally across deps).
- Task timeout: wall-clock monitoring via `task_timeout_secs` config.
- Concurrency: respects both `max_parallel` and `SubAgentManager.max_concurrent`.
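The cross-task context budget could be applied roughly as follows. This is a sketch under assumptions: the function names, the `(title, output)` input shape, and the layout inside the `<completed-dependencies>` block are hypothetical; only the tag name, the 16384-char total, and the equal split across dependencies come from the description above.

```rust
/// Total budget from the issue text: 16384 chars shared by all dependencies.
pub const DEPENDENCY_CONTEXT_BUDGET: usize = 16_384;

/// Truncate `s` to at most `max_chars` characters, cutting on a char boundary.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

/// Build the block injected into a task prompt (hypothetical helper).
/// `deps` holds (task title, output) pairs for completed dependencies.
pub fn build_dependency_context(deps: &[(&str, &str)], budget: usize) -> String {
    if deps.is_empty() {
        return String::new();
    }
    // Equal split: each dependency gets the same share of the budget.
    let per_dep = budget / deps.len();
    let mut out = String::from("<completed-dependencies>\n");
    for (title, output) in deps {
        // The inner layout is an assumption made for illustration.
        out.push_str(&format!("## {title}\n{}\n", truncate_chars(output, per_dep)));
    }
    out.push_str("</completed-dependencies>");
    out
}
```

With four dependencies each output is capped at 4096 chars. An equal split is simple and predictable, at the cost of leaving budget unused when some outputs are short.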
Scheduler tick loop
- Drain pending `TaskEvent`s from `event_rx`
- Update task statuses, call `ready_tasks()` for newly schedulable tasks
- Route ready tasks to agents via `AgentRouter`
- Emit `SchedulerAction::Spawn` for each (up to concurrency limit)
- If no running/ready tasks and all terminal → `Done`
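The tick loop above can be sketched as a synchronous function that only returns actions, in line with the command pattern of ADR-026. This is an illustration, not the planned implementation: the struct fields, the simplified `TaskEvent`, and the `route` closure (standing in for `AgentRouter`) are assumptions, `std::sync::mpsc` replaces the real channel, and failure strategies, timeouts, and the `SubAgentManager.max_concurrent` limit are left out.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::Receiver;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskStatus { Pending, Running, Completed, Failed }

#[derive(Debug, PartialEq)]
pub enum SchedulerAction {
    Spawn { task_id: u32, agent: String },
    Done,
}

/// Simplified event: the real `TaskEvent` presumably carries a `TaskOutcome`.
pub struct TaskEvent { pub task_id: u32, pub success: bool }

pub struct Task { pub deps: Vec<u32>, pub status: TaskStatus }

pub struct DagScheduler {
    pub tasks: BTreeMap<u32, Task>,
    pub event_rx: Receiver<TaskEvent>,
    pub max_parallel: usize,
}

impl DagScheduler {
    /// Pending tasks whose dependencies have all completed.
    fn ready_tasks(&self) -> Vec<u32> {
        self.tasks
            .iter()
            .filter(|(_, t)| {
                t.status == TaskStatus::Pending
                    && t.deps.iter().all(|d| {
                        self.tasks.get(d).map(|dep| dep.status) == Some(TaskStatus::Completed)
                    })
            })
            .map(|(id, _)| *id)
            .collect()
    }

    /// One scheduler tick. The caller executes the returned actions.
    pub fn tick(&mut self, route: impl Fn(u32) -> String) -> Vec<SchedulerAction> {
        // 1. Drain pending events without blocking and update statuses.
        while let Ok(ev) = self.event_rx.try_recv() {
            if let Some(task) = self.tasks.get_mut(&ev.task_id) {
                task.status = if ev.success { TaskStatus::Completed } else { TaskStatus::Failed };
            }
        }
        // 2. Work out how many free slots remain under `max_parallel`.
        let running = self.tasks.values().filter(|t| t.status == TaskStatus::Running).count();
        let slots = self.max_parallel.saturating_sub(running);
        // 3-4. Route each ready task and emit a Spawn action for it.
        let mut actions = Vec::new();
        for task_id in self.ready_tasks().into_iter().take(slots) {
            let agent = route(task_id);
            // Marked Running optimistically; a fuller version would revert
            // this if the manager rejects the spawn.
            if let Some(task) = self.tasks.get_mut(&task_id) {
                task.status = TaskStatus::Running;
            }
            actions.push(SchedulerAction::Spawn { task_id, agent });
        }
        // 5. Nothing running, nothing spawned, and every task terminal: Done.
        let all_terminal = self
            .tasks
            .values()
            .all(|t| matches!(t.status, TaskStatus::Completed | TaskStatus::Failed));
        if running == 0 && actions.is_empty() && all_terminal {
            actions.push(SchedulerAction::Done);
        }
        actions
    }
}
```

Because `tick` never touches the manager, the caller can hold `&mut SubAgentManager` only while executing each returned action, never across an await point.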
Router fallback chain
- `task.agent_hint` exact match against loaded definitions
- Tool requirement matching (future: task keywords → agent tool policy)
- First available agent (fallback)
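The fallback chain might look like the sketch below. `AgentRouter` and `RuleBasedRouter` are the names from the deliverables; the trait signature, `AgentDef`, `TaskNode`, and the `required_tools` field are assumptions, and the tool-matching step is a simplified placeholder for what the issue marks as future work.

```rust
/// Hypothetical agent definition; only the fields the router needs.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentDef {
    pub name: String,
    pub tools: Vec<String>,
}

/// Hypothetical task view; `required_tools` is an assumed field.
pub struct TaskNode {
    pub agent_hint: Option<String>,
    pub required_tools: Vec<String>,
}

pub trait AgentRouter {
    /// Pick an agent name for `task`, or `None` if no agent is available.
    fn route<'a>(&self, task: &TaskNode, available: &'a [AgentDef]) -> Option<&'a str>;
}

pub struct RuleBasedRouter;

impl AgentRouter for RuleBasedRouter {
    fn route<'a>(&self, task: &TaskNode, available: &'a [AgentDef]) -> Option<&'a str> {
        // 1. Exact match of the hint against loaded definitions.
        if let Some(hint) = &task.agent_hint {
            if let Some(agent) = available.iter().find(|a| &a.name == hint) {
                return Some(agent.name.as_str());
            }
        }
        // 2. Tool requirement matching (placeholder): first agent that
        //    offers every required tool.
        if !task.required_tools.is_empty() {
            let matching = available
                .iter()
                .find(|a| task.required_tools.iter().all(|t| a.tools.contains(t)));
            if let Some(agent) = matching {
                return Some(agent.name.as_str());
            }
        }
        // 3. Fallback: first available agent.
        available.first().map(|a| a.name.as_str())
    }
}
```

An unknown hint falls through to the later rules rather than failing the task, which keeps routing total as long as at least one agent is loaded.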
Tests (~20)
- Linear chain executes sequentially (via mock manager)
- Independent tasks produce parallel Spawn actions
- Completion triggers downstream scheduling
- Failure + Abort produces Cancel actions for all running tasks
- Failure + Skip propagates correctly through dependents
- Failure + Retry re-schedules task (resets to Ready)
- Concurrency limit respected in action generation
- All tasks completed → Done with Completed status
- Router: agent_hint exact match
- Router: fallback to first available
- Task timeout produces Cancel action
- Manager spawn rejection (concurrency) keeps task Ready
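The three failure cases in the list above (Abort, Skip, Retry) could be handled along these lines. This is a hedged sketch: `FailureStrategy`, `on_failure`, and the status names are hypothetical, retry limits are omitted, and marking cancelled tasks in place is one possible choice.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskStatus { Pending, Ready, Running, Completed, Failed, Skipped, Canceled }

/// Hypothetical name for the per-graph failure policy.
#[derive(Debug, Clone, Copy)]
pub enum FailureStrategy { Abort, Skip, Retry }

#[derive(Debug, PartialEq)]
pub enum SchedulerAction { Cancel { task_id: u32 } }

pub struct Task { pub deps: Vec<u32>, pub status: TaskStatus }

/// Apply the failure strategy for task `failed` and return follow-up actions.
pub fn on_failure(
    tasks: &mut BTreeMap<u32, Task>,
    failed: u32,
    strategy: FailureStrategy,
) -> Vec<SchedulerAction> {
    let mut actions = Vec::new();
    // Retry resets the task to Ready; the other strategies record the failure.
    let new_status = match strategy {
        FailureStrategy::Retry => TaskStatus::Ready,
        _ => TaskStatus::Failed,
    };
    if let Some(task) = tasks.get_mut(&failed) {
        task.status = new_status;
    }
    match strategy {
        FailureStrategy::Retry => {}
        FailureStrategy::Abort => {
            // Cancel everything still running.
            for (id, task) in tasks.iter_mut() {
                if task.status == TaskStatus::Running {
                    task.status = TaskStatus::Canceled;
                    actions.push(SchedulerAction::Cancel { task_id: *id });
                }
            }
        }
        FailureStrategy::Skip => loop {
            // Propagate transitively: skip any not-yet-started task that
            // depends on a failed or skipped task, until nothing changes.
            let to_skip: Vec<u32> = tasks
                .iter()
                .filter(|(_, t)| {
                    matches!(t.status, TaskStatus::Pending | TaskStatus::Ready)
                        && t.deps.iter().any(|d| {
                            matches!(
                                tasks.get(d).map(|dep| dep.status),
                                Some(TaskStatus::Failed | TaskStatus::Skipped)
                            )
                        })
                })
                .map(|(id, _)| *id)
                .collect();
            if to_skip.is_empty() {
                break;
            }
            for id in to_skip {
                if let Some(task) = tasks.get_mut(&id) {
                    task.status = TaskStatus::Skipped;
                }
            }
        },
    }
    actions
}
```

Each listed test then reduces to building a small task map, calling the function, and asserting on the returned actions and resulting statuses.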
Dependencies
Blocked by: Phase 1 (#1236)
Can run in parallel with: Phase 2 (#1237)