feat(orchestration): Phase 5 — Aggregator + resume/retry (#1240) by bug-ops · Pull Request #1258 · bug-ops/zeph

bug-ops · 2026-03-06T02:25:15Z

Summary

- Add Aggregator trait and LlmAggregator: synthesizes completed task outputs into a single coherent response via one LLM call, with a per-task character budget, ContentSanitizer spotlighting, skipped-task descriptions, and a raw-concatenation fallback on LLM failure
- Add DagScheduler::resume_from(): resumes Paused/Failed graphs, reconstructs the running-task map, sets status to Running
- Add dag::reset_for_retry(): BFS reset of Failed→Ready and Skipped/Canceled→Pending for targeted graph re-runs
- Extend PlanCommand with Resume(Option<String>) and Retry(Option<String>) variants (with graph-id validation)
- Wire /plan resume and /plan retry into the agent loop; call the aggregator on graph completion
- Add aggregator_max_tokens to OrchestrationConfig (default: 4096)
- +8 tests: resume_from() coverage (6) and reset_for_retry() with Canceled handling (2)
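
To make the `reset_for_retry()` item concrete, here is a minimal sketch of a BFS reset over a task graph. It is not the code from this PR: the `Task` layout, the `dependents` index list, and the returned reset count are assumptions made for illustration. Only the status transitions (Failed→Ready, Skipped/Canceled→Pending, completed work left alone) come from the description above.

```rust
use std::collections::VecDeque;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskStatus {
    Pending,
    Ready,
    Running,
    Completed,
    Failed,
    Skipped,
    Canceled,
}

/// Hypothetical task node: `dependents` holds indices of downstream tasks.
struct Task {
    status: TaskStatus,
    dependents: Vec<usize>,
}

/// Resets failed tasks to Ready, then walks their dependents breadth-first
/// and resets Skipped/Canceled ones to Pending. Completed tasks are untouched.
/// Returns how many tasks changed status.
fn reset_for_retry(tasks: &mut [Task]) -> usize {
    let mut queue: VecDeque<usize> = VecDeque::new();
    let mut visited = vec![false; tasks.len()];
    let mut reset = 0;

    // Seed the queue with every failed task.
    for (i, task) in tasks.iter_mut().enumerate() {
        if task.status == TaskStatus::Failed {
            task.status = TaskStatus::Ready;
            visited[i] = true;
            queue.push_back(i);
            reset += 1;
        }
    }

    // Breadth-first walk over downstream tasks.
    while let Some(i) = queue.pop_front() {
        let dependents = tasks[i].dependents.clone();
        for d in dependents {
            if visited[d] {
                continue;
            }
            visited[d] = true;
            if matches!(tasks[d].status, TaskStatus::Skipped | TaskStatus::Canceled) {
                tasks[d].status = TaskStatus::Pending;
                reset += 1;
            }
            queue.push_back(d);
        }
    }
    reset
}
```

In this sketch, a chain Completed → Failed → Skipped → Canceled ends up as Completed → Ready → Pending → Pending, so only the failed task and what it blocked are re-run.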

Test plan

- cargo +nightly fmt --check: clean
- cargo clippy --workspace --features full -- -D warnings: clean
- cargo nextest run --workspace --features full --lib --bins: 4297 pass (1 pre-existing failure in bootstrap::tests::create_skill_matcher_when_semantic_disabled on VersionMissing(23), present on main before this branch)
- Security: build_fallback() sanitizes via ContentSanitizer (SEC-P5-02)
- resume_from() reconstructs running map from Running tasks in graph (IC1)
- reset_for_retry() resets Canceled→Pending in addition to Failed→Ready (IC2)
- handle_plan_retry counts failed tasks before reset, not after (IC3)
- Docs updated: task-orchestration.md, configuration.md, README.md, crates/zeph-core/README.md
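
The `resume_from()` behaviour checked in IC1 can be sketched roughly as follows. This is an illustration under assumptions, not the actual `DagScheduler` code: the `Graph`, `Task`, and `Scheduler` shapes, the `u32` task ids, and the `String` error type are all hypothetical. Taken from the description are the accepted statuses (Paused/Failed), the rebuild of the running map from tasks still marked Running, and the final status of Running.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GraphStatus {
    Created,
    Running,
    Paused,
    Failed,
    Completed,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskStatus {
    Pending,
    Running,
    Completed,
    Failed,
}

struct Task {
    id: u32,
    status: TaskStatus,
}

struct Graph {
    status: GraphStatus,
    tasks: Vec<Task>,
}

/// Hypothetical scheduler: `running` maps a task id to its index in the graph.
struct Scheduler {
    graph: Graph,
    running: HashMap<u32, usize>,
}

/// Accepts only Paused or Failed graphs, rebuilds the running map from
/// tasks whose status is still Running, and marks the graph as Running.
fn resume_from(mut graph: Graph) -> Result<Scheduler, String> {
    if !matches!(graph.status, GraphStatus::Paused | GraphStatus::Failed) {
        return Err(format!("cannot resume graph in status {:?}", graph.status));
    }
    let running: HashMap<u32, usize> = graph
        .tasks
        .iter()
        .enumerate()
        .filter(|(_, t)| t.status == TaskStatus::Running)
        .map(|(idx, t)| (t.id, idx))
        .collect();
    graph.status = GraphStatus::Running;
    Ok(Scheduler { graph, running })
}
```

Rebuilding the map from the graph itself, rather than from separately persisted scheduler state, means a paused graph carries everything needed to continue.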

Closes #1240. Part of epic #1235.

Add result aggregation and graph lifecycle management for the task orchestration system.

### New

- `aggregator.rs`: `Aggregator` trait + `LlmAggregator` implementation. Synthesizes completed task outputs via a single LLM call with per-task character budget (`aggregator_max_tokens / num_completed_tasks`), `ContentSanitizer` spotlighting on all task outputs, skipped-task descriptions, and raw-concatenation fallback on LLM failure.
- `dag::reset_for_retry()`: BFS-based reset: `Failed`→`Ready`, `Skipped`/`Canceled`→`Pending`. Enables targeted re-runs without discarding completed work.
- `DagScheduler::resume_from()`: accepts `Paused`/`Failed` graphs, reconstructs `running` HashMap from in-flight tasks, sets `graph.status = Running`.

### Extended

- `PlanCommand`: add `Resume(Option<String>)` and `Retry(Option<String>)` variants with graph-id validation.
- `OrchestrationConfig`: add `aggregator_max_tokens` field (default 4096).
- Agent loop: wire `handle_plan_resume`, `handle_plan_retry`, call aggregator on graph completion.
- Dropped event guard upgraded to `error!`-level logging (PERF-SCHED-02).

### Tests (+8)

- 6 tests for `DagScheduler::resume_from()` (MT-1/II4 fix)
- 2 tests for `reset_for_retry()` with `Canceled` task handling (IC2 fix)

### Docs

- `docs/src/concepts/task-orchestration.md`: Result Aggregation section, `/plan resume`/`/plan retry` documentation.
- `docs/src/reference/configuration.md`: `dependency_context_budget`, `confirm_before_execute`, `aggregator_max_tokens` fields.
- `README.md` and `crates/zeph-core/README.md` updated.

Closes #1240. Part of epic #1235.
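
The per-task character budget and the raw-concatenation fallback described for `aggregator.rs` can be sketched as below. This is a simplified illustration, not the PR's `build_fallback()`: the function signatures, the `(title, output)` tuple input, and the `##` section layout are assumptions. The sanitization step that the PR routes through `ContentSanitizer` is deliberately left out, since its API is not shown here. Only the budget formula (`aggregator_max_tokens / num_completed_tasks`) is taken from the description.

```rust
/// Per-task character budget: the total budget split evenly across
/// completed tasks. Returns 0 when there are no completed tasks.
fn per_task_budget(aggregator_max_tokens: usize, num_completed_tasks: usize) -> usize {
    if num_completed_tasks == 0 {
        0
    } else {
        aggregator_max_tokens / num_completed_tasks
    }
}

/// Truncates `s` to at most `max_chars` characters without splitting
/// a multi-byte UTF-8 character.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}

/// Hypothetical fallback: concatenates each task's (truncated) output under
/// its title. The real implementation also sanitizes the content.
fn build_fallback(outputs: &[(&str, &str)], aggregator_max_tokens: usize) -> String {
    let budget = per_task_budget(aggregator_max_tokens, outputs.len());
    outputs
        .iter()
        .map(|(title, output)| format!("## {}\n{}", title, truncate_chars(output, budget)))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

With the default of 4096 and four completed tasks, each task would get 1024 characters in this sketch. Truncating by character rather than by byte avoids a panic when the cut would otherwise land inside a multi-byte character.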

github-actions bot added the documentation, rust, core, enhancement, and size/XL labels Mar 6, 2026

Merge remote-tracking branch 'origin/main' into orchestration-aggregator

9a91f91

bug-ops enabled auto-merge (squash) March 6, 2026 02:29

bug-ops merged commit c3ff145 into main Mar 6, 2026
28 checks passed

bug-ops deleted the orchestration-aggregator branch March 6, 2026 02:40

bug-ops merged 2 commits into main from orchestration-aggregator
