feat(orchestration): activate VMAO verify-completeness in PlanVerifier by bug-ops · Pull Request #2346 · bug-ops/zeph

bug-ops · 2026-03-28T11:12:27Z

Summary

Activates the existing PlanVerifier skeleton (introduced in PR #2235) with a full per-task and whole-plan verification pipeline based on the VMAO paper (arXiv:2603.11445). When verify_completeness = true, the agent now runs an LLM-judged completeness check after DAG execution and triggers a targeted gap-filling replan cycle for incomplete sub-tasks.
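The verify-then-replan cycle described above can be sketched as follows. This is an illustration only, not the code in this PR: the `PlanVerdict` type, the `needs_replan`, `for_partial_replan`, and `merge_outputs` helpers, the use of `String` task keys, and the `score < completeness_threshold` comparison are all assumptions. Only the field names `verify_completeness`, `completeness_threshold`, and `max_replans` come from the description.

```rust
use std::collections::HashMap;

/// Hypothetical subset of the orchestration settings named in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct OrchestrationConfig {
    pub verify_completeness: bool,
    pub completeness_threshold: f32,
    pub max_replans: u32,
}

impl OrchestrationConfig {
    /// Settings for the partial (gap-filling) DAG: no further replans and no
    /// nested verification, so the cycle cannot recurse (INV-2).
    pub fn for_partial_replan(&self) -> Self {
        Self {
            verify_completeness: false,
            max_replans: 0,
            ..self.clone()
        }
    }
}

/// Hypothetical verdict shape returned by the LLM judge.
pub struct PlanVerdict {
    pub score: f32,
}

/// Decide whether a gap-filling replan should run.
/// `None` models a failed LLM call and is treated as "no replan" (fail-open).
/// Comparing the score against the threshold is an assumption.
pub fn needs_replan(cfg: &OrchestrationConfig, verdict: Option<&PlanVerdict>) -> bool {
    match verdict {
        Some(v) => cfg.verify_completeness && v.score < cfg.completeness_threshold,
        None => false,
    }
}

/// Merge partial DAG outputs into the original task outputs before aggregation.
/// On a key collision the partial output wins (an assumption).
pub fn merge_outputs(
    mut original: HashMap<String, String>,
    partial: HashMap<String, String>,
) -> HashMap<String, String> {
    original.extend(partial);
    original
}
```

Deriving the child settings from the parent keeps every other field identical while forcing the two values that guarantee a single cycle.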

- Add completeness_threshold: f32 (default 0.7) to OrchestrationConfig with sanitizer clamping and startup validation
- Add verify_plan() for whole-plan verification after DagScheduler completes
- Add replan_from_plan() generating root TaskNodes for plan-level gaps
- Wire SchedulerAction::Verify handler in agent loop (was a no-op since #2235)
- Insert whole-plan verify step between Done{Completed} and aggregation
- Partial replan DAG runs in a separate DagScheduler with max_replans=0 and verify_completeness=false (INV-2: no recursive loops)
- Partial DAG outputs merged with original task outputs before Aggregator
- Output truncated to verify_max_tokens * 4 chars before verify_plan() call
- GapSeverity implements Display returning lowercase names consistent with serde snake_case and LLM system prompt expectations
- All LLM error paths are fail-open (verify → complete=true, replan → empty Vec, whole-plan → None)
- 25 new unit tests across verifier.rs and experiment.rs
- verify_completeness = false by default; no behavior change when disabled
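Three of the smaller behaviors in the list above (lowercase `Display` for GapSeverity, truncation to `verify_max_tokens * 4` chars, and fail-open on LLM errors) can be sketched like this. This is a minimal sketch under stated assumptions: the variant names, the `VerifyResult` struct, and the helpers `truncate_for_verify` and `verify_or_fail_open` are hypothetical, and whether the real code counts Unicode scalar values or bytes is not stated in the PR.

```rust
use std::fmt;

/// Hypothetical variant names; the PR only states that Display yields
/// lowercase names matching the serde snake_case form.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GapSeverity {
    Critical,
    Important,
    Minor,
}

impl fmt::Display for GapSeverity {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            GapSeverity::Critical => "critical",
            GapSeverity::Important => "important",
            GapSeverity::Minor => "minor",
        };
        f.write_str(name)
    }
}

/// Keep at most `verify_max_tokens * 4` chars, cutting on a char boundary
/// so multi-byte UTF-8 sequences are never split.
pub fn truncate_for_verify(output: &str, verify_max_tokens: usize) -> &str {
    let max_chars = verify_max_tokens.saturating_mul(4);
    match output.char_indices().nth(max_chars) {
        // The char at position `max_chars` starts at `byte_idx`; cut before it.
        Some((byte_idx, _)) => &output[..byte_idx],
        // Fewer than or exactly `max_chars` chars: nothing to cut.
        None => output,
    }
}

/// Hypothetical per-task verification result.
#[derive(Debug, PartialEq)]
pub struct VerifyResult {
    pub complete: bool,
}

/// Fail-open: if the LLM call failed, report the task as complete so that
/// a verifier outage never blocks the plan.
pub fn verify_or_fail_open<E>(llm_result: Result<VerifyResult, E>) -> VerifyResult {
    llm_result.unwrap_or(VerifyResult { complete: true })
}
```

The factor of 4 reads as a rough chars-per-token estimate, which would keep the verifier prompt near the token budget without running a tokenizer.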

Closes #2252

Research evaluations (in architect handoff .local/handoff/2026-03-28T10-59-52-architect.yaml)

- AdaptOrch (#2320, "research(orchestration): AdaptOrch topology routing: DAG-based selection of parallel/sequential/hierarchical patterns, 12-23% gain", arXiv:2602.16873): P3 backlog. TopologyClassifier already exists; content-aware topology routing needs a feedback loop that VMAO completeness scores can now provide.
- AgentRM (#2334, "research(orchestration): AgentRM OS-inspired scheduler: MLFQ lanes, zombie reaping, context hibernation, 86% P95 latency reduction", arXiv:2603.13110): MLFQ priority lanes P3, context hibernation P4 (separate zeph-memory scope).

Test plan

- cargo +nightly fmt --check: passed
- cargo clippy --workspace --features full -- -D warnings: 0 warnings
- cargo nextest run --workspace --features experiments --lib --bins: 6581/6581 passed
- Verify verify_completeness = false (default) produces no behavior change
- Verify completeness_threshold outside [0.0, 1.0] is rejected at startup
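The last two checks can be illustrated with a small, separate config sketch. It assumes a `validate()` method that rejects out-of-range values and a `sanitize()` method that clamps them; both method names, the `String` error type, and the NaN fallback to the default are hypothetical, and the PR does not say how the real sanitizer and startup validation interact.

```rust
/// Hypothetical subset of OrchestrationConfig covering the two fields
/// exercised by the test plan.
#[derive(Debug, Clone)]
pub struct OrchestrationConfig {
    pub verify_completeness: bool,
    pub completeness_threshold: f32,
}

impl Default for OrchestrationConfig {
    fn default() -> Self {
        Self {
            // Disabled by default, so existing behavior is unchanged.
            verify_completeness: false,
            completeness_threshold: 0.7,
        }
    }
}

impl OrchestrationConfig {
    /// Startup validation: reject thresholds outside [0.0, 1.0].
    /// NaN is never contained in the range, so it is rejected as well.
    pub fn validate(&self) -> Result<(), String> {
        let t = self.completeness_threshold;
        if !(0.0..=1.0).contains(&t) {
            return Err(format!(
                "completeness_threshold must be within [0.0, 1.0], got {t}"
            ));
        }
        Ok(())
    }

    /// Sanitizer: clamp into [0.0, 1.0]. `f32::clamp` passes NaN through,
    /// so NaN falls back to the default (an assumption of this sketch).
    pub fn sanitize(&mut self) {
        let t = self.completeness_threshold;
        self.completeness_threshold = if t.is_nan() { 0.7 } else { t.clamp(0.0, 1.0) };
    }
}
```

Handling NaN explicitly matters because a NaN threshold would make every later comparison against a completeness score evaluate to false.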

#2252)

Activate the existing PlanVerifier skeleton with per-task and whole-plan verification gates, a completeness_threshold config field, and a single-cycle gap-filling replan loop based on the VMAO paper (arXiv:2603.11445).

- Add completeness_threshold: f32 (default 0.7) to OrchestrationConfig with sanitizer clamping to [0.0, 1.0] and startup validation
- Add verify_plan() for whole-plan verification after DagScheduler completion
- Add replan_from_plan() generating root TaskNodes for plan-level gaps
- Wire SchedulerAction::Verify handler in agent loop (was a no-op since #2235)
- Add whole-plan verification step between Done{Completed} and aggregation
- Partial replan DAG runs in a separate DagScheduler with max_replans=0 and verify_completeness=false to prevent recursive loops (INV-2)
- Partial DAG outputs merged with original task outputs before aggregation
- Output truncated to verify_max_tokens*4 chars before verify_plan() call
- GapSeverity implements Display returning lowercase names consistent with serde snake_case serialization and LLM system prompt expectations
- All LLM error paths are fail-open: verify returns complete=true, replan returns empty Vec, whole-plan verify returns None
- 25 new unit tests across verifier.rs and experiment.rs
- verify_completeness = false by default; no behavior change when disabled

Closes #2252

github-actions bot added the documentation, rust, core, enhancement, and size/XL labels Mar 28, 2026

bug-ops force-pushed the vmao-adaptive-replanning branch from e4085e0 to 3cffd78 March 28, 2026 11:16

bug-ops merged commit ff649f1 into main Mar 28, 2026
25 checks passed

bug-ops deleted the vmao-adaptive-replanning branch March 28, 2026 11:26

bug-ops mentioned this pull request Mar 28, 2026

research(quality): MARCH information-asymmetry self-check — Solver+Proposer+Checker pipeline breaks confirmation bias (arXiv:2603.24579) #2352

Open

bug-ops merged 1 commit into main from vmao-adaptive-replanning
