feat(orchestration): topology-aware scheduling and GAP parallel annotations by bug-ops · Pull Request #2183 · bug-ops/zeph

bug-ops · 2026-03-26T23:25:57Z

Summary

Implements Phase 1 of two research issues in a single cohesive PR:

#1840, research(routing): AdaptOrch task-adaptive orchestration topology selection (AdaptOrch): TopologyClassifier inspects TaskGraph structure (edge count, longest path, fan-out ratio) and classifies it as AllParallel, LinearChain, FanOut, or Mixed. When topology_selection = true, DagScheduler uses the hint to override max_parallel, capped at the user-configured value and never below 1. The hint is applied in both new() and resume_from().
#2172, research: GAP graph-based parallel tool scheduling for LlmPlanner/DagScheduler (GAP): the LlmPlanner prompt now requests an explicit execution_mode (parallel/sequential) annotation per task node. DagScheduler respects Sequential mode by serializing sequential tasks globally (at most one runs at a time), leaving parallel tasks unaffected.

Both features default to off (topology_selection = false).
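
As a rough illustration of the classification step described above, here is a minimal sketch. The function name `classify`, the `(from, to)` edge representation, and the exact thresholds are assumptions made for this example and are not taken from `topology.rs`; only the four variant names come from the PR. It assumes the input is a DAG whose node ids are all below `node_count`.

```rust
/// Coarse graph shape. Variant names follow the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Topology {
    AllParallel,
    LinearChain,
    FanOut,
    Mixed,
}

/// Hypothetical classifier. `edges` are (from, to) dependency pairs
/// over node ids in 0..node_count; the graph is assumed to be acyclic.
pub fn classify(node_count: usize, edges: &[(usize, usize)]) -> Topology {
    if edges.is_empty() {
        // No dependencies at all (also covers empty and single-node graphs).
        return Topology::AllParallel;
    }

    let mut adj = vec![Vec::<usize>::new(); node_count];
    let mut in_deg = vec![0usize; node_count];
    for &(from, to) in edges {
        adj[from].push(to);
        in_deg[to] += 1;
    }

    // Kahn-style traversal; depth[n] = node count of the longest path ending at n.
    let mut depth = vec![1usize; node_count];
    let mut queue: Vec<usize> = (0..node_count).filter(|&n| in_deg[n] == 0).collect();
    let root_count = queue.len();
    let mut longest = 0usize;
    while let Some(u) = queue.pop() {
        let du = depth[u];
        longest = longest.max(du);
        for &v in &adj[u] {
            depth[v] = depth[v].max(du + 1);
            in_deg[v] -= 1;
            if in_deg[v] == 0 {
                queue.push(v);
            }
        }
    }

    if longest == node_count {
        // One path visits every node, so nothing can overlap.
        Topology::LinearChain
    } else if root_count == 1 && longest == 2 {
        // A single root whose children are all leaves.
        Topology::FanOut
    } else {
        Topology::Mixed
    }
}
```

Under these assumed rules, a diamond (one root, two middle nodes, one join) comes out as Mixed, because its longest path is three nodes in a four-node graph.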

Changes

crates/zeph-orchestration/src/topology.rs — new: Topology enum, TopologyClassifier
crates/zeph-orchestration/src/graph.rs — ExecutionMode enum, execution_mode field on TaskNode (#[serde(default)] for backward-compat)
crates/zeph-orchestration/src/planner.rs — execution_mode annotation in prompt + typed Option<ExecutionMode> on PlannedTask
crates/zeph-orchestration/src/scheduler.rs — topology classification + sequential dispatch
crates/zeph-config/src/experiment.rs — topology_selection: bool field on OrchestrationConfig
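
The sequential dispatch rule in scheduler.rs can be pictured with the sketch below. It is not the actual DagScheduler code: `ReadyTask`, `select_dispatch`, and the idea of passing in running counts as plain arguments are hypothetical and exist only to show the rule "at most one Sequential task in flight, Parallel tasks unaffected".

```rust
/// Per-task annotation. Parallel is the default, as in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ExecutionMode {
    #[default]
    Parallel,
    Sequential,
}

/// Hypothetical view of a task whose dependencies are already satisfied.
#[derive(Debug, Clone, Copy)]
pub struct ReadyTask {
    pub id: u32,
    pub mode: ExecutionMode,
}

/// Chooses which ready tasks to start now.
/// `running_total` counts all in-flight tasks; `sequential_running` says
/// whether one of them is a Sequential task.
pub fn select_dispatch(
    ready: &[ReadyTask],
    running_total: usize,
    sequential_running: bool,
    max_parallel: usize,
) -> Vec<u32> {
    let mut free_slots = max_parallel.saturating_sub(running_total);
    let mut sequential_busy = sequential_running;
    let mut picked = Vec::new();

    for task in ready {
        if free_slots == 0 {
            break;
        }
        if task.mode == ExecutionMode::Sequential {
            if sequential_busy {
                // Another sequential task holds the global slot; try again later.
                continue;
            }
            sequential_busy = true;
        }
        picked.push(task.id);
        free_slots -= 1;
    }
    picked
}
```

With four ready tasks in the order Sequential, Parallel, Sequential, Parallel and four free slots, this sketch starts the first, second, and fourth; the second Sequential task waits until the first one finishes.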

Test plan

+27 tests (6504 → 6531), all passing
TopologyClassifier: all 4 classes, edge cases (empty, single-node, max_parallel=1)
ExecutionMode serde: roundtrip + missing-field backward-compat
Scheduler: topology hint applied, resume_from preserves classification, sequential dispatch serializes correctly while parallel tasks proceed
Pre-commit checks all pass: cargo +nightly fmt --check; cargo clippy --all-targets --all-features --workspace -- -D warnings; cargo nextest run

Closes #1840, closes #2172

…ations (#1840, #2172)

Phase 1 of two research issues:

#1840 (AdaptOrch): TopologyClassifier inspects TaskGraph structure (edge count, longest path, fan-out ratio) and classifies it as AllParallel, LinearChain, FanOut, or Mixed. When topology_selection=true in OrchestrationConfig, DagScheduler uses the topology hint to override max_parallel (capped at user config, never below 1). Also applied in resume_from() to preserve classification across suspension.

#2172 (GAP): ExecutionMode (Parallel/Sequential) annotations per task node. Planner prompt now requests structured parallel/sequential markers; convert_response parses typed ExecutionMode with Parallel fallback on invalid/missing values. DagScheduler respects Sequential mode by serializing sequential tasks globally (at most one runs at a time), leaving parallel tasks unaffected.

Both features are disabled by default (topology_selection = false).

+27 tests (6504 -> 6531), all passing.
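
Two small rules from this commit message, the max_parallel clamp and the Parallel fallback when parsing, can be sketched as follows. The helper names `effective_max_parallel` and `parse_execution_mode` are made up for this example, and the case-insensitive, whitespace-trimming match is an assumption; the real convert_response may parse differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ExecutionMode {
    #[default]
    Parallel,
    Sequential,
}

/// Applies a topology hint without exceeding the user's limit
/// and without ever dropping below one running task.
pub fn effective_max_parallel(suggested: usize, configured: usize) -> usize {
    suggested.min(configured).max(1)
}

/// Maps a raw planner annotation to a typed mode.
/// Anything missing or unrecognized falls back to Parallel.
pub fn parse_execution_mode(raw: Option<&str>) -> ExecutionMode {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("sequential") => ExecutionMode::Sequential,
        _ => ExecutionMode::Parallel,
    }
}
```

Falling back to Parallel keeps older plans, which carry no annotation, behaving as they did before the change.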


#2184)

* feat(orchestration): add planner_provider field to OrchestrationConfig (#2172)

Follow-up to #2183 per multi-model design principle: LlmPlanner now accepts a dedicated provider reference instead of always using the primary provider.

Config field `planner_provider = "fast"` in `[orchestration]` references a `[[llm.providers]]` entry by name; empty string falls back to the primary provider. Field replaces the dead `planner_model: Option<String>` (never wired, pre-v1.0.0 removal).

`build_planner_provider()` in bootstrap resolves the named provider and passes it to `LlmPlanner::new()` at all three wiring sites (daemon, runner, acp). `OrchestrationState.planner_provider` holds the optional resolved provider, scoped to orchestration rather than the global ProviderState.

`migrate_planner_model_to_provider()` auto-migration step comments out any existing `planner_model` value with a MIGRATED marker so users know to supply a provider name instead of a model name.

Also adds latency-threshold doc comment on `TopologyClassifier::suggest_max_parallel` (parallel scheduling overhead only pays off when individual tool calls take >= 500ms; for LLM API calls this is always satisfied; for local tools such as file reads, consider keeping topology_selection disabled).

+50 tests (6504 -> 6554).

* style: fix fmt in migrate.rs test assertion
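
The name-based lookup with an empty-string fallback might look roughly like this. `Provider` and `resolve_planner_provider` are illustrative stand-ins, not the types or signature of `build_planner_provider()`, and treating an unknown name the same as an empty one is an assumption (the real code may report an error instead).

```rust
/// Hypothetical stand-in for one `[[llm.providers]]` entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Provider {
    pub name: String,
    pub model: String,
}

/// Returns the provider named by `planner_provider`, or None when the
/// field is empty or matches nothing. None means "use the primary provider".
pub fn resolve_planner_provider<'a>(
    providers: &'a [Provider],
    planner_provider: &str,
) -> Option<&'a Provider> {
    let wanted = planner_provider.trim();
    if wanted.is_empty() {
        return None;
    }
    providers.iter().find(|p| p.name == wanted)
}
```

Returning an `Option` mirrors the description of `OrchestrationState.planner_provider` as an optional resolved provider: the caller decides what the primary fallback is.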

github-actions bot added the enhancement, documentation, rust, and size/XL labels Mar 26, 2026

bug-ops enabled auto-merge (squash) March 26, 2026 23:29

bug-ops force-pushed the adaptorch-topology-selection branch from 129b262 to 8043d35 March 26, 2026 23:34

bug-ops merged commit 50f2460 into main Mar 26, 2026
25 checks passed

bug-ops deleted the adaptorch-topology-selection branch March 26, 2026 23:41

bug-ops mentioned this pull request Mar 27, 2026

research: GAP graph-based parallel tool scheduling for LlmPlanner/DagScheduler #2172

Closed

bug-ops mentioned this pull request Mar 27, 2026

feat(orchestration): add planner_provider field to OrchestrationConfig #2184

Merged

4 tasks

bug-ops merged 1 commit into main from adaptorch-topology-selection
