Agents: add nested subagent orchestration controls and reduce subagent token waste #14447

Merged
tyler6204 merged 42 commits into main from codex/subagent-improvements on Feb 15, 2026
Conversation

tyler6204 (Member) commented Feb 12, 2026

Summary

This PR adds first-class nested subagent orchestration for both users and models, improves subagent control UX/reliability, and reduces token waste during long-running shell waits.

What changed

  • Added a new model-callable subagents tool with actions:
    • list
    • steer
    • kill
  • Wired subagents into core tools, policy groups, sandbox default allowlists, system prompt guidance, and tool display metadata.
  • Added nested subagent controls:
    • agents.defaults.subagents.maxSpawnDepth (default behavior remains depth 1)
    • agents.defaults.subagents.maxChildrenPerAgent (per-parent active child cap)
    • session spawnDepth persistence + gateway patch/schema support
    • depth resolution from store/session ancestry for flat and nested keys
  • Added depth-aware subagent tool policy:
    • depth-1 orchestrators can manage children when maxSpawnDepth >= 2
    • leaf workers cannot spawn further children
  • Expanded command UX for subagents:
    • /subagents list|stop|kill|log|info|send|steer
    • /kill <id|#|all>
    • /steer <id|#> <message>
    • /tell <id|#> <message>
  • Updated command reservation to protect new built-in command names from plugin override.
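
The nested subagent controls listed above can be read as a simple gate on spawning. The following is a minimal sketch, not this PR's implementation: `checkSpawn`, `SubagentLimits`, the reason strings, and the convention that the main session sits at depth 0 are assumptions made for illustration. Only the two config key names come from the description.

```typescript
// Hypothetical shape mirroring the two config keys named in the PR description.
interface SubagentLimits {
  maxSpawnDepth: number; // agents.defaults.subagents.maxSpawnDepth
  maxChildrenPerAgent: number; // agents.defaults.subagents.maxChildrenPerAgent
}

type SpawnDecision = { allowed: true } | { allowed: false; reason: string };

// Assumption: the main session is depth 0, so its direct children are depth 1.
function checkSpawn(
  callerDepth: number,
  activeChildren: number,
  limits: SubagentLimits,
): SpawnDecision {
  // A child would live one level below the caller.
  if (callerDepth + 1 > limits.maxSpawnDepth) {
    return { allowed: false, reason: "max spawn depth reached" };
  }
  // The cap counts only currently active children of this parent.
  if (activeChildren >= limits.maxChildrenPerAgent) {
    return { allowed: false, reason: "per-parent active child cap reached" };
  }
  return { allowed: true };
}
```

Under these assumptions, a depth-1 subagent is a leaf when maxSpawnDepth is 1 and cannot spawn. Raising the setting to 2 lets depth-1 orchestrators spawn depth-2 workers, which then become the leaves.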

Behavior updates

  • Steer now prioritizes immediate control:
    • always abort/clear queued work and restart with the steer message so guidance takes effect immediately.
  • Steer now preserves subagent conversation context across the restart path:
    • waits briefly for interrupted run settlement (agent.wait) before restart.
    • reuses the existing subagent sessionId so the steer message appends to the same conversation.
    • forces steered follow-up run timeout to 0 (no 10-minute default timeout fallback).
  • Steer restart state is now tracked in the subagent registry:
    • suppresses stale announce output from interrupted runs
    • replaces old run tracking with the new steer run for completion/cascade handling
  • Kill is now immediate and silent from slash-command UX (no confirmation reply), while still aborting and clearing queued session work.
  • Stop/kill now cascade to descendants (sub-sub-agents) in both command and runtime abort paths.
  • Numeric target resolution now follows displayed list order (active first, then recent) for:
    • /subagents stop|kill|steer
    • /kill, /steer, /tell
    • model subagents tool kill|steer
  • Subagent completion injections are now explicitly tagged as internal:
    • announce payloads now start with [System Message].
    • added guardrails so the model treats those blocks as not user-visible unless explicitly relayed.
    • prevents replies like “you can see it above” when only internal context was injected.
  • Nested announce routing now targets parent requester sessions correctly (internal injection for subagent parents, external delivery for user-facing parents).
  • Announces now skip completed requester subagent sessions to avoid wasted NO_REPLY turns.
  • Announce agent calls now use a shorter timeout to reduce queue-drain stall impact.
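
The cascade behavior for stop/kill described above amounts to walking the subagent tree from the target downward. A rough sketch follows, assuming a flat list of run records linked by a parent key. `RunRecord`, its field names, and `collectActiveDescendants` are hypothetical and are not the registry's actual types.

```typescript
// Hypothetical run record; field names are illustrative, not the registry's.
interface RunRecord {
  sessionKey: string;
  parentSessionKey?: string;
  active: boolean;
}

// Collect every active descendant of `rootKey`, walking through ended parents
// so that still-running grandchildren are not missed.
function collectActiveDescendants(runs: RunRecord[], rootKey: string): string[] {
  const childrenByParent = new Map<string, RunRecord[]>();
  for (const run of runs) {
    if (run.parentSessionKey === undefined) continue;
    const siblings = childrenByParent.get(run.parentSessionKey) ?? [];
    siblings.push(run);
    childrenByParent.set(run.parentSessionKey, siblings);
  }

  const toKill: string[] = [];
  const seen = new Set<string>([rootKey]);
  const queue: string[] = [rootKey];
  while (queue.length > 0) {
    const parent = queue.shift() as string;
    for (const child of childrenByParent.get(parent) ?? []) {
      if (seen.has(child.sessionKey)) continue; // guard against cycles
      seen.add(child.sessionKey);
      if (child.active) toKill.push(child.sessionKey);
      queue.push(child.sessionKey); // keep descending even if this child ended
    }
  }
  return toKill;
}
```

The traversal deliberately continues below children that have already ended, which is one way to reach active sub-sub-agents whose parent finished first.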

Reliability fixes (timeout/model/restart)

  • Subagent lane runs now default to timeout 0 in agent command handling when timeout is omitted.
  • Model refs now normalize openai/gpt-5.3-codex* to openai-codex/... so fallback/cooldown behavior keys match provider identity.
  • Gateway restart loop now attempts full process respawn (fresh PID) before falling back to in-process restart.
  • Restart sentinel output now emits a stable concise summary instead of volatile raw JSON.
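
As an illustration of the model ref normalization above, here is a minimal sketch. It assumes a ref has the form `<provider>/<model id>` and that only the provider segment is rewritten while the model id is kept as-is; `normalizeModelRef` is a hypothetical name, and the PR's real matching rules may differ.

```typescript
// Assumption: a model ref is "<provider>/<model id>" and only the provider changes.
function normalizeModelRef(ref: string): string {
  const slash = ref.indexOf("/");
  if (slash < 0) return ref; // no provider prefix: leave untouched
  const provider = ref.slice(0, slash);
  const model = ref.slice(slash + 1);
  // openai/gpt-5.3-codex* is keyed under the openai-codex provider identity.
  if (provider === "openai" && model.startsWith("gpt-5.3-codex")) {
    return `openai-codex/${model}`;
  }
  return ref;
}
```

Keying fallback and cooldown state on the normalized ref is what lets both spellings share one provider identity.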

Registry + test isolation fixes

  • Kill/abort flows now mark subagent runs terminated in registry state so killed runs stop showing as active.
  • Subagent registry persistence now uses an isolated temp state directory in tests when OPENCLAW_STATE_DIR is unset, preventing writes into real user state.

Token reporting fixes/clarity

  • Clarified subagent token display to avoid conflating prompt/cache/context-heavy totals with response I/O.
  • subagents list (tool + slash command) and announce stats now show usage as:
    • tokens X (in A / out B) for actual I/O, and
    • prompt/cache Y when prompt/cache/context cost is larger than I/O.
  • Removed confusing wording like tokens ... io (...).
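
The usage format above could be produced by something like the sketch below. `UsageTotals` and `formatUsage` are hypothetical names, and two details are assumptions: that X is the sum of input and output tokens, and that the two parts are joined with a comma.

```typescript
// Hypothetical usage shape for illustration.
interface UsageTotals {
  input: number;
  output: number;
  promptCache?: number; // prompt/cache/context-heavy total, if reported
}

function formatUsage(usage: UsageTotals): string {
  const io = usage.input + usage.output; // assumption: X = A + B
  const parts = [`tokens ${io} (in ${usage.input} / out ${usage.output})`];
  // Only surface prompt/cache when it is larger than actual I/O.
  if (usage.promptCache !== undefined && usage.promptCache > io) {
    parts.push(`prompt/cache ${usage.promptCache}`);
  }
  return parts.join(", ");
}
```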

Tooling optimization for long waits

  • Added process.poll timeout support (timeout in ms) so a single poll can wait for completion instead of repeated short poll loops.
  • Accepts numeric and string timeout values in tool args.
  • Added prompt guidance nudging models to prefer:
    • exec with sufficient yieldMs, or
    • process(action=poll, timeout=...)
      to reduce unnecessary polling turns and token churn.
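
Accepting both numeric and string timeout values, as described above, might look like the following sketch. `parsePollTimeoutMs` is a hypothetical helper; the rules for invalid or non-positive input (treated here as "no wait") are assumptions, not behavior confirmed by this PR.

```typescript
// Hypothetical helper; the exact rules of the PR's own resolver (clamping,
// defaults) are not shown in the description.
function parsePollTimeoutMs(timeout: number | string | undefined): number {
  if (timeout === undefined) return 0; // no wait requested: poll returns immediately
  const value = typeof timeout === "string" ? Number(timeout.trim()) : timeout;
  // Assumption: junk, NaN, and non-positive values all mean "no wait".
  if (!Number.isFinite(value) || value <= 0) return 0;
  return Math.floor(value);
}
```

With a resolver like this, a single process(action=poll, timeout=...) call can block for the requested number of milliseconds instead of the model issuing many short polls.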

Additional fixes bundled in this branch

  • Preserve explicit session model override behavior in fallback selection (avoid unintended fallback append).
  • Preserve spawnDepth when agent handler rewrites session entries.
  • Treat context overflow as compaction/retry concern instead of model fallback failover.
  • Avoid redundant overflow compaction when attempt-level auto-compaction already ran.
  • Always inject persistent group chat context each turn (not only first-turn intro).

Tests

  • Added/updated tests covering:
    • nested spawn depth and per-parent child limits
    • spawnDepth persistence/validation in sessions patch + gateway paths
    • nested announce routing and steer-restart suppression/replacement
    • cascade stop/kill for descendants
    • subagents tool schema and behavior
    • slash command aliases and behavior (/kill, /steer, /tell)
    • numeric target ordering consistency with displayed indices
    • slash command list usage formatting (tokens ... (in/out), prompt/cache ...)
    • policy/sandbox/system prompt integration for subagents
    • new process poll timeout behavior
    • subagent announce formatting and flow paths
    • steer restart ordering and context continuity assertions (agent.wait before restart, same sessionId, timeout 0)
    • openai codex model/provider normalization in model selection + fallback execution
    • gateway respawn helper and restart sentinel formatting
    • killed-run registry termination state
    • isolated subagent registry test state dir behavior

Validation run

  • pnpm vitest src/auto-reply/reply/commands.test.ts
  • pnpm vitest src/agents/openclaw-tools.sessions.test.ts
  • pnpm vitest src/agents/subagent-registry.steer-restart.test.ts
  • pnpm vitest src/agents/model-selection.test.ts src/agents/model-fallback.test.ts
  • pnpm vitest src/infra/restart-sentinel.test.ts src/infra/process-respawn.test.ts
  • pnpm vitest src/agents/subagent-registry.persistence.test.ts src/auto-reply/reply/abort.test.ts
  • pnpm check

Greptile Overview

Greptile Summary

This PR adds nested sub-agent orchestration controls (model subagents tool + new /kill, /steer, /tell command UX), introduces spawn-depth persistence/limits, and updates announce routing so descendant results can bubble correctly without leaking internal context.

Key codepaths touched include the subagent registry (termination + steer-restart suppression/replacement), announce flow (internal [System Message] tagging + depth-aware routing), model selection/fallback normalization, and the process tool (poll timeout wait support to reduce polling turns).

Confidence Score: 3/5

  • This PR is close to mergeable but has a couple of functional issues that can break intended behavior.
  • Main changes are cohesive and well-tested, but there are two verified logic issues: (1) a schema/handler mismatch in the process tool that will reject valid intended inputs for some clients, and (2) a deferred-announce path that is treated as completed, potentially dropping parent completion announcements permanently.
  • src/agents/subagent-announce.ts, src/agents/bash-tools.process.ts

@openclaw-barnacle bot added the docs, app: web-ui, gateway, cli, commands, agents, size: XL, and maintainer labels on Feb 12, 2026
@github-actions

⚠️ Formal models conformance drift detected

The formal models extracted constants (generated/*) do not match this openclaw PR.

This check is informational (not blocking merges yet).
See the formal-models-conformance-drift artifact for the diff.

If this change is intentional, follow up by updating the formal models repo or regenerating the extracted artifacts there.

@greptile-apps greptile-apps bot left a comment

91 files reviewed, 2 comments

Comment on lines 42 to 46
timeout: Type.Optional(
  Type.Number({
    description: "For poll: wait up to this many milliseconds before returning",
  }),
),
Schema/handler mismatch

processSchema declares timeout as a number (Type.Number), but the execute path accepts timeout?: number | string and resolvePollWaitMs explicitly parses string values. Any tool-arg schema validation will reject string timeouts even though they’re intended to work here, so callers passing "5000" will fail before reaching resolvePollWaitMs.

Comment on lines +444 to +456
let activeChildDescendantRuns = 0;
try {
  const { countActiveDescendantRuns } = await import("./subagent-registry.js");
  activeChildDescendantRuns = Math.max(0, countActiveDescendantRuns(params.childSessionKey));
} catch {
  // Best-effort only; fall back to direct announce behavior when unavailable.
}
if (activeChildDescendantRuns > 0) {
  // The finished run still has active descendant subagents. Defer announcing
  // this run until descendants settle so we avoid posting in-progress updates.
  shouldDeleteChildSession = false;
  return false;
}
Deferred announce marked done

When activeChildDescendantRuns > 0, this path says it’s deferring the announce, but it sets didAnnounce = true and returns true. subagent-registry.finalizeSubagentCleanup treats a true return as “announce handled” and won’t retry, so the parent run’s completion can be permanently skipped once descendants finish.
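
The return-value contract at issue in this comment can be sketched as follows. This is an illustration of the idea only: `announceOrDefer` and `finalizeCleanup` here are hypothetical stand-ins, not the functions in src/agents/subagent-announce.ts or the subagent registry.

```typescript
// Returns true only when the announce was actually delivered.
// Returning false signals "not handled yet", so the caller should retry later.
function announceOrDefer(activeDescendantRuns: number, deliver: () => void): boolean {
  if (activeDescendantRuns > 0) {
    return false; // deferred: descendants are still running
  }
  deliver();
  return true;
}

// Hypothetical cleanup step: keep the run pending until announce reports true.
function finalizeCleanup(
  pending: Set<string>,
  runId: string,
  activeDescendantRuns: number,
  deliver: () => void,
): void {
  if (announceOrDefer(activeDescendantRuns, deliver)) {
    pending.delete(runId);
  }
}
```

Because a deferred announce reports false, the run stays pending and the parent's completion can still be announced once its descendants settle.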

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 882badba0a

Comment on lines 548 to 552
  await callGateway({
    method: "agent",
    params: {
-     sessionKey: params.requesterSessionKey,
+     sessionKey: targetRequesterSessionKey,
      message: triggerMessage,

P1: Wait for final announce completion before treating it as delivered

This announce dispatch no longer uses expectFinal: true, so the gateway call resolves on the initial status: "accepted" ack rather than the final run result. runSubagentAnnounceFlow then marks the announce as successful and cleanup proceeds, which means a later failure in the queued announce run (model/auth/runtime error) is never retried and the user-facing subagent completion update can be silently dropped.

Comment on lines +355 to +357
return {
  requesterSessionKey: spawnedBy || callerSessionKey,
  callerSessionKey,

P1: Restrict leaf subagents to list-only scope

Leaf subagents are remapped to their parent requester scope here, and later subagents actions (kill/steer) run against that scope without a depth/orchestrator gate. In practice, a depth-2 worker can kill or steer sibling runs that it did not spawn, which contradicts the policy intent that leaf subagents access is for visibility/status only and can disrupt unrelated parallel work.
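
One way to express the depth/orchestrator gate this comment asks for is sketched below. It is not the fix that landed in this PR. `allowedSubagentActions` and `assertActionAllowed` are hypothetical, and the sketch assumes the main session is depth 0 and that a caller is a leaf once its depth reaches maxSpawnDepth.

```typescript
type SubagentAction = "list" | "steer" | "kill";

// Assumption: depth 0 is the main session, and a subagent is a leaf once its
// depth reaches maxSpawnDepth (it can no longer spawn children of its own).
function allowedSubagentActions(callerDepth: number, maxSpawnDepth: number): SubagentAction[] {
  const isLeafSubagent = callerDepth > 0 && callerDepth >= maxSpawnDepth;
  return isLeafSubagent ? ["list"] : ["list", "steer", "kill"];
}

// Reject control actions before they reach the parent requester scope.
function assertActionAllowed(
  action: SubagentAction,
  callerDepth: number,
  maxSpawnDepth: number,
): void {
  if (allowedSubagentActions(callerDepth, maxSpawnDepth).indexOf(action) < 0) {
    throw new Error(`subagents ${action} is not permitted at depth ${callerDepth}`);
  }
}
```

Under this gate, a depth-2 worker with maxSpawnDepth set to 2 keeps visibility through list but cannot kill or steer sibling runs.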

@tyler6204 force-pushed the codex/subagent-improvements branch from e84167c to 4e476ae on February 12, 2026 07:15
@tyler6204 force-pushed the codex/subagent-improvements branch from a14a1e5 to aa44a9c on February 15, 2026 06:03
@tyler6204 tyler6204 merged commit b8f66c2 into main Feb 15, 2026
9 checks passed
@tyler6204 tyler6204 deleted the codex/subagent-improvements branch February 15, 2026 06:03
@tyler6204 (Member, Author)

Merged via squash.

Thanks @tyler6204!

GwonHyeok pushed a commit to learners-superpumped/openclaw that referenced this pull request Feb 15, 2026
…t token waste (openclaw#14447)

* Agents: add subagent orchestration controls

* Agents: add subagent orchestration controls (WIP uncommitted changes)

* feat(subagents): add depth-based spawn gating for sub-sub-agents

* feat(subagents): tool policy, registry, and announce chain for nested agents

* feat(subagents): system prompt, docs, changelog for nested sub-agents

* fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback

Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex),
the fallback candidate logic in resolveFallbackCandidates silently appended the
global primary model (opus) as a backstop. On reinjection/steer with a transient
error, the session could fall back to opus which has a smaller context window
and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? []
instead of undefined, preventing the implicit primary backstop.

Bug 2: Active subagents showed 'model n/a' in /subagents list because
resolveModelDisplay only read entry.model/modelProvider (populated after run
completes). Fix: fall back to modelOverride/providerOverride fields which are
populated at spawn time via sessions.patch.

Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could
theoretically escape runEmbeddedPiAgent and be treated as failover candidates
in runWithModelFallback, causing a switch to a model with a smaller context
window. Fix: in runWithModelFallback, detect context overflow errors via
isLikelyContextOverflowError and rethrow them immediately instead of trying the
next model candidate.
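
The Bug 3 fix can be pictured with a simplified, synchronous fallback loop. This is a sketch only: `runWithFallbackSketch` and `looksLikeContextOverflow` are hypothetical stand-ins for runWithModelFallback and isLikelyContextOverflowError, and the matching pattern is an assumption built from the two error strings quoted in this commit message.

```typescript
// Hypothetical detector; the real matching rules are not shown in the source.
function looksLikeContextOverflow(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /context_length_exceeded|prompt too long/i.test(message);
}

// Try each candidate in order, but never fail over on a context overflow.
function runWithFallbackSketch<T>(candidates: string[], run: (model: string) => T): T {
  let lastError: unknown = new Error("no model candidates");
  for (const model of candidates) {
    try {
      return run(model);
    } catch (err) {
      // Overflow is a compaction/retry concern: rethrow instead of switching
      // to a candidate that may have an even smaller context window.
      if (looksLikeContextOverflow(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```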

* fix(subagents): track spawn depth in session store and fix announce routing for nested agents

* Fix compaction status tracking and dedupe overflow compaction triggers

* fix(subagents): enforce depth block via session store and implement cascade kill

* fix: inject group chat context into system prompt

* fix(subagents): always write model to session store at spawn time

* Preserve spawnDepth when agent handler rewrites session entry

* fix(subagents): suppress announce on steer-restart

* fix(subagents): fallback spawned session model to runtime default

* fix(subagents): enforce spawn depth when caller key resolves by sessionId

* feat(subagents): implement active-first ordering for numeric targets and enhance task display

- Added a test to verify that subagents with numeric targets follow an active-first list ordering.
- Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity.
- Enhanced task display in command responses to prevent truncation of long task descriptions.
- Introduced new utility functions for compacting task text and managing subagent run states.
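
A minimal sketch of active-first numeric targeting follows. `ListedRun`, `lastActivityMs`, `orderForDisplay`, and `resolveNumericTarget` are hypothetical names rather than the signature of `resolveSubagentTarget`, and 1-based indexing is an assumption.

```typescript
// Hypothetical listing record for illustration.
interface ListedRun {
  id: string;
  active: boolean;
  lastActivityMs: number; // hypothetical recency field
}

// Displayed order: active runs first, then most recent activity first.
function orderForDisplay(runs: ListedRun[]): ListedRun[] {
  return [...runs].sort((a, b) => {
    if (a.active !== b.active) return a.active ? -1 : 1;
    return b.lastActivityMs - a.lastActivityMs;
  });
}

// Resolve a bare numeric target against the displayed order (assumed 1-based).
function resolveNumericTarget(runs: ListedRun[], target: string): ListedRun | undefined {
  const trimmed = target.trim();
  if (!/^\d+$/.test(trimmed)) return undefined;
  return orderForDisplay(runs)[Number(trimmed) - 1];
}
```

Resolving against the same ordering function that renders the list is what keeps a numeric target consistent with the index the user saw.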

* fix(subagents): show model for active runs via run record fallback

When the spawned model matches the agent's default model, the session
store's override fields are intentionally cleared (isDefault: true).
The model/modelProvider fields are only populated after the run
completes. This left active subagents showing 'model n/a'.

Fix: store the resolved model on SubagentRunRecord at registration
time, and use it as a fallback in both display paths (subagents tool
and /subagents command) when the session store entry has no model info.

Changes:
- SubagentRunRecord: add optional model field
- registerSubagentRun: accept and persist model param
- sessions-spawn-tool: pass resolvedModel to registerSubagentRun
- subagents-tool: pass run record model as fallback to resolveModelDisplay
- commands-subagents: pass run record model as fallback to resolveModelDisplay

* feat(chat): implement session key resolution and reset on sidebar navigation

- Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar.
- Updated the `renderTab` function to handle session key changes when navigating to the chat tab.
- Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation.

* fix: subagent timeout=0 passthrough and fallback prompt duplication

Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default
- sessions-spawn-tool: default to undefined (not 0) when neither timeout param
  is provided; use != null check so explicit 0 passes through to gateway
- agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles
  0 → MAX_SAFE_TIMEOUT_MS)

Bug 2: model fallback no longer re-injects the original prompt as a duplicate
- agent.ts: track fallback attempt index; on retries use a short continuation
  message instead of the full original prompt since the session file already
  contains it from the first attempt
- Also skip re-sending images on fallback retries (already in session)

* feat(subagents): truncate long task descriptions in subagents command output

- Introduced a new utility function to format task previews, limiting their length to improve readability.
- Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately.
- Adjusted related tests to verify that long task descriptions are now truncated in the output.

* refactor(subagents): update subagent registry path resolution and improve command output formatting

- Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically.
- Enhanced the formatting of command output for active and recent subagents, adding separators for better readability.
- Updated related tests to reflect changes in command output structure.

* fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted

The previous fix (75a791106) correctly handled the case where
runTimeoutSeconds was explicitly set to 0 ("no timeout"). However,
when models omit the parameter entirely (which is common since the
schema marks it as optional), runTimeoutSeconds resolved to undefined.

undefined flowed through the chain as:
  sessions_spawn → timeout: undefined (since undefined != null is false)
  → gateway agent handler → agentCommand opts.timeout: undefined
  → resolveAgentTimeoutMs({ overrideSeconds: undefined })
  → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes)

This caused subagents to be killed at exactly 10 minutes even though
the user's intent (via TOOLS.md) was for subagents to run without a
timeout.

Fix: default runTimeoutSeconds to 0 (no timeout) when neither
runTimeoutSeconds nor timeoutSeconds is provided by the caller.
Subagent spawns are long-running by design and should not inherit the
600s agent-command default timeout.
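
The timeout chain above can be condensed into the sketch below. The constant names and the 600s default come from this commit message; the value used for MAX_SAFE_TIMEOUT_MS and both function bodies are assumptions for illustration, not the actual code of resolveAgentTimeoutMs or sessions_spawn.

```typescript
const DEFAULT_AGENT_TIMEOUT_SECONDS = 600; // 10 minutes, per the commit message
const MAX_SAFE_TIMEOUT_MS = 2_147_483_647; // assumption: real value not shown in the source

// Gateway side: 0 means "no timeout"; undefined falls back to the 600s default.
function resolveTimeoutMsSketch(overrideSeconds?: number): number {
  if (overrideSeconds === 0) return MAX_SAFE_TIMEOUT_MS;
  return (overrideSeconds ?? DEFAULT_AGENT_TIMEOUT_SECONDS) * 1000;
}

// Spawn side after the fix: default to 0 when neither parameter is provided,
// so an omitted value no longer reaches the gateway as undefined.
function resolveSpawnTimeoutSeconds(runTimeoutSeconds?: number, timeoutSeconds?: number): number {
  return runTimeoutSeconds ?? timeoutSeconds ?? 0;
}
```

Because `??` only skips null and undefined, an explicit 0 still passes through unchanged, which matches the earlier fix for the explicit "no timeout" case.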

* fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default)

* fix: thread timeout override through getReplyFromConfig dispatch path

getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override,
always falling back to the config default (600s). Add timeoutOverrideSeconds
to GetReplyOptions and pass it through as overrideSeconds so callers of the
dispatch chain can specify a custom timeout (0 = no timeout).

This complements the existing timeout threading in agentCommand and the
cron isolated-agent runner, which already pass overrideSeconds correctly.

* feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling

- Added normalization for OpenAI Codex model references, rewriting "openai/gpt-5.3-codex*" refs to the "openai-codex" provider before execution.
- Updated the `resolveFallbackCandidates` function to utilize the new normalization logic.
- Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms.
- Introduced a new test case to ensure that the normalization process works as expected for various input formats.

* feat(tests): add unit tests for steer failure behavior in openclaw-tools

- Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails.
- Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected.
- Enhanced the subagent registry with a new function to clear steer restart suppression.
- Updated related components to support the new test scenarios.

* fix(subagents): replace stop command with kill in slash commands and documentation

- Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs.
- Modified related documentation to reflect the change in command usage.
- Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling.
- Enhanced tests to ensure correct behavior of the updated commands and their interactions.

* feat(tests): add unit tests for readLatestAssistantReply function

- Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios.
- Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text.
- Mocked the gateway call to simulate different message histories for comprehensive testing.

* feat(tests): enhance subagent kill-all cascade tests and announce formatting

- Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents.
- Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content.
- Improved the handling of long findings and stats in the announce formatting logic to ensure concise output.
- Refactored related functions to enhance clarity and maintainability in the subagent registry and tools.

* refactor(subagent): update announce formatting and remove unused constants

- Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests.
- Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic.
- Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs.
- Cleaned up unused imports in the commands-subagents file to enhance code clarity.

* feat(tests): enhance billing error handling in user-facing text

- Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context.
- Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages.
- Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output.
- Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements.

* feat(subagent): enhance workflow guidance and auto-announcement clarity

- Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates.
- Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow.
- Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts.

* fix(cron): avoid announcing interim subagent spawn acks

* chore: clean post-rebase imports

* fix(cron): fall back to child replies when parent stays interim

* fix(subagents): make active-run guidance advisory

* fix(subagents): update announce flow to handle active descendants and enhance test coverage

- Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting.
- Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents.
- Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process.

* fix(subagents): enhance announce flow and formatting for user updates

- Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context.
- Refactored the announcement logic to improve clarity and ensure internal context remains private.
- Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates.
- Introduced a new function to build reply instructions based on session context, improving the overall announcement process.

* fix: resolve prep blockers and changelog placement (openclaw#14447) (thanks @tyler6204)

* fix: restore cron delivery-plan import after rebase (openclaw#14447) (thanks @tyler6204)

* fix: resolve test failures from rebase conflicts (openclaw#14447) (thanks @tyler6204)

* fix: apply formatting after rebase (openclaw#14447) (thanks @tyler6204)
Benkei-dev pushed a commit to Benkei-dev/openclaw that referenced this pull request Feb 15, 2026
…t token waste (openclaw#14447)

* Agents: add subagent orchestration controls

* Agents: add subagent orchestration controls (WIP uncommitted changes)

* feat(subagents): add depth-based spawn gating for sub-sub-agents

* feat(subagents): tool policy, registry, and announce chain for nested agents

* feat(subagents): system prompt, docs, changelog for nested sub-agents

* fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback

Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex),
the fallback candidate logic in resolveFallbackCandidates silently appended the
global primary model (opus) as a backstop. On reinjection/steer with a transient
error, the session could fall back to opus which has a smaller context window
and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? []
instead of undefined, preventing the implicit primary backstop.

Bug 2: Active subagents showed 'model n/a' in /subagents list because
resolveModelDisplay only read entry.model/modelProvider (populated after run
completes). Fix: fall back to modelOverride/providerOverride fields which are
populated at spawn time via sessions.patch.

Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could
theoretically escape runEmbeddedPiAgent and be treated as failover candidates
in runWithModelFallback, causing a switch to a model with a smaller context
window. Fix: in runWithModelFallback, detect context overflow errors via
isLikelyContextOverflowError and rethrow them immediately instead of trying the
next model candidate.
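
The fixes for Bug 1 and Bug 3 can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the input shape, the `*Sketch` names, the overflow regex, and the synchronous runner are all assumptions made so the example is self-contained (the real runner is presumably async).

```typescript
// Hypothetical, simplified shapes: the real signatures are not shown in this PR.
type ModelRef = string;

interface FallbackInputSketch {
  primary: ModelRef; // model the run starts with
  globalPrimary: ModelRef; // configured global primary (the implicit backstop)
  configuredFallbacks: ModelRef[]; // fallbacks from config
  storedModelOverride?: ModelRef; // explicit per-session override, if any
  fallbacksOverride?: ModelRef[]; // explicit fallback list, if any
}

function resolveFallbackCandidatesSketch(input: FallbackInputSketch): ModelRef[] {
  // Bug 1: with a stored override, an absent list becomes [] rather than
  // undefined, so the implicit global-primary backstop is never appended.
  const fallbacks = input.storedModelOverride
    ? (input.fallbacksOverride ?? [])
    : input.fallbacksOverride;
  const candidates: ModelRef[] = [input.primary];
  if (fallbacks === undefined) {
    candidates.push(...input.configuredFallbacks, input.globalPrimary);
  } else {
    candidates.push(...fallbacks);
  }
  // Drop duplicates while keeping first-seen order.
  return candidates.filter((model, index) => candidates.indexOf(model) === index);
}

// Assumed heuristic: match the two error strings named in the commit message.
function isLikelyContextOverflowErrorSketch(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return /context_length_exceeded|prompt (is )?too long/i.test(message);
}

function runWithModelFallbackSketch<T>(
  candidates: ModelRef[],
  run: (model: ModelRef) => T,
): T {
  let lastError: unknown = new Error("no model candidates");
  for (const model of candidates) {
    try {
      return run(model);
    } catch (err) {
      // Bug 3: a context overflow is not a failover reason; rethrow at once
      // instead of trying a model that may have a smaller window.
      if (isLikelyContextOverflowErrorSketch(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

With a stored override and no explicit fallback list, the candidate list in this sketch contains only the overridden model, so a transient error surfaces instead of silently switching models.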

* fix(subagents): track spawn depth in session store and fix announce routing for nested agents

* Fix compaction status tracking and dedupe overflow compaction triggers

* fix(subagents): enforce depth block via session store and implement cascade kill

* fix: inject group chat context into system prompt

* fix(subagents): always write model to session store at spawn time

* Preserve spawnDepth when agent handler rewrites session entry

* fix(subagents): suppress announce on steer-restart

* fix(subagents): fallback spawned session model to runtime default

* fix(subagents): enforce spawn depth when caller key resolves by sessionId

* feat(subagents): implement active-first ordering for numeric targets and enhance task display

- Added a test to verify that subagents with numeric targets follow an active-first list ordering.
- Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity.
- Enhanced task display in command responses to prevent truncation of long task descriptions.
- Introduced new utility functions for compacting task text and managing subagent run states.
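
A minimal sketch of active-first numeric targeting, assuming a simplified run shape. `RunSketch`, `sortActiveFirstSketch`, and `resolveNumericTargetSketch` are hypothetical names; the real `resolveSubagentTarget` may order and parse targets differently.

```typescript
// Hypothetical run shape: a run is "active" while endedAt is unset.
interface RunSketch {
  runId: string;
  startedAt: number; // epoch ms
  endedAt?: number; // epoch ms, undefined while running
}

// Active runs come first; within each group, most recent activity first.
function sortActiveFirstSketch(runs: RunSketch[]): RunSketch[] {
  return runs.slice().sort((a, b) => {
    const aActive = a.endedAt === undefined;
    const bActive = b.endedAt === undefined;
    if (aActive !== bActive) return aActive ? -1 : 1;
    const aRecent = a.endedAt ?? a.startedAt;
    const bRecent = b.endedAt ?? b.startedAt;
    return bRecent - aRecent;
  });
}

// Resolve "2" or "#2" against the same ordering the list output uses,
// so the number a user sees is the number they can target.
function resolveNumericTargetSketch(
  runs: RunSketch[],
  token: string,
): RunSketch | undefined {
  const match = /^#?(\d+)$/.exec(token.trim());
  if (!match) return undefined;
  const index = Number(match[1]);
  if (index < 1) return undefined;
  return sortActiveFirstSketch(runs)[index - 1];
}
```

Sorting a copy keeps the registry order untouched, which matters if other code relies on insertion order.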

* fix(subagents): show model for active runs via run record fallback

When the spawned model matches the agent's default model, the session
store's override fields are intentionally cleared (isDefault: true).
The model/modelProvider fields are only populated after the run
completes. This left active subagents showing 'model n/a'.

Fix: store the resolved model on SubagentRunRecord at registration
time, and use it as a fallback in both display paths (subagents tool
and /subagents command) when the session store entry has no model info.

Changes:
- SubagentRunRecord: add optional model field
- registerSubagentRun: accept and persist model param
- sessions-spawn-tool: pass resolvedModel to registerSubagentRun
- subagents-tool: pass run record model as fallback to resolveModelDisplay
- commands-subagents: pass run record model as fallback to resolveModelDisplay
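
The display fallback chain described in this commit and in Bug 2 above could look roughly like this. The entry shape and the `provider/model` output format are assumptions for illustration; only the field names `model`, `modelProvider`, `modelOverride`, and `providerOverride` come from the commit messages.

```typescript
// Hypothetical session-store entry with the fields named in the commits.
interface SessionEntrySketch {
  model?: string; // populated after the run completes
  modelProvider?: string;
  modelOverride?: string; // populated at spawn time, cleared when default
  providerOverride?: string;
}

function resolveModelDisplaySketch(
  entry: SessionEntrySketch | undefined,
  runRecordModel?: string, // resolved model stored on the run record
): string {
  // 1. Completed-run fields, 2. spawn-time override fields.
  const model = entry?.model ?? entry?.modelOverride;
  const provider = entry?.model ? entry.modelProvider : entry?.providerOverride;
  if (model) return provider ? `${provider}/${model}` : model;
  // 3. Run record fallback, for active runs on the agent's default model.
  return runRecordModel ?? "model n/a";
}
```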

* feat(chat): implement session key resolution and reset on sidebar navigation

- Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar.
- Updated the `renderTab` function to handle session key changes when navigating to the chat tab.
- Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation.

* fix: subagent timeout=0 passthrough and fallback prompt duplication

Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default
- sessions-spawn-tool: default to undefined (not 0) when neither timeout param
  is provided; use != null check so explicit 0 passes through to gateway
- agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles
  0 → MAX_SAFE_TIMEOUT_MS)

Bug 2: model fallback no longer re-injects the original prompt as a duplicate
- agent.ts: track fallback attempt index; on retries use a short continuation
  message instead of the full original prompt since the session file already
  contains it from the first attempt
- Also skip re-sending images on fallback retries (already in session)
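
A sketch of the Bug 2 fix, assuming the attempt index is available when the model input is built. The `AttemptInputSketch` shape, the function name, and the continuation text are hypothetical; the commit only says that retries use "a short continuation message" and skip images.

```typescript
// Hypothetical input shape for one model attempt.
interface AttemptInputSketch {
  prompt: string;
  images: string[];
}

// Assumed wording; the actual continuation message is not shown in this PR.
const CONTINUATION_PROMPT_SKETCH =
  "Continue where you left off. The previous model attempt failed.";

function buildAttemptInputSketch(
  original: AttemptInputSketch,
  attemptIndex: number,
): AttemptInputSketch {
  // First attempt: send the full prompt and images.
  if (attemptIndex === 0) return original;
  // Retries: the session file already holds the prompt and images,
  // so send only a short continuation message.
  return { prompt: CONTINUATION_PROMPT_SKETCH, images: [] };
}
```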

* feat(subagents): truncate long task descriptions in subagents command output

- Introduced a new utility function to format task previews, limiting their length to improve readability.
- Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately.
- Adjusted related tests to verify that long task descriptions are now truncated in the output.
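
One way such a preview formatter might work is shown below. The function name and the 110-character default are assumptions; the commit does not state the actual limit.

```typescript
// Collapse whitespace, then cap the preview at maxChars including the ellipsis.
function formatTaskPreviewSketch(task: string, maxChars: number = 110): string {
  const compact = task.replace(/\s+/g, " ").trim();
  if (compact.length <= maxChars) return compact;
  const head = compact.slice(0, maxChars - 1).replace(/\s+$/, "");
  return head + "…";
}
```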

* refactor(subagents): update subagent registry path resolution and improve command output formatting

- Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically.
- Enhanced the formatting of command output for active and recent subagents, adding separators for better readability.
- Updated related tests to reflect changes in command output structure.

* fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted

The previous fix (75a791106) correctly handled the case where
runTimeoutSeconds was explicitly set to 0 ("no timeout"). However,
when models omit the parameter entirely (which is common since the
schema marks it as optional), runTimeoutSeconds resolved to undefined.

undefined flowed through the chain as:
  sessions_spawn → timeout: undefined (since undefined != null is false)
  → gateway agent handler → agentCommand opts.timeout: undefined
  → resolveAgentTimeoutMs({ overrideSeconds: undefined })
  → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes)

This caused subagents to be killed at exactly 10 minutes even though
the user's intent (via TOOLS.md) was for subagents to run without a
timeout.

Fix: default runTimeoutSeconds to 0 (no timeout) when neither
runTimeoutSeconds nor timeoutSeconds is provided by the caller.
Subagent spawns are long-running by design and should not inherit the
600s agent-command default timeout.
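
The resulting timeout chain can be sketched as below. `DEFAULT_AGENT_TIMEOUT_SECONDS` and the 0 → `MAX_SAFE_TIMEOUT_MS` mapping are taken from the commit messages; the concrete `MAX_SAFE_TIMEOUT_MS` value and the function bodies are assumptions. The legacy `timeoutSeconds` parameter appears here because this commit mentions it.

```typescript
const DEFAULT_AGENT_TIMEOUT_SECONDS = 600; // 10 minutes, per the commit message
// Assumed value: any number below the 32-bit setTimeout limit (2147483647 ms).
const MAX_SAFE_TIMEOUT_MS = 2147000000;

// Spawn side: an omitted timeout now resolves to 0, meaning "no timeout".
function resolveSpawnTimeoutSecondsSketch(params: {
  runTimeoutSeconds?: number;
  timeoutSeconds?: number; // legacy alias
}): number {
  return params.runTimeoutSeconds ?? params.timeoutSeconds ?? 0;
}

// Agent side: undefined falls back to the 600s default, 0 disables the limit.
// Negative values are not handled in this sketch.
function resolveAgentTimeoutMsSketch(opts: { overrideSeconds?: number }): number {
  const seconds = opts.overrideSeconds;
  if (seconds === undefined) return DEFAULT_AGENT_TIMEOUT_SECONDS * 1000;
  if (seconds === 0) return MAX_SAFE_TIMEOUT_MS;
  return Math.min(seconds * 1000, MAX_SAFE_TIMEOUT_MS);
}
```

The key point is that `??` (or a `!= null` check) must be used rather than a truthiness test, otherwise an explicit `0` would be treated the same as a missing value.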

* fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default)

* fix: thread timeout override through getReplyFromConfig dispatch path

getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override,
always falling back to the config default (600s). Add timeoutOverrideSeconds
to GetReplyOptions and pass it through as overrideSeconds so callers of the
dispatch chain can specify a custom timeout (0 = no timeout).

This complements the existing timeout threading in agentCommand and the
cron isolated-agent runner, which already pass overrideSeconds correctly.

* feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling

- Added normalization for OpenAI Codex model references, specifically converting "gpt-5.3-codex" to "openai-codex" before execution.
- Updated the `resolveFallbackCandidates` function to utilize the new normalization logic.
- Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms.
- Introduced a new test case to ensure that the normalization process works as expected for various input formats.

* feat(tests): add unit tests for steer failure behavior in openclaw-tools

- Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails.
- Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected.
- Enhanced the subagent registry with a new function to clear steer restart suppression.
- Updated related components to support the new test scenarios.
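
The behavior under test might be sketched like this: announce is suppressed while a steer restart is in flight, and the suppression is cleared if dispatching the replacement run fails. The record shape, the `"steer-restart"` reason value, and both function names are hypothetical, and the dispatch is synchronous here only to keep the example short.

```typescript
// Hypothetical run record carrying an optional suppression reason.
interface SteerRunRecordSketch {
  runId: string;
  suppressAnnounceReason?: "steer-restart";
}

function clearSteerRestartSuppressionSketch(record: SteerRunRecordSketch): void {
  delete record.suppressAnnounceReason;
}

function steerWithSuppressionSketch(
  record: SteerRunRecordSketch,
  dispatchReplacement: () => void,
): void {
  // Suppress the announce for the interrupted run during the restart.
  record.suppressAnnounceReason = "steer-restart";
  try {
    dispatchReplacement();
  } catch (err) {
    // Dispatch failed: restore normal announce behavior before rethrowing,
    // otherwise the original run's result would never be announced.
    clearSteerRestartSuppressionSketch(record);
    throw err;
  }
}
```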

* fix(subagents): replace stop command with kill in slash commands and documentation

- Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs.
- Modified related documentation to reflect the change in command usage.
- Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling.
- Enhanced tests to ensure correct behavior of the updated commands and their interactions.

* feat(tests): add unit tests for readLatestAssistantReply function

- Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios.
- Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text.
- Mocked the gateway call to simulate different message histories for comprehensive testing.

* feat(tests): enhance subagent kill-all cascade tests and announce formatting

- Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents.
- Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content.
- Improved the handling of long findings and stats in the announce formatting logic to ensure concise output.
- Refactored related functions to enhance clarity and maintainability in the subagent registry and tools.
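
The cascade described in the first bullet can be sketched as a tree walk that keeps descending through ended parents. `RunNodeSketch` and `collectActiveDescendantsSketch` are hypothetical; the real registry shape is not shown in this PR.

```typescript
// Hypothetical registry row: each run knows its parent session key.
interface RunNodeSketch {
  sessionKey: string;
  parentKey?: string;
  ended: boolean;
}

// Return the session keys of every still-active descendant of rootKey.
function collectActiveDescendantsSketch(
  runs: RunNodeSketch[],
  rootKey: string,
): string[] {
  const childrenByParent = new Map<string, RunNodeSketch[]>();
  for (const run of runs) {
    if (run.parentKey === undefined) continue;
    const siblings = childrenByParent.get(run.parentKey) ?? [];
    siblings.push(run);
    childrenByParent.set(run.parentKey, siblings);
  }

  const toKill: string[] = [];
  const seen = new Set<string>();
  const stack: string[] = [rootKey];
  while (stack.length > 0) {
    const key = stack.pop() as string;
    if (seen.has(key)) continue; // guard against malformed cycles
    seen.add(key);
    for (const child of childrenByParent.get(key) ?? []) {
      if (!child.ended) toKill.push(child.sessionKey);
      // Keep walking even when the child has ended: its own children
      // may still be running.
      stack.push(child.sessionKey);
    }
  }
  return toKill;
}
```

Stopping the walk at an ended parent would leave orphaned grandchildren running, which is the case the new test covers.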

* refactor(subagent): update announce formatting and remove unused constants

- Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests.
- Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic.
- Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs.
- Cleaned up unused imports in the commands-subagents file to enhance code clarity.

* feat(tests): enhance billing error handling in user-facing text

- Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context.
- Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages.
- Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output.
- Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements.

* feat(subagent): enhance workflow guidance and auto-announcement clarity

- Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates.
- Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow.
- Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts.

* fix(cron): avoid announcing interim subagent spawn acks

* chore: clean post-rebase imports

* fix(cron): fall back to child replies when parent stays interim

* fix(subagents): make active-run guidance advisory

* fix(subagents): update announce flow to handle active descendants and enhance test coverage

- Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting.
- Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents.
- Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process.

* fix(subagents): enhance announce flow and formatting for user updates

- Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context.
- Refactored the announcement logic to improve clarity and ensure internal context remains private.
- Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates.
- Introduced a new function to build reply instructions based on session context, improving the overall announcement process.

* fix: resolve prep blockers and changelog placement (openclaw#14447) (thanks @tyler6204)

* fix: restore cron delivery-plan import after rebase (openclaw#14447) (thanks @tyler6204)

* fix: resolve test failures from rebase conflicts (openclaw#14447) (thanks @tyler6204)

* fix: apply formatting after rebase (openclaw#14447) (thanks @tyler6204)

Labels

- agents: Agent runtime and tooling
- app: web-ui
- cli: CLI command changes
- commands: Command implementations
- docs: Improvements or additions to documentation
- gateway: Gateway runtime
- maintainer: Maintainer-authored PR
- size: XL
