
fix(cron): eliminate double-announce and replace delivery polling with push-based flow#39089

Merged
tyler6204 merged 2 commits into main from fix/cron-double-announce
Mar 7, 2026

Conversation

Member

@tyler6204 tyler6204 commented Mar 7, 2026

Summary

Two related cron delivery bugs fixed in one commit:

1. Double-announce guard (original fix)

Early-return paths inside deliverViaAnnounce returned without setting deliveryAttempted = true. The heartbeat timer would then see deliveryAttempted = false and fire a redundant enqueueSystemEvent fallback, delivering the same message twice.

Fix: Both early-return paths (active-subagent suppression and stale-interim suppression) now set deliveryAttempted = true before returning.
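To make the interaction concrete, here is a minimal sketch of how such a fallback guard behaves. The names `CronDeliveryResult` and `shouldEnqueueFallback` are illustrative assumptions, not the repository's actual types or function; only the two flags come from the discussion in this PR.

```typescript
// Hypothetical shape of the result the delivery path hands back to the timer.
interface CronDeliveryResult {
  delivered?: boolean;         // an outbound message was sent successfully
  deliveryAttempted?: boolean; // the delivery path ran and made a decision
}

// Sketch of a fallback guard: enqueue a summary only when the delivery path
// neither delivered nor reported that it handled the run.
function shouldEnqueueFallback(result: CronDeliveryResult): boolean {
  return result.delivered !== true && result.deliveryAttempted !== true;
}

// Before the fix: an early return left both flags unset, so the fallback fired.
const beforeFix: CronDeliveryResult = {};
// After the fix: the suppression path marks the run as handled.
const afterFix: CronDeliveryResult = { deliveryAttempted: true };

console.log(shouldEnqueueFallback(beforeFix)); // true
console.log(shouldEnqueueFallback(afterFix));  // false
```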

2. Replace 500ms polling loop with push-based agent.wait

waitForDescendantSubagentSummary in subagent-followup.ts previously polled every 500ms via countActiveDescendantRuns until all descendants finished — wasting tokens and wall-clock time on each cron run that spawns subagents.

Fix: The polling loop is replaced with concurrent agent.wait RPC calls (one per active descendant run) via Promise.allSettled. Iterations continue only if new descendants appear (e.g. spawned by first-level subagents). Only a short bounded grace period (5s, 200ms poll) remains to capture the cron agent's post-orchestration synthesis after descendants settle.

Key improvements:

  • No more busy-polling; waits are event-driven via gateway RPC
  • Handles nested descendants (new runs that appear mid-wait get a second pass)
  • Gateway errors are absorbed gracefully via allSettled
  • FAST_TEST_MODE shrinks the grace period to 50ms in tests
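The bounded grace period described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in subagent-followup.ts: `pollWithinGrace` and its `read` callback are hypothetical names, and the defaults simply mirror the 5s / 200ms figures quoted in the summary.

```typescript
// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll `read` every `intervalMs` until it yields a value or `graceMs` elapses.
// `read` stands in for whatever fetches the cron agent's latest synthesis.
async function pollWithinGrace<T>(
  read: () => Promise<T | undefined>,
  graceMs = 5_000,
  intervalMs = 200,
): Promise<T | undefined> {
  const deadline = Date.now() + graceMs;
  while (Date.now() < deadline) {
    const value = await read();
    if (value !== undefined) return value;
    // Never sleep past the deadline.
    await sleep(Math.min(intervalMs, Math.max(0, deadline - Date.now())));
  }
  return undefined; // grace period expired without a result
}
```

Because the loop is bounded by `graceMs`, the worst case adds a fixed, small delay rather than polling for the full run timeout. A test-only mode could pass a much smaller `graceMs` to keep suites fast.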

Tests

  • New waitForDescendantSubagentSummary test suite (8 tests) covering push-based waits, error handling, multi-descendant cases, NO_REPLY, and early-exit paths
  • All existing cron tests pass (57 test files, 464 tests)

Thanks @tyler6204

@openclaw-barnacle openclaw-barnacle bot added the "size: M" and "maintainer" (Maintainer-authored PR) labels Mar 7, 2026

aisle-research-bot bot commented Mar 7, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

| # | Severity | Title |
|---|----------|-------|
| 1 | 🟡 Medium | Tight-loop gateway fanout in descendant subagent wait can amplify failures (DoS) |
| 2 | 🟡 Medium | Cron delivery can be silently suppressed by misusing deliveryAttempted flag (notification DoS) |

1. 🟡 Tight-loop gateway fanout in descendant subagent wait can amplify failures (DoS)

Severity: Medium
CWE: CWE-400
Location: src/cron/isolated-agent/subagent-followup.ts:144-161

Description

The new push-based descendant waiting loop in waitForDescendantSubagentSummary() can spin rapidly and repeatedly fan out RPC calls when the gateway call fails fast.

Key points:

  • pendingRunIds is refreshed from listDescendantRunsForRequester() each iteration.
  • The loop performs Promise.allSettled([...pendingRunIds].map(runId => callGateway({method:"agent.wait", ...}).catch(() => undefined))).
  • If callGateway() fails quickly (e.g., gateway down / connection refused / close during handshake), the await Promise.allSettled(...) resolves quickly.
  • If the registry still reports those runs as active (endedAt unset), pendingRunIds remains non-empty and the while loop immediately repeats without any delay/backoff, causing:
    • high CPU usage in this cron worker
    • repeated WebSocket connection attempts and request amplification against the gateway
    • potentially large concurrent fanout per iteration when many descendants are active

This is an availability risk (internal DoS) that can be triggered by common infra failure modes (gateway unavailable) and becomes worse with higher configured subagent limits or multiple concurrent cron sessions.

Vulnerable code:

while (pendingRunIds.size > 0 && Date.now() < deadline) {
  const remainingMs = Math.max(1, deadline - Date.now());
  await Promise.allSettled(
    [...pendingRunIds].map((runId) =>
      callGateway({
        method: "agent.wait",
        params: { runId, timeoutMs: remainingMs },
        timeoutMs: remainingMs + 2_000,
      }).catch(() => undefined),
    ),
  );

  pendingRunIds = new Set<string>(getActiveRuns().map((e) => e.runId));
}

Recommendation

Add bounded retry/backoff and concurrency limiting for the push-wait rounds, and avoid immediate re-tries when the gateway is unhealthy.

Recommended changes:

  1. Backoff/sleep per iteration (especially after any failures) to prevent tight loops.
  2. Cap concurrency when waiting on many runIds (batching or p-limit).
  3. Stop retrying on repeated gateway failures and fall back to a slower polling strategy (or break early and rely on grace-period polling / fallback reply).

Example mitigation (sketch):

import pLimit from "p-limit";

const limit = pLimit(10); // or config-driven
let backoffMs = 200;
let consecutiveGatewayFailures = 0;

while (pendingRunIds.size > 0 && Date.now() < deadline) {
  const remainingMs = Math.max(1, deadline - Date.now());

  const results = await Promise.allSettled(
    [...pendingRunIds].map((runId) =>
      limit(() =>
        callGateway({
          method: "agent.wait",
          params: { runId, timeoutMs: remainingMs },
          timeoutMs: remainingMs + 2_000,
        })
      )
    )
  );

  const anyRejected = results.some((r) => r.status === "rejected");
  if (anyRejected) {
    consecutiveGatewayFailures += 1;
    await new Promise((r) => setTimeout(r, backoffMs));
    backoffMs = Math.min(backoffMs * 2, 5_000);
    if (consecutiveGatewayFailures >= 3) {
      break; // or switch to slower poll mode
    }
  } else {
    consecutiveGatewayFailures = 0;
    backoffMs = 200;
  }

  pendingRunIds = new Set(getActiveRuns().map((e) => e.runId));
}

Also consider enforcing a maximum descendant run count to wait on per request (even if spawning is limited elsewhere) to prevent registry corruption/misconfiguration from causing unbounded fanout.
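As a rough illustration of a run-count cap combined with dependency-free batching (an alternative to p-limit), consider the sketch below. `chunk`, `waitInBatches`, `waitOne`, and the default limits are all hypothetical; `waitOne` stands in for a single agent.wait gateway call.

```typescript
// Split `items` into consecutive chunks of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be >= 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Wait on at most `maxRuns` run IDs, `batchSize` at a time.
// Rejections are captured by allSettled so one failure does not abort a batch.
async function waitInBatches(
  runIds: readonly string[],
  waitOne: (runId: string) => Promise<unknown>,
  maxRuns = 50,
  batchSize = 10,
): Promise<PromiseSettledResult<unknown>[]> {
  const results: PromiseSettledResult<unknown>[] = [];
  for (const batch of chunk(runIds.slice(0, maxRuns), batchSize)) {
    const settled = await Promise.allSettled(batch.map((id) => waitOne(id)));
    results.push(...settled);
  }
  return results;
}
```

One trade-off to keep in mind: batches run one after another, so a long wait in an early batch delays later batches. A limiter such as p-limit keeps the pipeline full instead, at the cost of an extra dependency.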


2. 🟡 Cron delivery can be silently suppressed by misusing deliveryAttempted flag (notification DoS)

Severity: Medium
CWE: CWE-440
Location: src/cron/isolated-agent/delivery-dispatch.ts:319-349

Description

In dispatchCronDelivery(), two early-return suppression paths now set deliveryAttempted = true even though no outbound delivery was attempted.

Because deliveryAttempted is used by the cron timer’s fallback guard (shouldEnqueueCronMainSummary) to decide whether to enqueue a system-event summary, this change can suppress the only remaining delivery mechanism in some cases.

Impact:

  • When activeSubagentRuns > 0 after waiting, dispatch returns without sending any message, but deliveryAttempted=true prevents the timer from enqueueing enqueueSystemEvent fallback.
  • When the "stale interim" suppression triggers (descendants existed but no synthesized update arrived), dispatch returns without delivering, but again marks deliveryAttempted=true and suppresses the timer fallback.
  • If a descendant run is buggy/malicious and never ends (or if descendant completion announcements cannot reach the user due to missing/invalid delivery origin), the user may receive no completion signal and no fallback summary.

Vulnerable logic (suppression paths marked as “attempted”):

if (activeSubagentRuns > 0) {
  deliveryAttempted = true;
  return params.withRunSession({ status: "ok", summary, outputText, deliveryAttempted, ... });
}

if (hadDescendants && /* stale interim */) {
  deliveryAttempted = true;
  return params.withRunSession({ status: "ok", summary, outputText, deliveryAttempted, ... });
}

Why this is availability-relevant:

  • The timer fallback only triggers when deliveryAttempted !== true.
  • Setting deliveryAttempted=true here changes the meaning from “an outbound send was attempted” to “we intentionally suppressed delivery”, which downstream logic does not distinguish.

This enables a notification DoS where cron runs can end with no user-visible message if descendants keep activeSubagentRuns > 0 (or keep the stale-interim condition true) long enough.

Recommendation

Treat “suppressed” as a distinct state from “attempted delivery”, so the timer can apply an appropriate fallback policy.

Recommended fixes (pick one):

  1. Introduce a separate flag (e.g. deliverySuppressed / suppressionReason) and keep deliveryAttempted reserved for true outbound attempts.
// suppression path
const deliverySuppressed = true;
return params.withRunSession({
  status: "ok",
  summary,
  outputText,
  deliveryAttempted: false,
  deliverySuppressed,
  ...params.telemetry,
});

Then update the timer fallback logic to consider deliverySuppressed (e.g., enqueue a non-user-facing internal event, or enqueue a user-facing “still running” / “timed out waiting for descendants” message after a cap).

  2. Add a bounded suppression timeout: if descendants remain active beyond a maximum, enqueue a fallback system event (or schedule a follow-up dispatch) so the user receives some completion/timeout signal.

  3. If deliveryAttempted must remain true to prevent double-announces, add an explicit mechanism that guarantees a later delivery attempt (e.g., on descendant settle) and/or emits an internal alert when suppression happens for too long.
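One way the timer side could combine a distinct suppressed state with a bounded suppression timeout is sketched below. Everything here is hypothetical: `DeliveryState`, `decideFallback`, `suppressedForMs`, and the 60s cap are illustrative names and values, not part of the codebase.

```typescript
type FallbackAction = "none" | "enqueue-summary" | "enqueue-timeout-notice";

// Hypothetical state reported by the dispatch path.
interface DeliveryState {
  delivered?: boolean;          // an outbound send succeeded
  deliveryAttempted?: boolean;  // an outbound send was actually attempted
  deliverySuppressed?: boolean; // delivery was intentionally withheld
  suppressedForMs?: number;     // how long delivery has been withheld
}

// Decide what the timer should do for a finished cron run.
function decideFallback(
  state: DeliveryState,
  maxSuppressionMs = 60_000,
): FallbackAction {
  // A real send (or attempt) means the fallback would duplicate the message.
  if (state.delivered === true || state.deliveryAttempted === true) return "none";
  if (state.deliverySuppressed === true) {
    // Stay quiet while suppression is recent; notify once the cap is exceeded.
    return (state.suppressedForMs ?? 0) >= maxSuppressionMs
      ? "enqueue-timeout-notice"
      : "none";
  }
  // Nothing was sent and nothing was suppressed: use the normal fallback.
  return "enqueue-summary";
}

console.log(decideFallback({ deliverySuppressed: true, suppressedForMs: 1_000 }));  // "none"
console.log(decideFallback({ deliverySuppressed: true, suppressedForMs: 90_000 })); // "enqueue-timeout-notice"
```

Keeping the suppressed state separate preserves the double-announce fix (no immediate fallback) while still bounding how long a user can go without any signal.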


Analyzed PR: #39089 at commit cfe819e

Last updated on: 2026-03-07T21:06:02Z

Contributor

greptile-apps bot commented Mar 7, 2026

Greptile Summary

This PR fixes a double-announce bug in cron job delivery where two early-return paths inside deliverViaAnnounce (active subagent suppression at lines 319-330 and stale interim message suppression at lines 332-350) returned without setting deliveryAttempted = true. The timer's shouldEnqueueCronMainSummary guard in heartbeat-policy.ts (line 45) checks params.deliveryAttempted !== true — when both delivered and deliveryAttempted were false/undefined, the guard would fire enqueueSystemEvent, sending a redundant second message.

Key changes:

  • In delivery-dispatch.ts, both early-return paths now set deliveryAttempted = true and pass it explicitly to withRunSession before returning — correctly signaling that delivery was intentionally suppressed, not missed.
  • A new test file (delivery-dispatch.double-announce.test.ts) adds 5 targeted tests covering both suppression paths, normal announce success, announce-failure fallback to direct delivery, and no-delivery-requested scenarios.

Implementation detail: The fix correctly distinguishes between the SILENT_REPLY_TOKEN early return (lines 351-358, which sets delivered: true in the returned result and does not require deliveryAttempted) and the two suppression paths (which now correctly set deliveryAttempted = true). Both approaches prevent the timer's fallback guard from firing — one by marking delivery as successful, the other by marking it as intentionally suppressed. All 457 existing cron tests continue to pass.

Confidence Score: 5/5

  • This PR is safe to merge — the fix is minimal, targeted, and well-tested with zero risk of regression.
  • The change is confined to two symmetric, low-risk additions of deliveryAttempted = true in clearly-identified early-return paths. The root cause analysis matches the code exactly: both early-return paths were missing the flag that the shouldEnqueueCronMainSummary guard checks to prevent the fallback. The heartbeat-policy.ts guard logic (line 45) confirms exactly why the fix works. The five new tests directly exercise both bug paths, normal announce success, announce-failure fallback, and the no-delivery-requested path, with assertions on both the state flags and the timer guard itself. The fix does not alter any other code paths or existing behavior — it only prevents the unintended timer fallback. All 457 existing cron tests continue to pass.
  • No files require special attention.

Last reviewed commit: bb765a3

@tyler6204 tyler6204 force-pushed the fix/cron-double-announce branch from bb765a3 to 5d9bbd1 March 7, 2026 18:48
@tyler6204 tyler6204 changed the title from "fix(cron): prevent double-announce when deliverViaAnnounce returns early" to "fix(cron): eliminate double-announce and replace delivery polling with push-based flow" Mar 7, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d9bbd1390

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +142 to +145
let pendingRunIds = new Set<string>(initialActiveRuns.map((e) => e.runId));

while (pendingRunIds.size > 0 && Date.now() < deadline) {
const remainingMs = Math.max(1, deadline - Date.now());

P1: Keep polling for late-starting descendants

When observedActiveDescendants is true but initialActiveRuns is empty (a race that happens when the parent emits a follow-up hint before the registry marks child runs active), pendingRunIds starts empty and this loop is skipped entirely. The new implementation then never re-checks listDescendantRunsForRequester for descendants that become active a moment later, so dispatchCronDelivery can proceed with stale/interim text instead of waiting for the real subagent result.
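A possible way to close this race is a short, bounded discovery window before the wait loop, sketched here under assumptions. `discoverActiveRuns`, `listActive`, and the window and interval defaults are hypothetical; `listActive` stands in for a registry lookup that returns the active descendant run IDs.

```typescript
// Resolve after `ms` milliseconds.
const delay = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Seed the pending set. If the caller saw active descendants but the first
// snapshot is empty, re-check the registry for a bounded window.
async function discoverActiveRuns(
  listActive: () => string[],
  observedActiveDescendants: boolean,
  discoveryMs = 1_000,
  intervalMs = 100,
): Promise<Set<string>> {
  let ids = new Set(listActive());
  // Nothing to discover if runs are already visible or none were observed.
  if (ids.size > 0 || !observedActiveDescendants) return ids;

  const deadline = Date.now() + discoveryMs;
  while (ids.size === 0 && Date.now() < deadline) {
    await delay(intervalMs);
    ids = new Set(listActive());
  }
  return ids; // may still be empty if no descendant ever became active
}
```

The returned set could then seed `pendingRunIds`, so the existing `while (pendingRunIds.size > 0 ...)` loop runs once late-starting descendants appear.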

Comment on lines +150 to +154
callGateway<{ status?: string }>({
method: "agent.wait",
params: { runId, timeoutMs: remainingMs },
timeoutMs: remainingMs + 2_000,
}).catch(() => undefined),

P2: Add backoff after agent.wait failures

This call path swallows agent.wait errors and immediately continues the outer loop; if the gateway is temporarily unavailable and callGateway fails fast, the same active run IDs are retried in a tight loop until deadline. That can create high CPU usage and rapid reconnect churn during outages, unlike the previous fixed polling delay behavior.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7a7241ff6

Comment on lines +142 to +144
let pendingRunIds = new Set<string>(initialActiveRuns.map((e) => e.runId));

while (pendingRunIds.size > 0 && Date.now() < deadline) {

P1: Keep discovering descendants after an empty initial snapshot

When observedActiveDescendants is true but initialActiveRuns is empty (a common registry timing race), this code initializes pendingRunIds to an empty set and skips the wait loop entirely, so it never refreshes descendant activity before moving to grace-period transcript polling. That can return undefined/interim text even though descendants become active moments later, which regresses the old behavior that re-checked activity each cycle. Fresh evidence in this commit is that pendingRunIds is seeded once from initialActiveRuns and the while (pendingRunIds.size > 0 ...) guard blocks any later getActiveRuns() refresh when the first snapshot is empty.

Comment on lines +150 to +154
callGateway<{ status?: string }>({
method: "agent.wait",
params: { runId, timeoutMs: remainingMs },
timeoutMs: remainingMs + 2_000,
}).catch(() => undefined),

P2: Back off after fast agent.wait gateway failures

The per-run agent.wait call now catches and suppresses gateway errors, but the outer loop has no delay path when those failures are immediate; if the gateway is temporarily down and callGateway fails fast, the same active run IDs are retried in a tight loop until deadline. This can create avoidable CPU/reconnect churn during outages. Fresh evidence in this commit is the new callGateway(...).catch(() => undefined) branch inside the retry loop without any sleep/backoff before the next iteration.

…h push-based flow

- Set deliveryAttempted=true in announce early-return paths (active-subagent
  suppression and stale-interim suppression) so the heartbeat timer no longer
  fires a redundant enqueueSystemEvent fallback (double-announce bug).

- Refactor waitForDescendantSubagentSummary to use event-based agent.wait RPC
  calls instead of a 500ms busy-poll loop.  Each active descendant run is now
  awaited concurrently via Promise.allSettled, and only a short bounded grace
  period (5s) remains to capture the cron agent's post-orchestration synthesis.
  Eliminates O(n*timeoutMs/500ms) gateway calls and wasted wall-clock time.

- Add FAST_TEST_MODE (OPENCLAW_TEST_FAST=1) to subagent-followup.ts to keep
  the grace-period tests instant in CI.

- Add comprehensive tests for the new waitForDescendantSubagentSummary behaviour
  (push-based wait, error resilience, NO_REPLY handling, multi-descendant waits).
@tyler6204 tyler6204 force-pushed the fix/cron-double-announce branch from c7a7241 to cfe819e March 7, 2026 20:13
@tyler6204 tyler6204 merged commit e554c59 into main Mar 7, 2026
10 of 11 checks passed
@tyler6204 tyler6204 deleted the fix/cron-double-announce branch March 7, 2026 20:13
Member Author

Merged via squash.

Merge commit: e554c59


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cfe819eaaa

Comment on lines +142 to +144
let pendingRunIds = new Set<string>(initialActiveRuns.map((e) => e.runId));

while (pendingRunIds.size > 0 && Date.now() < deadline) {

P1: Keep discovering descendants after empty initial snapshot

When observedActiveDescendants is true but initialActiveRuns is empty (a timing race where descendants are observed by the caller before the registry snapshot includes them), pendingRunIds is initialized empty and this loop never runs. That means getActiveRuns() is never refreshed before moving to grace-period transcript polling, so late-starting descendants can be missed and cron delivery can proceed with stale/interim text instead of waiting for the actual descendant completion.

method: "agent.wait",
params: { runId, timeoutMs: remainingMs },
timeoutMs: remainingMs + 2_000,
}).catch(() => undefined),

P2: Add retry backoff after failed agent.wait calls

Each agent.wait call suppresses gateway errors via .catch(() => undefined), but the outer loop immediately retries the same active run IDs with no sleep. If the gateway is temporarily unavailable and failures are fast, this produces a tight retry loop until deadline, driving unnecessary CPU usage and connection churn in the cron worker.

vincentkoc pushed a commit to BryanTegomoh/openclaw-fork that referenced this pull request Mar 8, 2026
…h push-based flow (openclaw#39089)

* fix(cron): eliminate double-announce and replace delivery polling with push-based flow

- Set deliveryAttempted=true in announce early-return paths (active-subagent
  suppression and stale-interim suppression) so the heartbeat timer no longer
  fires a redundant enqueueSystemEvent fallback (double-announce bug).

- Refactor waitForDescendantSubagentSummary to use event-based agent.wait RPC
  calls instead of a 500ms busy-poll loop.  Each active descendant run is now
  awaited concurrently via Promise.allSettled, and only a short bounded grace
  period (5s) remains to capture the cron agent's post-orchestration synthesis.
  Eliminates O(n*timeoutMs/500ms) gateway calls and wasted wall-clock time.

- Add FAST_TEST_MODE (OPENCLAW_TEST_FAST=1) to subagent-followup.ts to keep
  the grace-period tests instant in CI.

- Add comprehensive tests for the new waitForDescendantSubagentSummary behaviour
  (push-based wait, error resilience, NO_REPLY handling, multi-descendant waits).

* fix: prep cron double-announce followup tests (openclaw#39089) (thanks @tyler6204)
openperf pushed a commit to openperf/moltbot that referenced this pull request Mar 8, 2026
mcaxtr pushed a commit to mcaxtr/openclaw that referenced this pull request Mar 8, 2026
Saitop pushed a commit to NomiciAI/openclaw that referenced this pull request Mar 8, 2026
GordonSH-oss pushed a commit to GordonSH-oss/openclaw that referenced this pull request Mar 9, 2026
jenawant pushed a commit to jenawant/openclaw that referenced this pull request Mar 10, 2026
dhoman pushed a commit to dhoman/chrono-claw that referenced this pull request Mar 11, 2026
senw-developers pushed a commit to senw-developers/va-openclaw that referenced this pull request Mar 17, 2026
V-Gutierrez pushed a commit to V-Gutierrez/openclaw-vendor that referenced this pull request Mar 17, 2026
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 21, 2026

(cherry picked from commit e554c59)
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 21, 2026

(cherry picked from commit e554c59)
Labels

maintainer Maintainer-authored PR size: L
