fix(cron): return accepted run ticket and reconcile timeout (AI-assisted)#29546
fix(cron): return accepted run ticket and reconcile timeout (AI-assisted)#29546HOYALIM wants to merge 3 commits intoopenclaw:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8b3026708b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/gateway/server-methods/cron.ts
Outdated
| if ( | ||
| mode === "due" && | ||
| typeof job.state.nextRunAtMs === "number" && | ||
| Number.isFinite(job.state.nextRunAtMs) && | ||
| job.state.nextRunAtMs > now |
There was a problem hiding this comment.
Gate due-mode acceptance on full due predicate
The new cron.run fast-ack path only rejects due-mode requests when nextRunAtMs > now, but CronService.run() considers a job due only when it is enabled and has a numeric nextRunAtMs at or before now. As a result, disabled jobs (or jobs with missing/invalid nextRunAtMs) can be returned as status:"accepted" here and then silently resolve to ran:false in the async execution path, leaving a runId that never appears in cron.runs and causing incorrect reconciliation behavior.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR successfully addresses the Key changes:
Architecture decisions:
Testing coverage: Confidence Score: 4/5
Last reviewed commit: 8b30267 |
8b30267 to
5eb02cf
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5eb02cf40e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 739c6659c6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…cation, hook bridge Comprehensive cron subsystem reliability overhaul addressing 9 open issues: **Configurable retry policy (openclaw#29527, openclaw#29546)** - Extract backoff logic into `retry-policy.ts` with configurable `cron.retryBackoff` schedule and `cron.stuckRunTimeoutMs` config fields - Classify execution errors as transient (retry with backoff) or terminal (disable job immediately) to prevent futile retry storms - Add Zod validation for new config fields **Fix delivery status reporting (openclaw#29660)** - `resolveDeliveryStatus` now accepts `deliveryAttempted` and returns "not-delivered" when delivery was attempted but not confirmed - Add explicit `delivered: false` to all error paths in delivery-dispatch.ts - Pass `deliveryAttempted` through the full result chain **Configurable response prefix (openclaw#29687)** - Add `responsePrefix` to CronServiceDeps (defaults to "Cron") - Use dep-injected prefix for main session summary messages **Strip directive tags from cron output (openclaw#29646)** - Strip `[[reply_to_current]]`, `[[reply_to:<id>]]`, `[[audio_as_voice]]` from isolated cron run output before delivery **Startup schedule preservation (openclaw#29690)** - Use maintenance-only recompute in `start()` to avoid advancing past-due `nextRunAtMs` values for jobs that `runMissedJobs` intentionally skipped - Clear `nextRunAtMs` for interrupted jobs so they get fresh schedule computation **Internal hook bridge (openclaw#29682)** - Add "cron" to `InternalHookEventType` with `CronExecutionHookEvent` type - Bridge `emitJobFinished` to `triggerInternalHook` so extensions can react to cron job completions (alerting, chaining, etc.) - Add `isCronExecutionEvent` type guard **Per-job skip-if-active flag (openclaw#29659)** - Add `skipIfRunActive?: boolean` to CronJob type for semantic overlap prevention Tests: 290 cron tests + 21 new retry-policy tests pass. Fixes: openclaw#29527, openclaw#29546, openclaw#29601, openclaw#29646, openclaw#29659, openclaw#29660, openclaw#29682, openclaw#29687, openclaw#29690
|
Thanks for the detailed proposal and tests. I’m closing this for now because it introduces a broad contract shift for Appreciate the thoughtful effort here. |
|
Thanks for the thoughtful review and context, really appreciate you taking the time to dig into this. I’ll keep this branch around as a reference and will look for opportunities to upstream smaller, compatibility-focused pieces in future PRs. |
Summary
cron runcan time out at transport level while the job still executes later, causing clients/users to misclassify runs as failed and retrigger duplicates.cron.runnow returns an immediate accepted payload (accepted,requestId/runId,queuedAt, reconcile hint) and executes asynchronously.RUN_NOT_ACCEPTED) for missing/running/not-due cases.GatewayTransportTimeoutError) and CLI reconciliation (singlecron.runscheck on timeout-like responses).runIdpropagation/filtering across cron events, run logs, schema, validators, andcron.runs.pnpm build,pnpm check, and scoped tests), with known unrelated baseline failures in full localpnpm test.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
cron.runnow returns an immediate accepted ticket-style payload includingrequestId,runId,queuedAt, and reconciliation hint.cron.runssupportsrunIdfiltering for reconciliation.cron runperforms one reconciliation call on timeout-like transport errors instead of immediately presenting a hard failure.Security Impact (required)
No)No)No)No)No)Yes, explain risk + mitigation:Repro + Verification
Environment
Steps
cron.runin force mode.cron.runswith returnedrunId.Expected
requestId/runId/queuedAt.cron.runs.Actual
Evidence
Attach at least one:
Human Verification (required)
What you personally verified (not just CI), and how:
cron.runRUN_NOT_ACCEPTEDpayload shaperunIdfiltering incron.runspnpm testgreen run (current local suite has unrelated baseline failures outside this PR scope)Compatibility / Migration
Yes)No)No)Failure Recovery (if this breaks)
8b3026708(cron reliability changes)fb5fae88f(type-check cast fix for test)src/gateway/server-methods/cron.tssrc/cli/cron-cli/register.cron-simple.tssrc/gateway/call.tssrc/cron/*touched filescron.runreturns synchronous execution result instead of accepted payloadrunIdin run-log filtering or reconciliation pathsRisks and Mitigations
deduped: true.cron.runimmediate execution semantics may need adjustment.requestId,runId,queuedAt).