Closed
Description
Summary
every-type cron jobs (e.g. everyMs: 600000) never execute because onTimer calls recomputeNextRuns() before runDueJobs(), pushing nextRunAtMs past now on every tick.
Environment
- OpenClaw version: 2026.2.3-1
- OS: macOS (Darwin 24.5.0, arm64)
- Node.js: v22.22.0
- Gateway: Running as daemon
Root Cause
In src/cron/service/timer.ts, the onTimer flow is:
async function onTimer(state) {
  if (state.running) return;
  state.running = true;
  try {
    await locked(state, async () => {
      await ensureLoaded(state, { forceReload: true }); // ← reloads + recomputes
      await runDueJobs(state); // ← checks due AFTER recompute
      await persist(state);
      armTimer(state);
    });
  } finally {
    state.running = false;
  }
}
ensureLoaded(forceReload: true) calls recomputeNextRuns(), which calls computeNextRunAtMs() for each job.
For every-type jobs, computeNextRunAtMs uses a ceiling division:
const anchor = Math.max(0, Math.floor(schedule.anchorMs ?? nowMs));
const elapsed = nowMs - anchor;
return anchor + Math.max(1, Math.floor((elapsed + everyMs - 1) / everyMs)) * everyMs;
The race:
- Timer fires at T + ε (setTimeout always fires a few ms late)
- recomputeNextRuns runs with now = T + ε
- Without anchorMs: anchor = now, so result is now + everyMs → always in the future
- With anchorMs: since now = T + ε > T (past the slot boundary), ceiling division returns T + everyMs → also in the future
- runDueJobs checks now >= nextRunAtMs → T + ε >= T + everyMs → false → not due
- Timer re-armed for T + everyMs. Same thing repeats forever.
The job is perpetually deferred because recomputeNextRuns always returns a strictly future time.
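The arithmetic can be checked in isolation. The following is a minimal sketch that copies the every-type formula quoted above into a standalone function. The name nextEveryRunAtMs, its signature, the slot boundary T and the 5 ms lateness are illustrative assumptions, not the actual OpenClaw source or measured values.

```typescript
// Standalone copy of the every-type formula from this report.
// Function name and signature are hypothetical, for illustration only.
function nextEveryRunAtMs(nowMs: number, everyMs: number, anchorMs?: number): number {
  const anchor = Math.max(0, Math.floor(anchorMs ?? nowMs));
  const elapsed = nowMs - anchor;
  const slots = Math.max(1, Math.floor((elapsed + everyMs - 1) / everyMs));
  return anchor + slots * everyMs;
}

const everyMs = 600_000;       // 10 minutes
const T = 1_770_312_000_000;   // a slot boundary the timer was armed for (assumed)
const lateBy = 5;              // assumed timer lateness (ε) in ms
const now = T + lateBy;

// Without anchorMs the anchor is `now`, so the result is now + everyMs.
const unanchored = nextEveryRunAtMs(now, everyMs);

// With an anchor on the slot grid, the ceiling rounds up to the slot after T.
const anchored = nextEveryRunAtMs(now, everyMs, T - 3 * everyMs);

console.log(unanchored - now);  // 600000
console.log(anchored - T);      // 600000
console.log(now >= unanchored, now >= anchored); // false false
```

In both cases the recomputed value lands one full interval ahead, so a due check performed after the recompute cannot succeed.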
Steps to Reproduce
- Create an every-type cron job:
{
  "name": "Test every 5min",
  "schedule": { "kind": "every", "everyMs": 300000 },
  "sessionTarget": "isolated",
  "enabled": true,
  "payload": { "kind": "agentTurn", "message": "Say hello" }
}
- Wait for the timer to fire
- Check openclaw cron runs --id <jobId>: zero runs
- openclaw cron run <jobId> --force works fine (bypasses the timer path)
Observed Behavior
- All 5 every-type jobs stopped running after gateway restart
- Zero runs recorded for 6+ hours despite correct nextRunAtMs values
- nextRunAtMs silently advances by everyMs each tick without executing
- cron.run --force works, confirming the scheduler and job execution are functional
- cron-expression jobs may also be affected (none fired either after restart)
Evidence
- Email Checker (every: 15min): ran fine 08:06–10:41 UTC, then zero runs
- Docker Build Watcher (every: 10min): ran fine until 10:47 UTC, then zero runs
- Gateway restarted at 12:57 UTC; no jobs ran after restart
- Watched Docker Build Watcher: nextRunAtMs moved from 1770312000000 to 1770312600000 (+10min) without executing
- Setting anchorMs did not help (ceiling division still overshoots by ε ms)
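The observed jump from 1770312000000 to 1770312600000 can be replayed with a toy model of the tick order described under Root Cause. This is a sketch under stated assumptions: the Job shape, computeNext, tickRecomputeFirst, the anchorMs value of 0 and the 5 ms timer lateness are all hypothetical and chosen only to reproduce the reported numbers.

```typescript
// Toy model of one every-type job; field names are illustrative only.
interface Job {
  everyMs: number;
  anchorMs: number;
  nextRunAtMs: number;
  runs: number;
}

// Same ceiling formula as quoted in Root Cause.
function computeNext(job: Job, nowMs: number): number {
  const anchor = Math.max(0, Math.floor(job.anchorMs));
  const elapsed = nowMs - anchor;
  return anchor + Math.max(1, Math.floor((elapsed + job.everyMs - 1) / job.everyMs)) * job.everyMs;
}

// One timer tick in the reported order: recompute first, then check due.
function tickRecomputeFirst(job: Job, nowMs: number): void {
  job.nextRunAtMs = computeNext(job, nowMs);
  if (nowMs >= job.nextRunAtMs) job.runs += 1;
}

const job: Job = {
  everyMs: 600_000,
  anchorMs: 0, // assumed anchor on the 10-minute grid
  nextRunAtMs: 1_770_312_000_000,
  runs: 0,
};

const history: number[] = [job.nextRunAtMs];
for (let i = 0; i < 3; i++) {
  const firedAt = job.nextRunAtMs + 5; // timer fires 5 ms after the slot (assumed)
  tickRecomputeFirst(job, firedAt);
  history.push(job.nextRunAtMs);
}

console.log(history.join(" -> "));
// 1770312000000 -> 1770312600000 -> 1770313200000 -> 1770313800000
console.log(job.runs); // 0
```

Each tick moves nextRunAtMs forward by exactly everyMs while the run count stays at zero, which matches the behavior recorded for Docker Build Watcher.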
Suggested Fix
Option A: Snapshot nextRunAtMs before reload
async function onTimer(state) {
  // ...
  const now = Date.now(); // assumption: the service may use its own clock source instead
  // Fall back to an empty list so the loop below is safe when the store is not loaded yet
  const dueSnapshot = (state.store?.jobs ?? [])
    .filter(j => j.enabled && typeof j.state.nextRunAtMs === "number")
    .map(j => ({ id: j.id, dueAt: j.state.nextRunAtMs }));
  await ensureLoaded(state, { forceReload: true });
  // Restore pre-reload nextRunAtMs for due check
  for (const snap of dueSnapshot) {
    const job = state.store.jobs.find(j => j.id === snap.id);
    if (job && snap.dueAt <= now) job.state.nextRunAtMs = snap.dueAt;
  }
  await runDueJobs(state);
  // ...
}
Option B: Skip recomputeNextRuns in timer path
await ensureLoaded(state, { forceReload: true, skipRecompute: true });
Option C: Change ceiling to floor in computeNextRunAtMs for the current-slot case:
// Return current slot if now is within it, next slot otherwise
const slot = Math.floor(elapsed / everyMs);
const slotStart = anchor + slot * everyMs;
return slotStart >= nowMs ? slotStart : slotStart + everyMs;
Related
- #9572, "sessions_spawn timeout overflow causes gateway hang (Node.js 32-bit setTimeout limit)": sessions_spawn setTimeout overflow (different bug, same gateway)
- #9576, "fix: clamp setTimeout values to 32-bit safe max to prevent gateway hang": fix for #9572 (does not address this issue)
- #8558, "OpenClaw Bug Report: One-Time Scheduled Jobs Not Firing": one-time at jobs not firing (possibly related scheduler issue)