fix: clamp cron timer delay to 60s as drift guard by zhiyuanw101 · Pull Request #10580 · openclaw/openclaw

zhiyuanw101 · 2026-02-06T17:51:56Z

Problem

The cron scheduler relies solely on setTimeout to trigger job execution. If the timer is silently lost — due to macOS sleep/wake, GC pause, process throttling, or any other runtime disruption — no recovery mechanism exists. The missed job slot is never executed until an external API call (e.g. cron list) happens to call ensureLoaded → recomputeNextRuns → armTimer.

Observed behavior: A cron job scheduled for 7:00 AM PST (0 7-19 * * 1-5) did not fire until manually triggered 2+ hours later. Gateway logs showed the timer was correctly armed after job creation but never fired, likely due to macOS power management suspending the Node.js event loop during sleep.

Solution

Clamp the maximum setTimeout delay in armTimer() to 60 seconds via DRIFT_GUARD_MAX_DELAY_MS. This means:

When a job is due in <60s: timer fires at the exact due time (no change)
When a job is due in >60s: timer fires every 60s, onTimer() finds no overdue jobs, does nothing, and re-arms
When a timer was lost (sleep/wake): the next 60s wake catches the overdue job within at most 1 minute

This is simpler and more robust than a separate setInterval drift guard:

No second timer to manage or leak
No interaction issues with fake timers in tests
Same setTimeout + re-arm pattern already used by the scheduler
Minimal overhead (~1 timer wake/minute when idle)

Changes

src/cron/service/timer.ts: Add DRIFT_GUARD_MAX_DELAY_MS = 60_000 constant, clamp delay in armTimer()
src/cron/service.drift-guard-catches-overdue-jobs.test.ts: Two new tests validating the drift guard behavior

Test Results

All 60 tests pass (58 existing + 2 new):

 Test Files  14 passed (14)
      Tests  60 passed (60)

Greptile Overview

Greptile Summary

Adds a drift-guard to the cron scheduler by clamping setTimeout delays in armTimer() to a maximum of 60s, so overdue jobs can be detected even if the original timer was lost (sleep/GC/throttling).
Introduces DRIFT_GUARD_MAX_DELAY_MS and updates timer arming to use the minimum of computed delay, Node’s max timeout, and the drift-guard cap.
Adds two Vitest tests that simulate timer loss and validate that the scheduler re-arms and eventually runs overdue jobs, while not executing far-future jobs early.

Confidence Score: 4/5

This PR is generally safe to merge; the core timer change is small and the added tests cover the intended drift-guard behavior.
The change is localized to clamping the timer delay and is backed by new tests that exercise overdue-job recovery and far-future behavior. The main outstanding concern is test resource cleanup patterns that can leak temp dirs/timers on assertion failure, which can cause flakiness in CI or local runs.
src/cron/service.drift-guard-catches-overdue-jobs.test.ts

_{(2/5) Greptile learns from your feedback when you react with thumbs up/down!}

greptile-apps

_{1 file reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-02-06T18:05:52Z

src/cron/service.drift-guard-catches-overdue-jobs.test.ts

+  afterEach(() => {
+    vi.useRealTimers();
+  });


Cleanup skipped on failure

Both tests call cron.stop()/store.cleanup() at the end of the test body, so if an assertion throws earlier the temp dir is leaked and the CronService may keep timers open until afterEach runs. This can make the suite flaky over time (especially when running individual tests repeatedly). Consider moving cron.stop() + store.cleanup() into afterEach or a try/finally around the test body so cleanup always happens even on failure.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/cron/service.drift-guard-catches-overdue-jobs.test.ts Line: 34:36 Comment: **Cleanup skipped on failure** Both tests call `cron.stop()`/`store.cleanup()` at the end of the test body, so if an assertion throws earlier the temp dir is leaked and the CronService may keep timers open until `afterEach` runs. This can make the suite flaky over time (especially when running individual tests repeatedly). Consider moving `cron.stop()` + `store.cleanup()` into `afterEach` or a `try/finally` around the test body so cleanup always happens even on failure. How can I resolve this? If you propose a fix, please make it concise.

The cron scheduler relies solely on setTimeout to trigger job execution. If the timer is silently lost (macOS sleep/wake, GC pause, process throttling), no recovery mechanism exists until an external API call happens to re-arm the timer. Clamp the maximum setTimeout delay in armTimer() to 60 seconds. When the timer fires before a job is due, onTimer() finds no overdue jobs, does nothing, and re-arms — effectively polling every 60s. When a job IS overdue (because the previous timer was lost), onTimer() catches it within at most 60 seconds of the missed slot. This is simpler and more robust than a separate setInterval drift guard: - No second timer to manage or leak - No interaction issues with fake timers in tests - Same setTimeout + re-arm pattern already used by the scheduler - Minimal overhead (one timer wake per minute when idle)

tyler6204 · 2026-02-08T03:47:13Z

Superseded by #11641 (merge commit: 8fae55e). Closing to reduce duplicate PR noise. Please open a new PR only if there is additional scope beyond this fix.

greptile-apps bot reviewed Feb 6, 2026

View reviewed changes

zhiyuanw101 force-pushed the fix/cron-drift-guard branch from 801fc4d to a3061c2 Compare February 6, 2026 18:15

tyler6204 mentioned this pull request Feb 8, 2026

fix(cron): share isolated announce flow + harden cron scheduling/delivery #11641

Merged

tyler6204 closed this Feb 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: clamp cron timer delay to 60s as drift guard#10580

fix: clamp cron timer delay to 60s as drift guard#10580
zhiyuanw101 wants to merge 1 commit intoopenclaw:mainfrom
zhiyuanw101:fix/cron-drift-guard

zhiyuanw101 commented Feb 6, 2026 •

edited by greptile-apps bot

Loading

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Feb 6, 2026

Uh oh!

tyler6204 commented Feb 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Uh oh!

Conversation

zhiyuanw101 commented Feb 6, 2026 • edited by greptile-apps bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Changes

Test Results

Greptile Overview

Greptile Summary

Confidence Score: 4/5

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

tyler6204 commented Feb 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

zhiyuanw101 commented Feb 6, 2026 •

edited by greptile-apps bot

Loading