fix(cron): prevent spin loop when job completes within scheduled second #18073
Merged
steipete merged 1 commit into openclaw:main on Feb 16, 2026
…nd (openclaw#17821)

When a cron job fires and completes within the same wall-clock second it was scheduled for, the next-run computation could return undefined or the same second, causing the scheduler to re-trigger the job hundreds of times in a tight loop.

Two-layer fix:

1. computeJobNextRunAtMs: When computeNextRunAtMs returns undefined for a cron-kind schedule (edge case where floored nowSecondMs matches the schedule), retry with the ceiling (next second) as reference time. This ensures we always get the next valid occurrence.
2. applyJobResult: Add MIN_REFIRE_GAP_MS (2s) safety net for cron-kind jobs. After a successful run, nextRunAtMs is guaranteed to be at least 2s in the future. This breaks any remaining spin-loop edge cases without affecting normal daily/hourly schedules (where the natural next run is hours/days away).

Fixes openclaw#17821
Problem
When a cron job (e.g. `0 13 * * *`) fires and completes within the same wall-clock second it was scheduled for, `computeNextRunAtMs` can return `undefined` for that second. This causes the scheduler to immediately recompute, find the job still "due", and re-trigger it, creating a spin loop of 100+ phantom executions per second.

Reported in #17821 with clear evidence: jobs firing with `durationMs: 0` and `nextRunAtMs` stuck at the same timestamp.

Root Cause
`computeNextRunAtMs` floors `nowMs` to the current second boundary before asking croner for the next run. When the job completes within the same second it was scheduled for (common for fast isolated jobs), the floored time can still match the schedule, and depending on croner version/timezone, `nextRun()` may return the same second, which fails the `nextMs > nowSecondMs` check and returns `undefined`.

An `undefined` `nextRunAtMs` triggers `recomputeNextRuns` to retry with the current `nowMs`, which is still in the same second → same result → tight loop.

Fix (two layers)
1. `computeJobNextRunAtMs` fallback (`jobs.ts`)
When `computeNextRunAtMs` returns `undefined` for a cron-kind schedule, retry with the ceiling (next second) as reference time. This guarantees we always land on the next valid occurrence rather than returning `undefined`.

2. `MIN_REFIRE_GAP_MS` safety net (`timer.ts`)
After a successful cron job run, ensure `nextRunAtMs` is at least 2 seconds in the future. This is a belt-and-suspenders guard that breaks any remaining spin-loop edge case. The 2s gap never affects normal schedules (where the natural next run is hours/days away) and only applies to cron-kind schedules (not `every` or `at`).

Tests
`0 13 * * *` job completing in 7ms (within the scheduled second): verifies it fires exactly once and `nextRunAtMs` advances to the next day.

Fixes #17821
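For readers who want to see the two layers side by side, here is a minimal, self-contained sketch. It is not the code from `jobs.ts` or `timer.ts`. The function names `computeNextRunAtMs`, `computeJobNextRunAtMs`, and the constant `MIN_REFIRE_GAP_MS` are taken from the description above, but their bodies and signatures here are assumptions. `nextDailyAt13Utc` is a hypothetical stand-in for the croner call that deliberately returns the same second when the reference time sits exactly on the schedule, and `applyMinRefireGap` is a hypothetical helper standing in for the guard inside `applyJobResult`.

```typescript
// Illustrative sketch only: names mirror the PR description, bodies are assumptions.
const DAY_MS = 24 * 60 * 60 * 1000;
const MIN_REFIRE_GAP_MS = 2000;

// Hypothetical stand-in for the cron library: next "0 13 * * *" (UTC) occurrence
// at or after `fromMs`. The inclusive match models the problematic edge case.
function nextDailyAt13Utc(fromMs: number): number {
  const d = new Date(fromMs);
  const todayAt13 = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate(), 13, 0, 0);
  return todayAt13 >= fromMs ? todayAt13 : todayAt13 + DAY_MS;
}

// Floors to the second, then rejects a result that is not strictly in the future.
function computeNextRunAtMs(nowMs: number): number | undefined {
  const nowSecondMs = Math.floor(nowMs / 1000) * 1000;
  const nextMs = nextDailyAt13Utc(nowSecondMs);
  return nextMs > nowSecondMs ? nextMs : undefined;
}

// Layer 1: on undefined, retry with the next second as the reference time.
function computeJobNextRunAtMs(nowMs: number): number | undefined {
  const next = computeNextRunAtMs(nowMs);
  if (next !== undefined) return next;
  const nextSecondMs = Math.floor(nowMs / 1000) * 1000 + 1000;
  return computeNextRunAtMs(nextSecondMs);
}

// Layer 2: never schedule the next run closer than the minimum gap.
// Measuring the gap from the run's end time is an assumption of this sketch.
function applyMinRefireGap(nextRunAtMs: number | undefined, endedAtMs: number): number | undefined {
  if (nextRunAtMs === undefined) return undefined;
  return Math.max(nextRunAtMs, endedAtMs + MIN_REFIRE_GAP_MS);
}

// A job scheduled for 13:00:00 UTC that finishes 7ms later.
const scheduledMs = Date.UTC(2026, 1, 16, 13, 0, 0);
const endedAtMs = scheduledMs + 7;

const naive = computeNextRunAtMs(endedAtMs);         // undefined in this model
const fixed = computeJobNextRunAtMs(endedAtMs);      // next day's 13:00:00 UTC
const guarded = applyMinRefireGap(fixed, endedAtMs); // unchanged, already a day away
```

In this model the first layer alone resolves the daily schedule, and the gap only changes the result when the computed next run would otherwise land within two seconds of the run's end, which matches the claim that normal hourly or daily schedules are unaffected.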
Greptile Summary
Fixes a spin-loop bug (#17821) where a cron job completing within the same wall-clock second it was scheduled for would cause `computeNextRunAtMs` to return `undefined`, leading to immediate recomputation and 100+ phantom re-executions per second.

- `computeJobNextRunAtMs` fallback (`jobs.ts`): When `computeNextRunAtMs` returns `undefined` for a cron-kind schedule, retries with the next-second ceiling as reference time, ensuring the next valid occurrence is always found.
- `MIN_REFIRE_GAP_MS` safety net (`timer.ts`): After a successful cron job run, ensures `nextRunAtMs` is at least 2 seconds in the future. This belt-and-suspenders guard only applies to cron-kind schedules and never affects normal schedules (where the natural next run is hours/days away).
- Test coverage: a `0 13 * * *` job completing in 7ms within the scheduled second, verifying it fires exactly once and `nextRunAtMs` correctly advances to the next day.

Both fixes are properly scoped to cron-kind schedules only, leaving `every` and `at` schedule types unaffected. The defense-in-depth approach is sound: the `computeJobNextRunAtMs` fallback addresses the root cause, while the `MIN_REFIRE_GAP_MS` guard protects against any remaining edge cases from croner/timezone interactions.

Confidence Score: 5/5
Last reviewed commit: 93f0767