fix: preserve edited schedules across suspension rewrites#2043
fix: preserve edited schedules across suspension rewrites#2043
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR resolves a scheduler issue where edited DAG schedules during suspension fail to dispatch. It introduces schedule fingerprint tracking to detect schedule changes, resets skip-successful logic when schedules change, adds DAG mutation notifications to the frontend API and SSE system, and updates watermark versioning/migration to persist new state fields. Changes
Sequence Diagram(s)Not applicable. While the PR introduces significant functional changes (schedule fingerprinting, mutation notifications), the control flows are primarily internal to the scheduler or localized to the frontend API. The actual connections between the frontend notification system and scheduler reconciliation are not shown as a cohesive multi-component sequence in the diff. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/intg/sched_test.go`:
- Around line 192-249: The test is racy because the fake clock is set to
10:04:59 so the edited schedule/unsuspend can miss the 10:05:00 boundary; change
the clock base used in sc.SetClock from time.Date(2026,4,27,10,4,59,...) to an
earlier time (e.g., 10:04:00) so the scheduler has a full minute before the
target slot, and add an explicit scheduler-registration wait before removing
suspendFlag: replace or augment the initial require.Eventually(sc.IsRunning())
check with a require.Eventually that asserts sc.IsRunning() && the scheduler has
registered the DAG (inspect th.EntryReader.DAGs() for loaded.Name == dagName) so
unsuspend only happens after registration; keep using existing helpers like
pollSchedulerErr, writeSpec, suspendFlag and dispatchCount to locate the
relevant logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: acdeff8c-2756-4edc-84f3-272f696ba6c8
📒 Files selected for processing (12)
internal/intg/issue_2042_test.gointernal/intg/sched_test.gointernal/persis/filewatermark/store.gointernal/persis/filewatermark/store_test.gointernal/service/frontend/api/v1/api.gointernal/service/frontend/api/v1/dags.gointernal/service/frontend/api/v1/dags_internal_test.gointernal/service/frontend/server.gointernal/service/scheduler/schedule_state.gointernal/service/scheduler/tick_planner.gointernal/service/scheduler/tick_planner_test.gointernal/service/scheduler/watermark.go
Summary
Testing
Closes #2042
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
skip_if_successfullogic to correctly identify and skip runs from edited or removed schedules.Tests