fix: cleanup update-marked output files of failed jobs if there was no backup to restore them#3843
Conversation
…o backup to restore them
📝 WalkthroughWalkthroughModifies Job.cleanup to accept skip_outputs and update cleanup selection; Job.postprocess tracks outputs restored from backup and passes them to cleanup to avoid removal. Removes GroupJob.cleanup. persistence.restore_output now returns a boolean indicating whether a backup was restored. Adds a failing-update test and test data. Changes
Sequence DiagramsequenceDiagram
participant Post as Job.postprocess
participant Persist as persistence.restore_output
participant Cleanup as Job.cleanup
participant FS as Filesystem
Post->>Persist: attempt to restore backup for updated output
alt restore returns True
Persist->>FS: restore file/dir from backup
Persist-->>Post: True
Post->>Post: add output to skip_cleanup_outputs
else restore returns False
Persist-->>Post: False
end
Post->>Cleanup: cleanup(skip_outputs=skip_cleanup_outputs)
Cleanup->>FS: remove outputs NOT in skip_outputs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.py⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (3)📚 Learning: 2025-05-23T09:40:24.474ZApplied to files:
📚 Learning: 2024-10-14T09:42:11.571ZApplied to files:
📚 Learning: 2025-05-06T01:37:23.382ZApplied to files:
🧬 Code graph analysis (1)src/snakemake/jobs.py (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/snakemake/jobs.py(4 hunks)src/snakemake/persistence.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/persistence.pysrc/snakemake/jobs.py
🧠 Learnings (1)
📚 Learning: 2025-05-23T09:40:24.474Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3600
File: src/snakemake/jobs.py:960-964
Timestamp: 2025-05-23T09:40:24.474Z
Learning: In the `cleanup` method of the `Job` class in `src/snakemake/jobs.py`, files in the `to_remove` list should be formatted with `fmt_iofile` without specifying `as_output=True` or `as_input=True` parameters, as these files should be displayed as generic files rather than specifically as output files.
Applied to files:
src/snakemake/jobs.py
🧬 Code graph analysis (1)
src/snakemake/jobs.py (1)
src/snakemake/persistence.py (2)
cleanup(418-420)restore_output(311-326)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (8, macos-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: apidocs
🔇 Additional comments (2)
src/snakemake/persistence.py (1)
311-327: Verification complete: allrestore_outputcall sites correctly use the new boolean APIFound one call site in
src/snakemake/jobs.py:1245treating the boolean return value correctly—no staleNonecomparisons or legacy API usage. The implementation matches the intended semantics:Falsewhen no backup exists,Trueafter successful restore, and exceptions propagate for real copy errors.The optional docstring refinement (explicitly stating "False means no backup found; copy errors still raise") remains a nice-to-have but is not required.
src/snakemake/jobs.py (1)
954-978: AllJob.cleanupcall sites are correctly updated; type hint improvement is optionalThe verification confirms that the only
Job.cleanupcall site (line 1321 insrc/snakemake/jobs.py) is already properly usingawaitand passing theskip_cleanup_outputsset argument. No further updates to call sites are needed.The cleanup logic, including use of
fmt_iofilewithout extra flags, is correct and matches the prior guideline.The type hint mismatch (
Sequence[_IOFile]vs.setat runtime) remains non-breaking but could be cleaned up by relaxing the annotation toCollection[_IOFile]if desired. This is optional and does not block the changes.
|
Please format your code with |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
tests/test_update_flag_fail_cleanup/Snakefile(1 hunks)tests/test_update_flag_fail_cleanup/in.txt(1 hunks)tests/tests.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/test_update_flag_fail_cleanup/in.txt
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/tests.py
🧠 Learnings (6)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefiletests/tests.py
📚 Learning: 2024-12-21T15:10:31.992Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 2925
File: tests/test_nonstr_params/Snakefile:14-15
Timestamp: 2024-12-21T15:10:31.992Z
Learning: The file "test.jsonl" in tests/test_nonstr_params is automatically created by Snakemake, rather than manually generated in the Snakefile.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefile
📚 Learning: 2025-01-17T12:00:09.368Z
Learnt from: leoschwarz
Repo: snakemake/snakemake PR: 3176
File: tests/test_output_index.py:99-157
Timestamp: 2025-01-17T12:00:09.368Z
Learning: New test dependencies for Snakemake should be introduced in separate PRs rather than being added as part of feature or refactoring PRs.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefile
📚 Learning: 2025-09-08T12:39:09.146Z
Learnt from: Hocnonsense
Repo: snakemake/snakemake PR: 3714
File: tests/tests.py:1637-1644
Timestamp: 2025-09-08T12:39:09.146Z
Learning: In Snakemake module tests, targeting log-only files like "aaalog" can be an intentional test pattern to validate module behavior and error handling, even though log files aren't typically producible targets in normal workflows.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefile
📚 Learning: 2024-11-07T00:32:44.137Z
Learnt from: mbhall88
Repo: snakemake/snakemake PR: 3188
File: tests/test_script/scripts/test.sh:4-4
Timestamp: 2024-11-07T00:32:44.137Z
Learning: In test scripts within the Snakemake project, concise code is preferred over verbose error handling when simplicity suffices.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefile
📚 Learning: 2025-05-23T09:40:24.474Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3600
File: src/snakemake/jobs.py:960-964
Timestamp: 2025-05-23T09:40:24.474Z
Learning: In the `cleanup` method of the `Job` class in `src/snakemake/jobs.py`, files in the `to_remove` list should be formatted with `fmt_iofile` without specifying `as_output=True` or `as_input=True` parameters, as these files should be displayed as generic files rather than specifically as output files.
Applied to files:
tests/test_update_flag_fail_cleanup/Snakefile
🧬 Code graph analysis (1)
tests/tests.py (1)
tests/common.py (2)
dpath(33-36)run(152-503)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (7, macos-latest, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: apidocs
🔇 Additional comments (1)
tests/test_update_flag_fail_cleanup/Snakefile (1)
1-10: LGTM! Clean test case for update-marked output cleanup on failure.The test workflow correctly exercises the scenario where a job with
update()output fails after modifying the file, with no backup available to restore. The parse-time creation ofin.txtand the deliberate failure (exit 1) after appending to the output appropriately tests the cleanup logic introduced in this PR.Based on learnings
🤖 I have created a release *beep* *boop* --- ## [9.14.0](v9.13.7...v9.14.0) (2025-11-27) ### Features * Support Hy in script directive ([#3824](#3824)) ([2329c9e](2329c9e)) ### Bug Fixes * Add support for pathlib in `notebook` field ([#3811](#3811)) ([7b2180a](7b2180a)) * Addressed race condition in workdir_handler.py ([#3844](#3844)) ([8dbfcfb](8dbfcfb)) * cleanup update-marked output files of failed jobs if there was no backup to restore them ([#3843](#3843)) ([41f1ce8](41f1ce8)) * correct Windows callable path handling ([#3832](#3832)) ([5caad70](5caad70)) * expand env vars on resources ([#3823](#3823)) ([fcfa1bc](fcfa1bc)) * fix backup for output marked by `update` ([#3839](#3839)) ([09c64b7](09c64b7)) * Minor fixes/additions to logging module. ([#3802](#3802)) ([3b3986d](3b3986d)) * mount local storage prefix into containers ([#3840](#3840)) ([f1e8b62](f1e8b62)) * properly format input/output files in case of missing rule to produce them ([#3849](#3849)) ([69d5d24](69d5d24)) * Unpack AnnotatedString in _apply_wildcards ([#3798](#3798)) ([7886508](7886508)) ### Performance Improvements * retrieve storage inputs immediately before scheduling jobs instead of before running the entire workflow ([#3850](#3850)) ([4ac6cda](4ac6cda)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
…o backup to restore them (snakemake#3843) <!--Add a description of your PR here--> ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Cleanup now preserves outputs that were restored or explicitly marked to be kept after errors. * Restore-after-update now reports success/failure so restored outputs are correctly retained during cleanup. * Group-level cleanup simplified to rely on per-job cleanup to avoid accidental removal of recovered files. * **Tests** * Added a test validating cleanup behavior after a failed update to ensure updated outputs are removed or preserved as intended. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Felix Mölder <[email protected]>
🤖 I have created a release *beep* *boop* --- ## [9.14.0](snakemake/snakemake@v9.13.7...v9.14.0) (2025-11-27) ### Features * Support Hy in script directive ([snakemake#3824](snakemake#3824)) ([2329c9e](snakemake@2329c9e)) ### Bug Fixes * Add support for pathlib in `notebook` field ([snakemake#3811](snakemake#3811)) ([7b2180a](snakemake@7b2180a)) * Addressed race condition in workdir_handler.py ([snakemake#3844](snakemake#3844)) ([8dbfcfb](snakemake@8dbfcfb)) * cleanup update-marked output files of failed jobs if there was no backup to restore them ([snakemake#3843](snakemake#3843)) ([41f1ce8](snakemake@41f1ce8)) * correct Windows callable path handling ([snakemake#3832](snakemake#3832)) ([5caad70](snakemake@5caad70)) * expand env vars on resources ([snakemake#3823](snakemake#3823)) ([fcfa1bc](snakemake@fcfa1bc)) * fix backup for output marked by `update` ([snakemake#3839](snakemake#3839)) ([09c64b7](snakemake@09c64b7)) * Minor fixes/additions to logging module. ([snakemake#3802](snakemake#3802)) ([3b3986d](snakemake@3b3986d)) * mount local storage prefix into containers ([snakemake#3840](snakemake#3840)) ([f1e8b62](snakemake@f1e8b62)) * properly format input/output files in case of missing rule to produce them ([snakemake#3849](snakemake#3849)) ([69d5d24](snakemake@69d5d24)) * Unpack AnnotatedString in _apply_wildcards ([snakemake#3798](snakemake#3798)) ([7886508](snakemake@7886508)) ### Performance Improvements * retrieve storage inputs immediately before scheduling jobs instead of before running the entire workflow ([snakemake#3850](snakemake#3850)) ([4ac6cda](snakemake@4ac6cda)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.