perf: retrieve storage inputs immediately before scheduling jobs instead of before running the entire workflow (#3850)
Conversation
📝 Walkthrough

Added a trivial utility (`func_true`) and moved storage-input retrieval from workflow startup into the job scheduler.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Workflow
    participant JobScheduler
    participant DAG
    participant Storage
    Workflow->>JobScheduler: schedule jobs / decide execution mode
    alt Local job & not dry run & not subprocess
        JobScheduler->>DAG: retrieve_storage_inputs(jobs)
        DAG->>Storage: fetch inputs (possibly async)
    else Remote job & not dry run & SharedFSUsage == LOCAL_COPIES
        JobScheduler->>DAG: async retrieve_storage_inputs(jobs)
        DAG->>Storage: prefetch inputs to local copies
    else Other cases
        JobScheduler->>DAG: no prefetch (defer to job execution)
    end
```
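The branching in the diagram above can be sketched as plain Python. This is an illustrative standalone sketch under assumed names (`SharedFSUsage`, `subprocess_exec`), not the actual scheduler code:

```python
from enum import Enum


class SharedFSUsage(Enum):
    # Mirrors the Snakemake setting referenced in the diagram (illustrative).
    STORAGE_LOCAL_COPIES = "storage-local-copies"


def prefetch_mode(is_local, dryrun, subprocess_exec, shared_fs_usage):
    """Decide how storage inputs are prefetched for a batch of jobs."""
    if dryrun:
        return "defer"  # dry runs never touch storage
    if is_local and not subprocess_exec:
        return "sync"   # retrieve inputs right before running local jobs
    if not is_local and SharedFSUsage.STORAGE_LOCAL_COPIES in shared_fs_usage:
        return "async"  # prefetch local copies for remote jobs on a shared FS
    return "defer"      # otherwise the job fetches its own inputs


print(prefetch_mode(True, False, False, set()))                                  # sync
print(prefetch_mode(False, False, False, {SharedFSUsage.STORAGE_LOCAL_COPIES}))  # async
print(prefetch_mode(False, True, False, set()))                                  # defer
```

The key point of the PR is that this decision now happens per scheduling round rather than once before the whole workflow.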
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Possibly related PRs
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/snakemake/common/__init__.py (1 hunks)
- src/snakemake/dag.py (4 hunks)
- src/snakemake/scheduling/job_scheduler.py (3 hunks)
- src/snakemake/workflow.py (0 hunks)
💤 Files with no reviewable changes (1)
- src/snakemake/workflow.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.
Files:
- src/snakemake/common/__init__.py
- src/snakemake/scheduling/job_scheduler.py
- src/snakemake/dag.py
🧠 Learnings (3)
📚 Learning: 2024-10-04T16:12:18.927Z
Learnt from: lczech
Repo: snakemake/snakemake PR: 3113
File: snakemake/scheduler.py:912-914
Timestamp: 2024-10-04T16:12:18.927Z
Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
Applied to files:
src/snakemake/scheduling/job_scheduler.py
📚 Learning: 2025-07-29T14:53:04.598Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3676
File: src/snakemake/cli.py:0-0
Timestamp: 2025-07-29T14:53:04.598Z
Learning: In the Snakemake repository, deprecated scheduler interfaces (like scheduler_ilp_solver, --scheduler-solver-path, --scheduler-greediness) should be kept for backward compatibility during the migration to the plugin-based scheduler system, rather than being removed immediately.
Applied to files:
src/snakemake/scheduling/job_scheduler.py
📚 Learning: 2024-10-14T09:42:11.571Z
Learnt from: johanneskoester
Repo: snakemake/snakemake PR: 3140
File: snakemake/dag.py:1308-1308
Timestamp: 2024-10-14T09:42:11.571Z
Learning: In `snakemake/dag.py`, avoid flagging missing lines or indentation issues when there is no clear syntax or logical error to prevent false positives.
Applied to files:
src/snakemake/dag.py
🧬 Code graph analysis (2)
src/snakemake/scheduling/job_scheduler.py (4)
- src/snakemake/settings/types.py (1): `MaxJobsPerTimespan` (63-87)
- src/snakemake/workflow.py (3): `dryrun` (419-423), `is_main_process` (273-274), `dag` (520-521)
- src/snakemake/common/__init__.py (1): `async_run` (96-113)
- src/snakemake/dag.py (2): `retrieve_storage_inputs` (407-449), `jobs` (605-607)

src/snakemake/dag.py (1)
- src/snakemake/common/__init__.py (1): `func_true` (51-52)
🪛 Ruff (0.14.6)
src/snakemake/common/__init__.py
51-51: Unused function argument: job
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, macos-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: apidocs
🔇 Additional comments (6)
src/snakemake/scheduling/job_scheduler.py (1)
34-35: Storage-input retrieval for local and remote jobs is consistent with shared-FS semantics

The revised scheduling logic looks sound:

- Local jobs: when not a dry run, storage inputs are prefetched via `dag.retrieve_storage_inputs(..., also_missing_internal=True)` before invoking `self.run`, which keeps behavior simple and predictable.
- Remote jobs: guarded by `assert self.workflow.is_main_process` and `SharedFSUsage.STORAGE_LOCAL_COPIES in self.workflow.storage_settings.shared_fs_usage`, so prefetching happens only when local copies are actually shared; otherwise remote jobs download inputs themselves.

This is consistent with how `SharedFSUsage.STORAGE_LOCAL_COPIES` is used elsewhere (e.g. in `DAG.cleanup_storage_objects`) and doesn't introduce obvious correctness or performance regressions.

Also applies to: 339-378
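What "async retrieve storage inputs" amounts to can be sketched with `asyncio`. The `fetch` coroutine here is a hypothetical stand-in for a storage plugin download, not the real `DAG.retrieve_storage_inputs`:

```python
import asyncio


async def fetch(path):
    # Stand-in for a storage plugin download (hypothetical).
    await asyncio.sleep(0)
    return f"local/{path}"


async def retrieve_storage_inputs(files):
    # Download all storage-backed inputs of the scheduled jobs concurrently.
    return await asyncio.gather(*(fetch(f) for f in files))


copies = asyncio.run(retrieve_storage_inputs(["a.txt", "b.txt"]))
print(copies)  # ['local/a.txt', 'local/b.txt']
```

Concurrent download per scheduling batch is what keeps the prefetch step from serializing the scheduler.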
src/snakemake/dag.py (5)
38-44: Reusing `func_true` and `is_local_file` from `snakemake.common` looks good

Importing `func_true` for `is_external_input` and `is_local_file` for `get_sources` centralizes these helpers in `snakemake.common` and keeps behavior consistent across modules. No issues spotted with these imports or their usages.
163-165: Storage-input job bookkeeping (`_storage_input_jobs`) is coherent with cleanup logic

The combination of:

- Initializing `self._storage_input_jobs = defaultdict(list)` in `__init__`,
- Populating it in `update_storage_inputs()` from `needrun_jobs()`, and
- Using it in `handle_storage()` to remove local storage copies only after all jobs that use a given storage input have finished,

is internally consistent and respects the "don't drop inputs while other jobs still need them" invariant.

The call to `update_storage_inputs()` from `postprocess()` ensures this mapping is refreshed after DAG updates, so the cleanup decision in `handle_storage()` remains accurate throughout the run.

Also applies to: 451-457, 1070-1093
3069-3083: `is_external_input`'s predicate refactor matches the documented behavior

The refactored implementation:

- Uses `consider = func_true` by default, so a file is non-external if any dependency produces it, regardless of its `needrun` status, matching "may not be created by any other job to be considered external."
- Switches to `consider = self.needrun` when `not_needrun_is_external=True`, so only producers that are still `needrun` are treated as internal; files produced solely by non-`needrun` jobs are treated as external, as described in the docstring.

This aligns both default usage (e.g. in `archive()`) and the special-case behavior used by `retrieve_storage_inputs`.
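The predicate switch can be sketched minimally, with a toy producer map standing in for the real DAG dependencies (names and shapes here are assumptions for illustration):

```python
def func_true(job):
    # Default predicate: consider every producing job.
    return True


def is_external_input(f, producers, needrun, not_needrun_is_external=False):
    """f is external if no considered dependency produces it."""
    consider = needrun.__contains__ if not_needrun_is_external else func_true
    return not any(consider(job) for job in producers.get(f, []))


producers = {"data.txt": ["make_data"]}
print(is_external_input("raw.txt", producers, set()))   # True: nothing produces it
print(is_external_input("data.txt", producers, set()))  # False: make_data produces it
# Producer exists but is not in the needrun set -> treated as external:
print(is_external_input("data.txt", producers, set(), not_needrun_is_external=True))  # True
```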
1907-1913: Calling `update_storage_inputs()` during `postprocess` keeps storage cleanup decisions up to date

Including `self.update_storage_inputs()` at the end of `postprocess` ensures `_storage_input_jobs` is recomputed after each DAG update (e.g. from checkpoints or queue-input updates) before new jobs start to run. That way, `handle_storage()` always bases its "all other jobs finished" check on the current `needrun` set rather than a stale snapshot.

This is the right place in the lifecycle to refresh that mapping; no issues here.
407-424: All call sites of `retrieve_storage_inputs()` pass the `jobs` argument explicitly; verification confirms no breaking changes.

Both call sites in `src/snakemake/scheduling/job_scheduler.py` (lines 352 and 374) already pass `jobs=...` as required by the updated signature. No legacy calls without arguments exist in the codebase.
```python
def func_true(job):
    return True
```
Suppress Ruff ARG001 by marking the `job` argument intentionally unused

`func_true` needs to accept a job for signature compatibility but does not use it, which triggers Ruff ARG001. Renaming the parameter makes the intent explicit and should silence the warning:

```diff
-def func_true(job):
-    return True
+def func_true(_job):
+    return True
```
+ return True📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def func_true(job): | |
| return True | |
| def func_true(_job): | |
| return True |
🧰 Tools
🪛 Ruff (0.14.6)
51-51: Unused function argument: job
(ARG001)
🤖 Prompt for AI Agents
In src/snakemake/common/__init__.py around lines 51 to 52, func_true currently
takes a parameter named job but does not use it, triggering Ruff ARG001; rename
the parameter to _job (or _ ) to signal it is intentionally unused while
preserving the function signature, and update any local references or callers if
they rely on the parameter name so the change is safe.
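A quick way to see why the underscore rename works: Ruff's ARG rules treat arguments matching the documented default `dummy-variable-rgx` as intentionally unused, and that pattern accepts leading-underscore names. The sketch below checks the regex directly (illustrative, based on Ruff's documented default, which may change between versions):

```python
import re

# Ruff's documented default dummy-variable-rgx; arguments matching it are
# considered intentionally unused by the flake8-unused-arguments (ARG) rules.
DUMMY = re.compile(r"^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$")


def func_true(_job):
    # Signature-compatible predicate; the argument is deliberately ignored.
    return True


print(bool(DUMMY.match("job")))   # False: would trigger ARG001
print(bool(DUMMY.match("_job")))  # True: silenced by the rename
```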
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/snakemake/scheduling/job_scheduler.py (1)
34-35: Scheduler-side storage prefetching wiring looks sound; `also_missing_internal` usage could be simplified

The integration here is coherent:

- Local jobs: prefetching storage inputs only when not in a dry run and not in `subprocess_exec` avoids unnecessary IO in dry-run and subprocess contexts while still ensuring local jobs see their inputs.
- Remote jobs: guarding retrieval with `SharedFSUsage.STORAGE_LOCAL_COPIES in shared_fs_usage` and asserting execution in the main process lines up with the intent that only a shared FS should host pre-fetched local copies; otherwise, remote workers continue to fetch from storage themselves.

One minor semantic nit: `retrieve_storage_inputs()` is always called with `also_missing_internal=True`, which means it will prefetch all storage-backed inputs for the scheduled jobs, and the new `is_external_input(..., not_needrun_is_external=True)` branch is effectively unused. If the long-term intent is to distinguish "true external" inputs from internal ones for prefetching, consider either:

- Passing `also_missing_internal=False` where you only want external inputs, or
- Dropping/simplifying the `also_missing_internal` parameter if you never plan to call it with `False`.

This is about API clarity and avoiding dead branching rather than a functional bug.

Also applies to: 349-355, 362-377
src/snakemake/dag.py (1)

38-44: Batch-scoped storage-input retrieval and `is_external_input` semantics look consistent; note change to `is_ondemand_eligible` heuristic

The added `func_true` import and its use in `is_external_input()` give a clear default ("consider all deps") while allowing the `not_needrun_is_external` flag to narrow producers to `needrun` jobs only. The new docstring accurately reflects the behavior.

`retrieve_storage_inputs(self, jobs, also_missing_internal=False)` now:

- Works on an explicit `jobs` subset rather than the whole DAG.
- Uses `is_external_input(f, job, not_needrun_is_external=True)` only when `also_missing_internal` is `False`; otherwise it simply prefetches all storage inputs for the given jobs.
- Computes `file_access_patterns` only over that `jobs` subset and sets `f.storage_object.is_ondemand_eligible` accordingly.

The last point slightly changes the meaning of the comment about "multiple jobs" accessing the same file: when the same storage file is consumed by jobs that end up in different scheduling rounds, no single call will ever see more than one consumer, so `len(file_access_patterns)` can remain `1` and the file will often be marked `is_ondemand_eligible=True` even though it is reused across multiple jobs.

Given `is_ondemand_eligible` is just a performance hint to storage plugins, this is harmless from a correctness perspective, but it does weaken the global heuristic. If you still want to approximate "multiple jobs across the whole DAG", you could consider augmenting this hint with information from `self._storage_input_jobs` or another precomputed aggregate; otherwise, it may be worth updating the comment to reflect that the heuristic is now batch-local.

Overall, the refactor of `retrieve_storage_inputs` and `is_external_input` is internally consistent and aligns with the new scheduler-side usage.

Also applies to: 407-449, 3069-3083
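The batch-locality concern can be demonstrated with a toy counter (assumed shapes, not the real DAG code): counting consumers per scheduling batch misses reuse across rounds.

```python
from collections import defaultdict


def ondemand_eligible(jobs):
    """Mark a storage file on-demand eligible if, within this batch,
    only a single consuming job is seen."""
    file_access = defaultdict(set)
    for job, inputs in jobs:
        for f in inputs:
            file_access[f].add(job)
    return {f: len(consumers) == 1 for f, consumers in file_access.items()}


# Whole-DAG view: two jobs share shared.txt -> not eligible.
print(ondemand_eligible([("a", ["shared.txt"]), ("b", ["shared.txt"])]))
# Batch-local view: each round sees only one consumer -> eligible in both rounds.
print(ondemand_eligible([("a", ["shared.txt"])]))
print(ondemand_eligible([("b", ["shared.txt"])]))
```

Since the hint only affects plugin-side performance, the divergence is benign, but the example shows why the old comment no longer describes what the code measures.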
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/snakemake/dag.py (5 hunks)
- src/snakemake/scheduling/job_scheduler.py (3 hunks)
🧬 Code graph analysis (2)

src/snakemake/scheduling/job_scheduler.py (4)
- src/snakemake/settings/types.py (1): `MaxJobsPerTimespan` (63-87)
- src/snakemake/workflow.py (4): `dryrun` (419-423), `subprocess_exec` (462-463), `is_main_process` (273-274), `dag` (520-521)
- src/snakemake/common/__init__.py (1): `async_run` (96-113)
- src/snakemake/dag.py (2): `retrieve_storage_inputs` (407-449), `jobs` (605-607)

src/snakemake/dag.py (1)
- src/snakemake/common/__init__.py (1): `func_true` (51-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (45)
🔇 Additional comments (1)

src/snakemake/dag.py (1)

1075-1080: Avoiding cleanup of target storage outputs is a good safeguard

Extending `unneeded_files()` in `handle_storage()` to skip files that are in `self.targetfiles` before removing local storage copies tightens correctness wrt user expectations: final targets aren't pruned locally just because no further jobs depend on them. This aligns well with typical "target files should remain available" semantics and doesn't introduce any obvious edge cases given existing uses of `self.targetfiles`.
🤖 I have created a release *beep* *boop*

## [9.14.0](v9.13.7...v9.14.0) (2025-11-27)

### Features

* Support Hy in script directive ([#3824](#3824)) ([2329c9e](2329c9e))

### Bug Fixes

* Add support for pathlib in `notebook` field ([#3811](#3811)) ([7b2180a](7b2180a))
* Addressed race condition in workdir_handler.py ([#3844](#3844)) ([8dbfcfb](8dbfcfb))
* cleanup update-marked output files of failed jobs if there was no backup to restore them ([#3843](#3843)) ([41f1ce8](41f1ce8))
* correct Windows callable path handling ([#3832](#3832)) ([5caad70](5caad70))
* expand env vars on resources ([#3823](#3823)) ([fcfa1bc](fcfa1bc))
* fix backup for output marked by `update` ([#3839](#3839)) ([09c64b7](09c64b7))
* Minor fixes/additions to logging module. ([#3802](#3802)) ([3b3986d](3b3986d))
* mount local storage prefix into containers ([#3840](#3840)) ([f1e8b62](f1e8b62))
* properly format input/output files in case of missing rule to produce them ([#3849](#3849)) ([69d5d24](69d5d24))
* Unpack AnnotatedString in _apply_wildcards ([#3798](#3798)) ([7886508](7886508))

### Performance Improvements

* retrieve storage inputs immediately before scheduling jobs instead of before running the entire workflow ([#3850](#3850)) ([4ac6cda](4ac6cda))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Hi @johanneskoester, I'm not sure why, but the following doesn't work for me on Snakemake 9.14.0, Slurm plugin 2.0.0: I get an error (see collapsed error output). The changes in this pull request may have something to do with it, so I thought I would ask for help here. Is there any way to fix this problem?
This saves local disk space. After each job is finished, Snakemake already ensures that local storage copies are deleted once not needed anymore. Hence, in combination with this PR, the lifetime of local copies on disk will be minimized.

### QC

* [x] The PR contains a test case for the changes or the changes are already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

### Summary by CodeRabbit

* **Refactor**: Simplified and unified storage-input retrieval flow, changing when preflight retrieval runs and clarifying external-input determination.
* **New Features**: Asynchronous prefetching of storage inputs for remote jobs when local copies are configured.
* **Style**: Consistent file I/O formatting and logging output across job reporting.

✏️ Tip: You can customize this high-level summary in your review settings.