fix: cache wrapper files and wait for them in case of shared filesystem for sources#3809
fix: cache wrapper files and wait for them in case of shared filesystem for sources#3809johanneskoester merged 5 commits intomainfrom
Conversation
📝 WalkthroughWalkthroughAdded optional skipping of input files in Job.get_wait_for_files and extended wait-listing for wrapper jobs to include wrapper script and conda env files. Wired a new runtime_source_cache_path through CLI, WorkflowSettings, SourceCache, spawn_jobs, and workflow initialization. Added qsub wrapper test and script; small wrapper error-message tweak. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller as Caller
participant Job as Job.get_wait_for_files
participant Wrapper as snakemake.wrapper
participant Group as GroupJob.get_wait_for_files
Caller->>Job: call get_wait_for_files(skip_input_files)
alt skip_input_files == False
Job->>Job: include local & storage inputs
else skip_input_files == True
Job->>Job: skip input files
end
alt is_wrapper == True
Job->>Wrapper: resolve script_path & conda_env_path
Wrapper-->>Job: script_path, conda_env_path
Job->>Job: append IOFile(script_path), IOFile(conda_env_path)
end
Job-->>Caller: return wait-for-files list
Note right of Group: aggregates child jobs with inputs skipped
Caller->>Group: call get_wait_for_files()
Group->>Job: job.get_wait_for_files(skip_input_files=True) [for each child]
Job-->>Group: child wait lists
Group-->>Caller: combined wait-for-files list
sequenceDiagram
autonumber
participant CLI as CLI
participant Settings as WorkflowSettings
participant Workflow as Workflow.__init__/post_init
participant SourceCache as SourceCache
CLI->>Settings: set runtime_source_cache_path (from --runtime-source-cache-path)
Settings-->>Workflow: workflow_settings contains runtime_source_cache_path
Workflow->>SourceCache: init SourceCache(cache_path, runtime_cache_path=workflow_settings.runtime_source_cache_path)
SourceCache-->>Workflow: SourceCache instance (aware of runtime path)
Workflow->>Spawn: spawn_jobs receives sourcecache.runtime_cache_path (via spawn args)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Please format your code with |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/snakemake/wrapper.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
src/snakemake/wrapper.py
🧬 Code graph analysis (1)
src/snakemake/wrapper.py (1)
src/snakemake/shell.py (1)
prefix(133-134)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (9, macos-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (4, macos-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (1, macos-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (1, ubuntu-latest, py312)
- GitHub Check: apidocs
| if script_source is None: | ||
| raise WorkflowError( | ||
| f"Unable to locate wrapper script for wrapper {path}. " | ||
| f"Unable to locate wrapper script at {prefix}{path}. " |
There was a problem hiding this comment.
Handle None prefix in error message.
The error message improvement is good, but if prefix is None, it will display "at None{path}". The get_path() function defaults None to PREFIX (lines 28-29), but this doesn't affect the prefix variable here.
Apply this diff to handle the None case consistently:
- f"Unable to locate wrapper script at {prefix}{path}. "
+ f"Unable to locate wrapper script at {prefix if prefix is not None else PREFIX}{path}. "📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| f"Unable to locate wrapper script at {prefix}{path}. " | |
| f"Unable to locate wrapper script at {prefix if prefix is not None else PREFIX}{path}. " |
🤖 Prompt for AI Agents
In src/snakemake/wrapper.py around line 107, the error message concatenates
prefix directly and can render "None{path}" if prefix is None; coerce prefix to
the same default used by get_path (e.g., PREFIX or an empty string) before
formatting the message so None is not shown. Update the code to set prefix =
prefix if prefix is not None else PREFIX (or call get_path() to obtain the
resolved prefix) and use that variable in the f-string.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
tests/test_wrapper/qsub (1)
1-6: Harden the test wrapper: quote args and use bash consistently.
- Quote "$1" to avoid globbing/word-splitting.
- Use bash to run the jobscript (shebang is bash).
- Add strict mode to fail fast.
Apply:
#!/bin/bash -echo `date` >> qsub.log -tail -n1 $1 >> qsub.log +set -euo pipefail +printf '%s\n' "$(date)" >> qsub.log +tail -n1 -- "$1" >> qsub.log # simulate printing of job id by a random number -echo $RANDOM -sh $1 +echo "$RANDOM" +bash "$1"src/snakemake/settings/types.py (1)
136-137: LGTM: new WorkflowSettings runtime_source_cache_path.Matches downstream usage and defaulting semantics. Consider documenting it's for internal plumbing to avoid user confusion.
src/snakemake/cli.py (1)
1349-1353: New flag placement OK; optionally move to INTERNAL group.Flag is marked “internal use” but sits in BEHAVIOR. Optional: move to the INTERNAL group for help clarity.
src/snakemake/spawn_jobs.py (1)
277-279: Also gate on spawned_jobs_assume_shared_fs to ensure consistency.If an executor assumes shared FS for spawned jobs, we should pass the runtime source cache path regardless of the local shared_fs_usage. Align the gating with get_shared_fs_usage_arg.
Apply:
- shared_source_cache = ( - SharedFSUsage.SOURCE_CACHE in self.workflow.storage_settings.shared_fs_usage - ) + if executor_common_settings.spawned_jobs_assume_shared_fs: + shared_source_cache = True + else: + shared_source_cache = ( + SharedFSUsage.SOURCE_CACHE + in self.workflow.storage_settings.shared_fs_usage + )Optional: base64-encode the path like other path args if you foresee spaces/odd chars:
- format_cli_arg( + format_cli_arg( "--runtime-source-cache-path", self.workflow.sourcecache.runtime_cache_path, - skip=not shared_source_cache, + skip=not shared_source_cache, + # base64_encode=True, # optional ),Also applies to: 351-355
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/snakemake/cli.py(2 hunks)src/snakemake/settings/types.py(1 hunks)src/snakemake/sourcecache.py(1 hunks)src/snakemake/spawn_jobs.py(2 hunks)src/snakemake/workflow.py(1 hunks)tests/test_wrapper/qsub(1 hunks)tests/tests_using_conda.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
src/snakemake/settings/types.pysrc/snakemake/spawn_jobs.pysrc/snakemake/workflow.pysrc/snakemake/sourcecache.pysrc/snakemake/cli.pytests/tests_using_conda.py
🧬 Code graph analysis (3)
src/snakemake/spawn_jobs.py (3)
src/snakemake/api.py (1)
workflow(114-168)src/snakemake/workflow.py (1)
sourcecache(500-501)src/snakemake/sourcecache.py (1)
runtime_cache_path(370-371)
src/snakemake/workflow.py (1)
src/snakemake/sourcecache.py (2)
SourceCache(349-458)runtime_cache_path(370-371)
tests/tests_using_conda.py (3)
src/snakemake/workflow.py (2)
conda(2148-2153)run(2346-2347)src/snakemake/deployment/conda.py (1)
conda(114-115)tests/common.py (1)
dpath(33-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (45)
- GitHub Check: tests (10, ubuntu-latest, py313)
- GitHub Check: tests (10, ubuntu-latest, py311)
- GitHub Check: tests (10, ubuntu-latest, py312)
- GitHub Check: tests (9, ubuntu-latest, py312)
- GitHub Check: tests (7, ubuntu-latest, py311)
- GitHub Check: tests (8, ubuntu-latest, py311)
- GitHub Check: tests (10, macos-latest, py313)
- GitHub Check: tests (7, windows-2022, py313)
- GitHub Check: tests (10, windows-2022, py313)
- GitHub Check: tests (8, ubuntu-latest, py312)
- GitHub Check: tests (6, windows-2022, py313)
- GitHub Check: tests (7, ubuntu-latest, py313)
- GitHub Check: tests (6, ubuntu-latest, py313)
- GitHub Check: tests (9, windows-2022, py313)
- GitHub Check: tests (9, ubuntu-latest, py311)
- GitHub Check: tests (6, ubuntu-latest, py311)
- GitHub Check: tests (7, ubuntu-latest, py312)
- GitHub Check: tests (6, ubuntu-latest, py312)
- GitHub Check: tests (6, macos-latest, py313)
- GitHub Check: tests (9, ubuntu-latest, py313)
- GitHub Check: tests (8, ubuntu-latest, py313)
- GitHub Check: tests (8, windows-2022, py313)
- GitHub Check: tests (5, macos-latest, py313)
- GitHub Check: tests (3, macos-latest, py313)
- GitHub Check: tests (4, windows-2022, py313)
- GitHub Check: tests (5, windows-2022, py313)
- GitHub Check: tests (1, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py313)
- GitHub Check: tests (3, ubuntu-latest, py312)
- GitHub Check: tests (3, windows-2022, py313)
- GitHub Check: tests (5, ubuntu-latest, py311)
- GitHub Check: tests (5, ubuntu-latest, py312)
- GitHub Check: tests (4, ubuntu-latest, py311)
- GitHub Check: tests (2, windows-2022, py313)
- GitHub Check: tests (4, ubuntu-latest, py313)
- GitHub Check: tests (4, ubuntu-latest, py312)
- GitHub Check: tests (3, ubuntu-latest, py311)
- GitHub Check: tests (3, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py311)
- GitHub Check: tests (2, ubuntu-latest, py312)
- GitHub Check: tests (2, macos-latest, py313)
- GitHub Check: tests (2, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py313)
- GitHub Check: tests (1, ubuntu-latest, py312)
🔇 Additional comments (5)
src/snakemake/workflow.py (1)
211-214: LGTM: SourceCache now respects runtime cache path.Wiring looks correct; SourceCache will create a TemporaryDirectory when unset and use the provided path otherwise.
tests/tests_using_conda.py (1)
137-146: LGTM: qsub wrapper test adds valuable coverage.Mirrors existing wrapper test with cluster submission; no issues spotted.
src/snakemake/cli.py (1)
2066-2071: LGTM: value flows into WorkflowSettings.Correctly threads args.runtime_source_cache_path into WorkflowSettings.
src/snakemake/sourcecache.py (1)
354-367: LGTM: optional runtime cache path is handled correctly.Constructor and property semantics make sense; lazy directory creation via cache_entry.parent.mkdir ensures path exists when needed.
src/snakemake/spawn_jobs.py (1)
381-383: All SourceCache instantiations are correctly updated. Verification confirms all 4 SourceCache constructor callsites in the codebase properly pass theruntime_cache_pathparameter (wrapper.py, workflow.py, script/init.py, notebook.py). Debug logging of args is acceptable as you noted; no blocking concerns found.
🤖 I have created a release *beep* *boop* --- ## [9.13.5](v9.13.4...v9.13.5) (2025-11-04) ### Bug Fixes * cache wrapper files and wait for them in case of shared filesystem for sources ([#3809](#3809)) ([498fff7](498fff7)) * correctly handle meta-wrapper tag replacement depending on the used snakemake-wrapper release ([#3826](#3826)) ([8d46a4a](8d46a4a)) * ensure that flags are properly considered for input files before applying path modifiers (i.e. default storage providers) ([#3813](#3813)) ([d3bfe32](d3bfe32)) * ensure that tokens are not leaked when paths or uris of source files are logged ([#3821](#3821)) ([a217e50](a217e50)) * print secs as numeric in jsonl benchmark format ([#3814](#3814)) ([395a5e6](395a5e6)) * revert breaking change in 9.11.9 disallowing empty input files even when unused ([#3810](#3810)) ([83c05cc](83c05cc)) * shorten report ids (thus dir names) in order to avoid issues with path length on windows ([#3822](#3822)) ([b24d971](b24d971)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
…em for sources (snakemake#3809) ### Description <!--Add a description of your PR here--> ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Refactor** * Improved job dependency handling with an option to skip input files when building wait lists. * Jobs now ensure wrapper scripts and their runtime environments are available before running. * Grouped-job dependency resolution now aggregates per-job wait lists consistently (skipping inputs where appropriate). * **New Features** * Added a CLI option to specify a runtime source-cache path and wired it into workflow settings. * **Tests** * Added a qsub wrapper script and corresponding test to broaden wrapper coverage. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
🤖 I have created a release *beep* *boop* --- ## [9.13.5](snakemake/snakemake@v9.13.4...v9.13.5) (2025-11-04) ### Bug Fixes * cache wrapper files and wait for them in case of shared filesystem for sources ([snakemake#3809](snakemake#3809)) ([498fff7](snakemake@498fff7)) * correctly handle meta-wrapper tag replacement depending on the used snakemake-wrapper release ([snakemake#3826](snakemake#3826)) ([8d46a4a](snakemake@8d46a4a)) * ensure that flags are properly considered for input files before applying path modifiers (i.e. default storage providers) ([snakemake#3813](snakemake#3813)) ([d3bfe32](snakemake@d3bfe32)) * ensure that tokens are not leaked when paths or uris of source files are logged ([snakemake#3821](snakemake#3821)) ([a217e50](snakemake@a217e50)) * print secs as numeric in jsonl benchmark format ([snakemake#3814](snakemake#3814)) ([395a5e6](snakemake@395a5e6)) * revert breaking change in 9.11.9 disallowing empty input files even when unused ([snakemake#3810](snakemake#3810)) ([83c05cc](snakemake@83c05cc)) * shorten report ids (thus dir names) in order to avoid issues with path length on windows ([snakemake#3822](snakemake#3822)) ([b24d971](snakemake@b24d971)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Description
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
Refactor
New Features
Tests