
Commit bdcd36c

Merge branch 'main' into feat/issue-3395
2 parents a1e9c25 + 7c6f232 commit bdcd36c

48 files changed

+602
-159
lines changed

CHANGELOG.md

Lines changed: 29 additions & 1 deletion
@@ -1,6 +1,34 @@
11
# Changelog
22

33

4+
## [9.1.0](https://github.com/snakemake/snakemake/compare/v9.0.1...v9.1.0) (2025-03-21)
5+
6+
7+
### Features
8+
9+
* adapt to changes in storage plugin interface, now passing the logger to the storage provider ([#3460](https://github.com/snakemake/snakemake/issues/3460)) ([ac34f11](https://github.com/snakemake/snakemake/commit/ac34f11e704730c0bf0b9a89adcfb8909fb0baf0))
10+
* finalized support for access pattern annotation (sequential, random, multi), allowing storage plugins to decide about the most efficient provisioning approach (e.g. mounting vs. downloading a local copy) ([#3461](https://github.com/snakemake/snakemake/issues/3461)) ([871c5ab](https://github.com/snakemake/snakemake/commit/871c5ab83def456f7f239f212080bcef447dc08b))
11+
* introduce ability to annotate access pattern (will e.g. be usable for optimizations in storage plugins) ([#3459](https://github.com/snakemake/snakemake/issues/3459)) ([6e5e65b](https://github.com/snakemake/snakemake/commit/6e5e65b4184b842df1519469f62e847fc52c2d20))
12+
13+
14+
### Bug Fixes
15+
16+
* only warn upon positional parameter overwrite with "use rule" in case the number of positional parameters changes ([#3457](https://github.com/snakemake/snakemake/issues/3457)) ([ec18f98](https://github.com/snakemake/snakemake/commit/ec18f98fd94e6133980b4575f932be841e3d2dbf))
17+
* setup logger for non-executing subcommands ([de5c7a3](https://github.com/snakemake/snakemake/commit/de5c7a35a5d439c073194cf8c30984463180a471))
18+
19+
20+
### Documentation
21+
22+
* Fix indentation of datavzrd configuration of tutorial ([#3465](https://github.com/snakemake/snakemake/issues/3465)) ([96ec187](https://github.com/snakemake/snakemake/commit/96ec18700e3616794c4da72937e879f7e789b8e7))
23+
* fix profiles linking, add Sphinx rst syntax infos ([#3453](https://github.com/snakemake/snakemake/issues/3453)) ([bbb0284](https://github.com/snakemake/snakemake/commit/bbb0284347351b46e8f05679a54aa1e262402e1d))
24+
25+
## [9.0.1](https://github.com/snakemake/snakemake/compare/v9.0.0...v9.0.1) (2025-03-14)
26+
27+
28+
### Bug Fixes
29+
30+
* group job error call ([#3448](https://github.com/snakemake/snakemake/issues/3448)) ([3d53863](https://github.com/snakemake/snakemake/commit/3d5386393b0f580c5e44819ffddfe13562dc248d))
31+
432
## [9.0.0](https://github.com/snakemake/snakemake/compare/v8.30.0...v9.0.0) (2025-03-14)
533

634

@@ -11,7 +39,7 @@
1139
### Features
1240

1341
* [#3412](https://github.com/snakemake/snakemake/issues/3412) - keep shadow folder of failed job if --keep-incomplete flag is set. ([#3430](https://github.com/snakemake/snakemake/issues/3430)) ([22978c3](https://github.com/snakemake/snakemake/commit/22978c3a9479d0f0a94f33ea74e91ce06f83d2d7))
14-
* add flag --report-after-run to automatically generate the report after a successfull workflow run ([#3428](https://github.com/snakemake/snakemake/issues/3428)) ([b0a7f03](https://github.com/snakemake/snakemake/commit/b0a7f03e824beae5985e542d80d46c3d75bfc823))
42+
* add flag --report-after-run to automatically generate the report after a successful workflow run ([#3428](https://github.com/snakemake/snakemake/issues/3428)) ([b0a7f03](https://github.com/snakemake/snakemake/commit/b0a7f03e824beae5985e542d80d46c3d75bfc823))
1543
* add flatten function to IO utils ([#3424](https://github.com/snakemake/snakemake/issues/3424)) ([67fa392](https://github.com/snakemake/snakemake/commit/67fa392c1eedab2c7b0aaa5c38ac1d9403912497))
1644
* add helper functions to parse input files ([#2918](https://github.com/snakemake/snakemake/issues/2918)) ([63e45a7](https://github.com/snakemake/snakemake/commit/63e45a70ae57bc46b345c69b7c2f18d7c811b176))
1745
* Add option to print redacted file names ([#3089](https://github.com/snakemake/snakemake/issues/3089)) ([ba4d264](https://github.com/snakemake/snakemake/commit/ba4d2644aab18b43a8704e883f48c428d8b35d5a))

docs/executing/cli.rst

Lines changed: 6 additions & 2 deletions
@@ -73,7 +73,11 @@ Non-local execution
7373
^^^^^^^^^^^^^^^^^^^
7474

7575
Non-local execution on cluster or cloud infrastructure is implemented via plugins.
76-
The `Snakemake plugin catalog <https://snakemake.github.io/snakemake-plugin-catalog>`_ lists available plugins and their documentation.
76+
The `Snakemake plugin catalog <https://snakemake.github.io/snakemake-plugin-catalog>`__ lists available plugins and their documentation.
77+
In general, the configuration boils down to specifying an executor plugin (e.g. for SLURM or Kubernetes) and, if needed, a :ref:`storage <default_storage>` plugin (e.g. in order to use S3 for input and output files or in order to efficiently use a shared network filesystem).
78+
For maximizing the I/O performance over the network, it can be advisable to :ref:`annotate the input file access patterns of rules <storage-access-patterns>`.
79+
Snakemake provides lots of tunables for non-local execution, which can all be found under :ref:`all_options` and in the plugin descriptions of the `Snakemake plugin catalog <https://snakemake.github.io/snakemake-plugin-catalog>`__.
80+
In any case, the cluster or cloud specific configuration will entail lots of command line options to be chosen and set, which should be persisted in a :ref:`profile <executing-profiles>`.
7781

7882
Dealing with very large workflows
7983
---------------------------------
@@ -106,7 +110,7 @@ Snakemake will process beyond the rule ``myrule``, because all of its input file
106110
Obviously, a good choice of the rule to perform the batching is a rule that has a lot of input files and upstream jobs, for example a central aggregation step within your workflow.
107111
We advise all workflow developers to inform potential users of the best-suited batching rule.
108112

109-
.. _profiles:
113+
.. _executing-profiles:
110114

111115
--------
112116
Profiles

docs/getting_started/migration.rst

Lines changed: 7 additions & 1 deletion
@@ -11,6 +11,12 @@ Sometimes, new features are added that do not require, but make it strongly advi
1111

1212
Below are migration hints for particular Snakemake versions.
1313

14+
Migrating to Snakemake 9
15+
------------------------
16+
17+
Between Snakemake 8 and Snakemake 9, there is only a single breaking change in how custom loggers are provided, such that hardly any user should be affected.
18+
Custom log handlers are now specified via a logger plugin, using ``--logger`` on the command line or ``OutputSettings.log_handler_settings`` in the API.
19+
1420
Migrating to Snakemake 8
1521
------------------------
1622

@@ -571,7 +577,7 @@ Profiles
571577
^^^^^^^^
572578

573579
Profiles can now be versioned.
574-
If your profile makes use of settings that are available in version 8 or later, use the filename ``config.v8+.yaml`` for the profile configuration (see :ref:`profiles <profiles>`).
580+
If your profile makes use of settings that are available in version 8 or later, use the filename ``config.v8+.yaml`` for the profile configuration (see :ref:`executing-profiles`).
575581

576582
API
577583
^^^

docs/project_info/contributing.rst

Lines changed: 11 additions & 4 deletions
@@ -56,8 +56,10 @@ Write Documentation
5656

5757
Snakemake could always use more documentation, whether as part of the official docs, in docstrings, or even on the web in blog posts, articles, and such.
5858

59-
Snakemake uses `Sphinx <https://sphinx-doc.org>`_ for the user manual (that you are currently reading).
60-
See :ref:`project_info-doc_guidelines` on how the documentation reStructuredText is used.
59+
.. _Sphinx: https://sphinx-doc.org
60+
61+
Snakemake uses `Sphinx`_ for the user manual (that you are currently reading).
62+
See :ref:`project_info-doc_guidelines` on how the reStructuredText is used for the documentation.
6163

6264

6365

@@ -250,11 +252,16 @@ The existing unit tests should all cope with this, and in general you should avo
250252
Documentation Guidelines
251253
========================
252254

255+
The documentation uses `Sphinx`_ and is written in ``reStructuredText``.
256+
For details on the syntax, see the `Sphinx primer on reStructuredText <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#rst-primer>`_ and the `Sphinx documentation on cross-references <https://www.sphinx-doc.org/en/master/usage/referencing.html>`_.
257+
253258
For the documentation, please adhere to the following guidelines:
254259

255260
- Put each sentence on its own line, this makes tracking changes through Git SCM easier.
256-
- Provide hyperlink targets, at least for the first two section levels.
257-
For this, use the format ``<document_part>-<section_name>``, e.g., ``project_info-doc_guidelines``.
261+
- Provide `hyperlink targets <https://www.sphinx-doc.org/en/master/usage/referencing.html#cross-referencing-arbitrary-locations>`_, at least for the first two section levels.
262+
For this, use the format ``<document_part>-<section_name>``, for example ``project_info-doc_guidelines`` for the current section.
263+
Set the hyperlink target right above the section heading with ``.. _project_info-doc_guidelines:``.
264+
Reference the hyperlink (i.e. link to it) with ``:ref:`project_info-doc_guidelines```.
258265
- Use the `section structure recommended by Sphinx <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#sections>`_, which references the `recommendations in the Python Developer's Guide <https://devguide.python.org/documentation/markup/#sections>`_.
259266
Namely, the levels are:
260267

docs/snakefiles/configuration.rst

Lines changed: 1 addition & 1 deletion
@@ -226,4 +226,4 @@ Usually, it is preferred to only set the working directory via the command line,
226226
Cluster Configuration (not supported anymore)
227227
---------------------------------------------
228228

229-
The previously supported cluster configuration has been replaced by configuration profiles (see :ref:`profiles`).
229+
The previously supported cluster configuration has been replaced by configuration profiles (see :ref:`executing-profiles`).

docs/snakefiles/rules.rst

Lines changed: 36 additions & 3 deletions
@@ -23,7 +23,7 @@ However, rules can be much more complex, may use :ref:`plain python <snakefiles-
2323

2424
Inside the shell command, all local and global variables, especially input and output files can be accessed via their names in the `python format minilanguage <https://docs.python.org/py3k/library/string.html#formatspec>`_.
2525
Here, input and output (and in general any list or tuple) automatically evaluate to a space-separated list of files (i.e. ``path/to/inputfile path/to/other/inputfile``).
26-
From Snakemake 3.8.0 on, adding the special formatting instruction ``:q`` (e.g. ``"somecommand {input:q} {output:q}")``) will let Snakemake quote each of the list or tuple elements that contains whitespace.
26+
From Snakemake 3.8.0 on, adding the special formatting instruction ``:q`` (e.g. ``"somecommand {input:q} {output:q}"``) will let Snakemake quote each of the list or tuple elements that contains whitespace.
2727

2828
.. note::
2929

@@ -842,7 +842,7 @@ Snakemake will always round the calculated value down (while enforcing a minimum
842842

843843
Starting from version 3.7, threads can also be a callable that returns an ``int`` value. The signature of the callable should be ``callable(wildcards[, input])`` (input is an optional parameter). It is also possible to refer to a predefined variable (e.g., ``threads: threads_max``) so that the number of cores for a set of rules can be changed in one place by altering the value of the variable ``threads_max``.
844844

845-
Both threads can be defined (or overwritten) upon invocation (without modifying the workflow code) via `--set-threads` see :ref:`all_options` and via workflow profiles, see :ref:`profiles`.
845+
Both threads can be defined (or overwritten) upon invocation (without modifying the workflow code) via `--set-threads` see :ref:`all_options` and via workflow profiles, see :ref:`executing-profiles`.
846846
To quickly exemplify the latter, you could provide the following workflow profile in a file ``profiles/default/config.yaml`` relative to the Snakefile or the current working directory:
847847

848848
.. code-block:: yaml
@@ -957,7 +957,7 @@ Here, the value that the function ``get_mem_mb`` returns, grows linearly with th
957957
Of course, any other arithmetic could be performed in that function.
958958

959959
Both threads and resources can be defined (or overwritten) upon invocation (without modifying the workflow code) via `--set-threads` and `--set-resources`, see :ref:`all_options`.
960-
Or they can be defined via workflow :ref:`profiles`, with the variables listed above in the signature for usable callables.
960+
Or they can be defined via workflow :ref:`executing-profiles`, with the variables listed above in the signature for usable callables.
961961
You could, for example, provide the following workflow profile in a file ``profiles/default/config.yaml`` relative to the Snakefile or the current working directory:
962962

963963
.. code-block:: yaml
@@ -1799,6 +1799,8 @@ or the short form
17991799
will generate skeleton code in ``notebooks/hello.py.ipynb`` and additionally print instructions on how to open and execute the notebook in VSCode.
18001800

18011801

1802+
.. _snakefiles_protected_temp:
1803+
18021804
Protected and Temporary Files
18031805
-----------------------------
18041806

@@ -3058,6 +3060,37 @@ To avoid such leaks (only required if your template does something like that wit
30583060
shell:
30593061
"sometool {input} {output}"
30603062
3063+
.. _snakefiles_default_flags:
3064+
3065+
Setting default flags
3066+
---------------------
3067+
3068+
Snakemake allows the annotation of input and output files via so-called flags (see e.g. :ref:`snakefiles_protected_temp`).
3069+
Sometimes, it can be useful to define that a certain flag shall be applied to all input or output files of a workflow.
3070+
This can be achieved via the global ``inputflags`` and ``outputflags`` directives.
3071+
Consider the following example:
3072+
3073+
.. code-block:: python
3074+
3075+
outputflags:
3076+
temp
3077+
3078+
rule a:
3079+
output:
3080+
"test.out"
3081+
shell:
3082+
"echo test > {output}"
3083+
3084+
This automatically marks the output file of rule ``a`` as temporary.
3085+
The most convenient use case of this mechanism occurs in combination with :ref:`access pattern annotation <storage-access-patterns>`.
3086+
In this case, the default access pattern can be set globally for all output files of a workflow.
3087+
Only the few cases that differ then need an explicit access pattern annotation (see :ref:`storage-access-patterns` for an example).
3088+
Whenever a rule defines a flag for a file, this flag will override the default flag of the same kind or any contradicting default flags (e.g. ``temp`` will override ``protected``).
3089+
3090+
Such default input and output flag specifications are always valid for all rules that follow them in the workflow definition.
3091+
Importantly, they are also "namespaced" per module, meaning that ``inputflags`` and ``outputflags`` directives in a module only apply to the rules defined in that module.
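
The override behavior described in this section can be pictured as a small resolution step per file. The following is a hypothetical sketch in plain Python, not Snakemake's internal implementation: the grouping of flags into mutually exclusive sets (``EXCLUSIVE_GROUPS``) and the function ``effective_flags`` are assumptions made purely for illustration.

```python
# Assumed groups of flags that cannot be combined on the same file.
EXCLUSIVE_GROUPS = [
    {"temp", "protected"},
    {"access.sequential", "access.random", "access.multi"},
]


def effective_flags(default_flags, rule_flags):
    """Combine workflow-wide default flags with flags set in a rule.

    A rule-level flag replaces any default flag from the same exclusive group.
    """
    result = set(default_flags)
    for flag in rule_flags:
        for group in EXCLUSIVE_GROUPS:
            if flag in group:
                # Drop contradicting defaults such as "protected" vs. "temp".
                result -= group
        result.add(flag)
    return result
```

Under these assumptions, ``effective_flags({"protected"}, {"temp"})`` yields only ``temp``, while a default that the rule does not contradict is kept unchanged.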
3092+
3093+
30613094
.. _snakefiles_mpi_support:
30623095

30633096
MPI support

docs/snakefiles/storage.rst

Lines changed: 59 additions & 0 deletions
@@ -43,6 +43,8 @@ In general, there are four ways to use a storage provider.
4343
Using the S3 storage plugin, we will provide an example for all of the cases below.
4444
For provider specific options (also for all options of the S3 plugin which are omitted here for brevity) and all available plugins see the `Snakemake plugin catalog <https://snakemake.github.io/snakemake-plugin-catalog>`_.
4545

46+
.. _default_storage:
47+
4648
As default provider
4749
^^^^^^^^^^^^^^^^^^^
4850
If you want all your input and output (which is not explicitly marked to come from
@@ -223,3 +225,60 @@ Usually, this can be done via environment variables, e.g. for S3::
223225

224226
export SNAKEMAKE_STORAGE_S3_ACCESS_KEY=...
225227
export SNAKEMAKE_STORAGE_S3_SECRET_KEY=...
228+
229+
.. _storage-access-patterns:
230+
231+
Access pattern annotation
232+
^^^^^^^^^^^^^^^^^^^^^^^^^
233+
234+
Storage providers can automatically optimize the provision of files based on how the files will be accessed by the respective job.
235+
For example, if a file is only read sequentially, the storage provider can avoid downloading it and instead mount or symlink it (depending on the protocol) for on-demand access.
236+
This can be beneficial, in particular if the sequential access involves only a small part of an otherwise large file.
237+
The three access patterns that can be annotated are:
238+
239+
* ``access.sequential``: The file is read sequentially either from start to end or in (potentially disjoint) chunks, but always in order from the start to the end.
240+
* ``access.random``: The file is read in a non-sequential order.
241+
* ``access.multi``: The file is read sequentially, but potentially multiple times in parallel.
242+
243+
Snakemake considers an input file eligible for on-demand provisioning if it is accessed sequentially by only one job at a time.
244+
In all other cases (multi-access, random access, or sequential access by multiple jobs in parallel), the storage provider will download the file to the local filesystem before it is accessed by jobs.
245+
In case no access pattern is annotated (the default), Snakemake will also download the file.
246+
247+
The access patterns can be annotated via flags.
248+
Typically, one would define sequential access as the default pattern, since it tends to be the most common pattern in a workflow.
249+
This can be done via the ``inputflags`` directive before defining any rule.
250+
For specific files, the access pattern can be annotated by the respective flags ``access.sequential``, ``access.random``, or ``access.multi``.
251+
252+
.. code-block:: python
253+
254+
inputflags:
255+
access.sequential
256+
257+
258+
rule a:
259+
input:
260+
access.random("test1.in") # expected as local copy (because accessed randomly)
261+
output:
262+
"test1.out"
263+
shell:
264+
"cmd_a {input} {output}"
265+
266+
267+
rule b:
268+
input:
269+
access.multi("test1.out") # expected as local copy (because accessed multiple times)
270+
output:
271+
"test2.{dataset}.out"
272+
shell:
273+
"cmd_b {input} {output}"
274+
275+
276+
rule c:
277+
input:
278+
"test2.{dataset}.out" # expected as on-demand provisioning (because accessed sequentially, the default defined above)
279+
output:
280+
"test3.{dataset}.out"
281+
shell:
282+
"cmd_c {input} {output}"
283+
284+
Note that there is no guarantee that the storage provider makes use of this information, since the possibilities can vary between storage protocols and the development stage of the storage plugin.
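
The provisioning rule described in this section can be summarized in a few lines of plain Python. This is only an illustrative sketch of the documented decision logic, not the implementation used by Snakemake or any storage plugin; the names ``AccessPattern``, ``provisioning``, and ``concurrent_jobs`` are hypothetical.

```python
from enum import Enum
from typing import Optional


class AccessPattern(Enum):
    # Hypothetical stand-ins for the access.sequential/random/multi flags.
    SEQUENTIAL = "sequential"
    RANDOM = "random"
    MULTI = "multi"


def provisioning(pattern: Optional[AccessPattern], concurrent_jobs: int = 1) -> str:
    """Return "on-demand" or "download" for an input file.

    Assumption: concurrent_jobs is the number of jobs that read the file
    at the same time.
    """
    if pattern is AccessPattern.SEQUENTIAL and concurrent_jobs <= 1:
        return "on-demand"
    # No annotation, random access, multi access, or sequential access by
    # several jobs in parallel all fall back to a local copy.
    return "download"


if __name__ == "__main__":
    for pattern in (None, *AccessPattern):
        print(pattern, provisioning(pattern))
```

As the note above says, a real storage plugin may still choose to download a file that this sketch would classify as eligible for on-demand access.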

docs/tutorial/interaction_visualization_reporting/tutorial.rst

Lines changed: 2 additions & 2 deletions
@@ -277,7 +277,7 @@ In this case, this config file shall be stored in ``resources/datavzrd/cars.yaml
277277
name:
278278
link-to-url:
279279
Wikipedia:
280-
url: "https://en.wikipedia.org/wiki/{value}"
280+
url: "https://en.wikipedia.org/wiki/{value}"
281281
miles per gallon:
282282
plot:
283283
ticks:
@@ -451,4 +451,4 @@ and open the file ``report/report.html`` file in your browser.
451451
You will see that the report brings all desired outputs together in a structured way, including the captions and the global description.
452452
It allows not only viewing the results, but also exploring the code of each rule and all involved parameters and software tools.
453453
This way, it generates transparency without requiring people to manually inspect the workflow codebase.
454-
In many ways, it can be seen as a self-contained next generation supplementary file of a scientific manuscript.
454+
In many ways, it can be seen as a self-contained next generation supplementary file of a scientific manuscript.
