|
4 | 4 | Best practices |
5 | 5 | ============== |
6 | 6 |
|
7 | | -* Snakemake (>=5.11) comes with a code quality checker (a so called linter), that analyzes your workflow and highlights issues that should be solved in order to follow best practices, achieve maximum readability, and reproducibility. |
8 | | - The linter can be invoked with |
| 7 | +Care about code quality |
| 8 | +----------------------- |
9 | 9 |
|
10 | | - .. code-block:: bash |
| 10 | +Snakemake (>=5.11) comes with a code quality checker (a so-called linter) that analyzes your workflow and highlights issues that should be resolved in order to follow best practices and achieve maximum readability and reproducibility. |
| 11 | +The linter can be invoked with |
| 12 | + |
| 13 | +.. code-block:: bash |
11 | 14 |
|
12 | 15 | snakemake --lint |
13 | 16 |
|
14 | | - given that a ``Snakefile`` or ``workflow/Snakefile`` is accessible from your working directory. |
15 | | - It is **highly recommended** to run the linter before publishing any workflow, asking questions on Stack Overflow or filing issues on Github. |
16 | | -* There is an automatic formatter for Snakemake workflows, called `Snakefmt <https://github.com/snakemake/snakefmt>`_, which should be applied to any Snakemake workflow before publishing it. |
17 | | -* When publishing your workflow in a `Github <https://github.com>`_ repository, it is a good idea to add some minimal test data and configure `Github Actions <https://github.com/features/actions>`_ for continuously testing the workflow on each new commit. |
18 | | - For this purpose, we provide predefined Github actions for both running tests and linting `here <https://github.com/snakemake/snakemake-github-action>`__, as well as formatting `here <https://github.com/snakemake/snakefmt#github-actions>`__. |
19 | | -* For publishing and distributing a Snakemake workflow, it is a good idea to stick to a :ref:`standardized structure <distribution_and_reproducibility>` that is expected by frequent users of Snakemake. |
20 | | - The `Snakemake workflow catalog <https://snakemake.github.io/snakemake-workflow-catalog>`_ automatically lists Snakemake workflows hosted on `Github <https://github.com>`_ if they follow certain `rules <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_. |
21 | | - By complying to these `rules <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_ you can make your workflow more discoverable and even automate its usage documentation (see `"Standardized usage" <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_). |
22 | | -* Configuration of a workflow should be handled via :ref:`config files <snakefiles_standard_configuration>` and, if needed, tabular configuration like sample sheets (either via :ref:`Pandas <snakefiles_tabular_configuration>` or :ref:`PEPs <snakefiles-peps>`). |
23 | | - Use such configuration for metadata and experiment information, **not for runtime specific configuration** like threads, resources and output folders. |
24 | | - For those, just rely on Snakemake's CLI arguments like ``--set-threads``, ``--set-resources``, ``--set-default-resources``, and ``--directory``. |
25 | | - This makes workflows more readable, scalable, and portable. |
26 | | -* Try to keep filenames short (thus easier on the eye), but informative. Avoid mixing of too many special characters (e.g. decide whether to use ``_`` or ``-`` as a separator and do that consistently throughout the workflow). |
27 | | -* Try to keep Python code like helper functions separate from rules (e.g. in a ``workflow/rules/common.smk`` file). This way, you help non-experts to read the workflow without needing to parse internals that are irrelevant for them. The helper function names should be chosen in a way that makes them sufficiently informative without looking at their content. Also avoid ``lambda`` expressions inside of rules. |
28 | | -* Make use of `Snakemake wrappers <https://snakemake-wrappers.readthedocs.io>`_ whenever possible. Consider contributing to the wrapper repo whenever you have a rule that reoccurs in at least two of your workflows. |
| 17 | +given that a ``Snakefile`` or ``workflow/Snakefile`` is accessible from your working directory. |
| 18 | +It is **highly recommended** to run the linter before publishing any workflow, asking questions on Stack Overflow, or filing issues on GitHub. |
| 19 | + |
| 20 | +Care about code readability |
| 21 | +--------------------------- |
| 22 | + |
| 23 | +1. There is an automatic formatter for Snakemake workflows, called `Snakefmt <https://github.com/snakemake/snakefmt>`_, which should be applied to any Snakemake workflow before publishing it. |
| 24 | +2. Try to keep filenames short (thus easier on the eye), but informative. Avoid mixing too many special characters (e.g. decide whether to use ``_`` or ``-`` as a separator and apply that choice consistently throughout the workflow). |
| 25 | +3. Try to keep Python code like helper functions separate from rules (e.g. in a ``workflow/rules/common.smk`` file). This way, you help non-experts to read the workflow without needing to parse internals that are irrelevant for them. The helper function names should be chosen in a way that makes them sufficiently informative without looking at their content. Also avoid ``lambda`` expressions inside of rules. |
| 26 | +4. Use Snakemake's :ref:`semantic helper functions <snakefiles-semantic-helpers>` in order to increase readability and to avoid the reimplementation of common functionality for aggregation, parameter lookup or path modifications. |
| 27 | + |
| 28 | +Ensure portability |
| 29 | +------------------ |
| 30 | + |
| 31 | +Annotate all your rules with versioned :ref:`Conda <integrated_package_management>` or :ref:`container <apptainer>` based software environment definitions. This ensures that your workflow uses exactly the same isolated software stacks, independent of the underlying system. |
| 32 | + |
| 33 | +Generate interactive reports (for free) |
| 34 | +--------------------------------------- |
| 35 | + |
| 36 | +Annotate your final results for inclusion in Snakemake's automatic :ref:`interactive reports <snakefiles-reports>` (and make sure to use all the features, including categories and labels). |
| 37 | +This makes them explorable in a high-level way, while connecting them to the workflow code, parameters, and software stack. |
| 38 | + |
| 39 | +Enable configurability |
| 40 | +---------------------- |
| 41 | + |
| 42 | +Configuration of a workflow should be handled via :ref:`config files <snakefiles_standard_configuration>` and, if needed, tabular configuration like sample sheets (either via :ref:`Pandas <snakefiles_tabular_configuration>` or :ref:`PEPs <snakefiles-peps>`). |
| 43 | +Use such configuration for metadata and experiment information, **not for runtime-specific configuration** like threads, resources, and output folders. |
| 44 | +For those, just rely on Snakemake's CLI arguments like ``--set-threads``, ``--set-resources``, ``--set-default-resources``, and ``--directory``. |
| 45 | +This makes workflows more readable, scalable, and portable. |
| 46 | + |
| 47 | +Avoid duplication of efforts |
| 48 | +---------------------------- |
| 49 | + |
| 50 | +Make use of `Snakemake wrappers <https://snakemake-wrappers.readthedocs.io>`_ whenever possible. Consider contributing to the wrapper repo whenever you have a rule that reoccurs in at least two of your workflows. |
| 51 | + |
| 52 | +Test your workflow continuously |
| 53 | +------------------------------- |
| 54 | + |
| 55 | +When hosting your workflow in a `GitHub <https://github.com>`_ repository, it is a good idea to add some minimal test data and configure `GitHub Actions <https://github.com/features/actions>`_ for continuously testing the workflow on each new commit. For this purpose, we provide predefined GitHub Actions for both running tests and linting `here <https://github.com/snakemake/snakemake-github-action>`__, as well as formatting `here <https://github.com/snakemake/snakefmt#github-actions>`__. |
| 56 | + |
| 57 | +Follow the standards |
| 58 | +-------------------- |
| 59 | + |
| 60 | +1. For publishing and distributing a Snakemake workflow, it is a good idea to stick to a :ref:`standardized folder structure <distribution_and_reproducibility>` that is expected by frequent users of Snakemake. This simplifies the navigation through the codebase and keeps the workflow repository and the working directory clean. |
| 61 | +2. The `Snakemake workflow catalog <https://snakemake.github.io/snakemake-workflow-catalog>`_ automatically lists Snakemake workflows hosted on `GitHub <https://github.com>`_ if they follow certain `rules <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_. |
| 62 | + By complying with these `rules <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_ you can make your workflow more discoverable and even automate its usage documentation (see `"Standardized usage" <https://snakemake.github.io/snakemake-workflow-catalog/?rules=true>`_). |
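
The readability advice in the diff above (keep Python helper functions in a file such as ``workflow/rules/common.smk`` and avoid ``lambda`` expressions inside rules) can be illustrated with a minimal sketch. The sample sheet contents, its column names, and the function names ``load_samples`` and ``get_fastq`` are hypothetical and not part of the documented change. The standard-library ``csv`` module stands in for Pandas only to keep the example free of dependencies.

```python
import csv
import io
from types import SimpleNamespace

# Hypothetical sample sheet; in a real workflow this would be a TSV file
# referenced from the config file rather than an inline string.
SAMPLE_SHEET = "sample\tfastq\nA\tdata/A.fastq.gz\nB\tdata/B.fastq.gz\n"


def load_samples(handle):
    """Return a dict mapping each sample name to its row of the sheet."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["sample"]: row for row in reader}


samples = load_samples(io.StringIO(SAMPLE_SHEET))


def get_fastq(wildcards):
    """Input function: look up the FASTQ path for the sample wildcard."""
    return samples[wildcards.sample]["fastq"]


# Snakemake passes an object with one attribute per wildcard to input
# functions; SimpleNamespace imitates that here so the helper can be
# tried outside of a workflow.
print(get_fastq(SimpleNamespace(sample="A")))
```

Inside a rule, such a helper would be referenced by name (``input: get_fastq``), so a reader sees what the input is without having to parse how it is looked up.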