Commit 024dc32

johannaschmitz, JZentgraf, and johannaelenaschmitz authored
feat: Dynamic module name (snakemake#3401)
Allow for dynamic module names

resolves snakemake#1923

### QC

* [x] The PR contains a test case for the changes or the changes are already covered by an existing test case.
* [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

## Summary by CodeRabbit

- **New Features**
  - Enhanced modular workflows enable dynamic and semi-dynamic module configuration with flexible aliasing and output management.
- **Documentation**
  - Updated guidelines provide clear instructions on configuring dynamic module imports and rule aliasing, including a new section on "Dynamic Modules."
- **Tests**
  - Expanded test coverage validates diverse module configurations and expected outcomes, ensuring robust and reliable workflows.

---------

Co-authored-by: Jens Zentgraf <[email protected]>
Co-authored-by: Johanna <[email protected]>
1 parent 47504a0 commit 024dc32

46 files changed: +911 -9 lines changed

Some content is hidden: large commits have some content hidden by default.

docs/snakefiles/modularization.rst

Lines changed: 41 additions & 0 deletions
@@ -201,6 +201,47 @@ Otherwise, you will have two versions of the same rule, which might be unintende

 Of course, it is possible to combine the use of rules from multiple modules (see :ref:`use_with_modules`), and via modifying statements they can be rewired and reconfigured in an arbitrary way.

+.. _snakefiles-dynamic-modules:
+
+---------------
+Dynamic Modules
+---------------
+
+With Snakemake 9.0 and later, it is possible to load modules dynamically by providing the ``name`` keyword inside the module definition.
+For example, by reading the module name from a config file or by iterating over several modules in a loop.
+For this, the module name is not specified directly after the ``module`` keyword, but by specifying the ``name`` parameter.
+
+
+.. code-block:: python
+
+    for module_name in ['module1', 'module2']:
+        module:
+            name: module_name
+            snakefile: f"{module_name}/Snakefile"
+            config: config[module_name]
+
+        use rule * from module_name as module_name*
+
+.. note::
+    It is not allowed to specify the module name both after the ``module`` keyword and inside the module definition after the ``name`` parameter.
+
+In the ``use rule`` statement, it is first checked if the module name (here, ``'module_name'``) corresponds to a loaded module. If yes, the rules are imported from the loaded module and an arbitrary alias can be provided after the ``as`` keyword.
+
+If ``module_name`` was not registered as a module (as in the example above), the module name is resolved dynamically by searching the name in the current python variable scope. In the example, it resolves to ``'module1'`` and ``'module2'``.
+Note that this means that if ``use rule`` is used with the optional ``as`` keyword inside the loop, the alias after ``as`` must be specified using a variable to ensure a one-to-one mapping between module names and their aliases. This can either be the same name variable (as in the above example) or a second variable (as in the example below).
+
+In particular, it is not possible to modify the alias name in the ``use rule`` statement (e.g., writing directly ``use rule * from module as module_*`` is not allowed for dynamic modules).
+
+.. code-block:: python
+
+    for module_name, alias in zip(['module1', 'module2'], ['module1_', 'module2_']):
+        module:
+            name: module_name
+            snakefile: f"{module_name}/Snakefile"
+            config: config[module_name]
+
+        use rule * from module_name as alias*
+
 .. _snakefiles-meta-wrappers:

 ~~~~~~~~~~~~~
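
Reviewer note: the lookup order described in the added documentation can be sketched in plain Python. This is a simplified model under stated assumptions, not Snakemake's implementation: `resolve_module`, `registered`, and `scope` are hypothetical names, with `registered` standing in for the loaded modules and `scope` for the Python variables visible in the Snakefile.

```python
def resolve_module(from_module, alias, registered, scope):
    """Return (module_name, alias) following the documented lookup order.

    Hypothetical helper: `registered` models the loaded modules and
    `scope` models the variables visible in the Snakefile.
    """
    if from_module in registered:
        # Static case: the name matches a loaded module, alias is used as written.
        return from_module, alias
    if from_module not in scope:
        raise KeyError(f"module {from_module!r} has not been registered")
    # Dynamic case: the name is a variable holding the real module name.
    module_name = scope[from_module]
    if module_name not in registered:
        raise KeyError(f"{from_module!r} resolves to unregistered {module_name!r}")
    if alias is not None:
        # The alias must be a variable as well; a trailing '*' is kept as suffix.
        if alias.endswith("*"):
            alias = f"{scope[alias[:-1]]}*"
        else:
            alias = scope[alias]
    return module_name, alias


registered = {"module1", "module2"}
scope = {"module_name": "module1", "alias": "module1_"}
print(resolve_module("module_name", "alias*", registered, scope))
print(resolve_module("module1", "m1_*", registered, scope))
```

In the dynamic case a literal alias such as `module_*` would fail the `scope` lookup, which matches the documented restriction that the alias must be given as a variable.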

src/snakemake/parser.py

Lines changed: 38 additions & 3 deletions
@@ -867,6 +867,16 @@ class OnStart(DecoratorKeywordState):
 class ModuleKeywordState(SectionKeywordState):
     prefix = "Module"

+    def start(self):
+        yield f"{self.keyword}="
+
+    def end(self):
+        yield ","
+
+
+class ModuleName(ModuleKeywordState):
+    pass
+

 class ModuleSnakefile(ModuleKeywordState):
     pass
@@ -900,6 +910,7 @@ def keyword(self):

 class Module(GlobalKeywordState):
     subautomata = dict(
+        name=ModuleName,
         snakefile=ModuleSnakefile,
         meta_wrapper=ModuleMetaWrapper,
         config=ModuleConfig,
@@ -913,10 +924,21 @@ def __init__(self, snakefile, base_indent=0, dedent=0, root=True):
         self.state = self.name
         self.has_snakefile = False
         self.has_meta_wrapper = False
+        self.modulename = None
         self.has_name = False
         self.primary_token = None

     def end(self):
+        if self.modulename is not None:
+            yield f"name={self.modulename!r}\n"
+        elif not self.has_name:
+            self.error(
+                "Missing module name. "
+                "A module name must be provided either after the module keyword or "
+                "inside the module definition after the name keyword.",
+                self.primary_token,
+            )
+
         if not (self.has_snakefile or self.has_meta_wrapper):
             self.error(
                 "A module needs either a path to a Snakefile or a meta wrapper URL.",
@@ -926,14 +948,17 @@ def end(self):

     def name(self, token):
         if is_name(token):
-            yield f"workflow.module({token.string!r}", token
+            self.modulename = token.string
             self.has_name = True
-        elif is_colon(token) and self.has_name:
+        elif is_colon(token):
             self.primary_token = token
             self.state = self.block
+            yield "workflow.module(", token
         else:
             self.error(
-                "Expected name after module keyword.", token, naming_hint="module"
+                "Expected name or colon after module keyword.",
+                token,
+                naming_hint="module",
             )

     def block_content(self, token):
@@ -943,6 +968,16 @@ def block_content(self, token):
                     self.has_snakefile = True
                 if token.string == "meta_wrapper":
                     self.has_meta_wrapper = True
+                if token.string == "name":
+                    if self.has_name:
+                        raise self.error(
+                            "Ambiguous module name. "
+                            "A module name was provided directly after the module keyword. "
+                            "Another module name was provided by the name keyword.",
+                            token,
+                            naming_hint="module",
+                        )
+                    self.has_name = True
                 for t in self.subautomaton(token.string, token=token).consume():
                     yield t
             except KeyError:
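
Reviewer note: the effect of the parser change on the generated Python source can be approximated with a toy model. The sketch below is not the parser itself. `emit_module_call` is a hypothetical function, the closing parenthesis is an assumption about code outside the shown hunks, and expressions are passed in as ready-made source strings instead of token streams.

```python
def emit_module_call(static_name, block):
    """Toy model of the Python source generated for one module statement.

    `static_name` is the name written after the `module` keyword (or None);
    `block` maps keywords inside the definition to source expressions.
    """
    if static_name is not None and "name" in block:
        raise SyntaxError("Ambiguous module name.")
    if static_name is None and "name" not in block:
        raise SyntaxError("Missing module name.")
    parts = ["workflow.module("]
    for keyword, expr in block.items():
        # Mirrors ModuleKeywordState: `keyword=` at start, `,` at end.
        parts.append(f"{keyword}={expr},")
    if static_name is not None:
        # A static name is emitted last, as a quoted string literal.
        parts.append(f"name={static_name!r}")
    parts.append(")")  # assumption: the call is closed after the block
    return "".join(parts)


# Dynamic name: the expression `module_name` is passed through unquoted.
print(emit_module_call(None, {"name": "module_name", "snakefile": '"m/Snakefile"'}))
# Static name: the identifier after `module` becomes a string literal.
print(emit_module_call("other", {"snakefile": '"other/Snakefile"'}))
```

The point of the model is the difference between the two cases: a static name ends up as a string literal, while a `name:` entry is an arbitrary Python expression evaluated when the generated call runs.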

src/snakemake/workflow.py

Lines changed: 42 additions & 6 deletions
@@ -2252,14 +2252,15 @@ def run(self, func):

     def module(
         self,
-        name,
+        name=None,
         snakefile=None,
         meta_wrapper=None,
         config=None,
         skip_validation=False,
         replace_prefix=None,
         prefix=None,
     ):
+
         self.modules[name] = ModuleInfo(
             self,
             name,
@@ -2282,16 +2283,51 @@ def userule(
         def decorate(maybe_ruleinfo):
             if from_module is not None:
                 try:
+                    modifier = name_modifier
                     module = self.modules[from_module]
                 except KeyError:
-                    raise WorkflowError(
-                        "Module {} has not been registered with 'module' statement before using it in 'use rule' statement.".format(
-                            from_module
+                    # Dynamic module name resolution:
+                    # If the static module name is not found in the registered modules,
+                    # we check if it's a variable in the current scope.
+                    from inspect import currentframe
+
+                    if from_module in currentframe().f_back.f_globals:
+                        module_name = currentframe().f_back.f_globals[from_module]
+                        if module_name not in self.modules:
+                            raise WorkflowError(
+                                "Dynamic module name '{}' resolves to '{}', but has not been registered with a 'module' statement.".format(
+                                    from_module, module_name
+                                )
+                            )
+                        module = self.modules[module_name]
+
+                        # For dynamic module names, the name modifier must also be adjusted dynamically
+                        # to avoid ambiguous module names. If name_modifier ends with '*',
+                        # use the variable value plus '*', otherwise use the variable value directly.
+                        if name_modifier is not None:
+                            try:
+                                if name_modifier.endswith("*"):
+                                    modifier = f"{currentframe().f_back.f_globals[name_modifier[:-1]]}*"
+                                else:
+                                    modifier = currentframe().f_back.f_globals[
+                                        name_modifier
+                                    ]
+                            except KeyError:
+                                raise WorkflowError(
+                                    "Module alias {} not in current frame to resolve dynamic module {} in 'use rule'.".format(
+                                        name_modifier, module_name
+                                    )
+                                )
+
+                    else:
+                        raise WorkflowError(
+                            "Module {} has not been registered with 'module' statement before using it in 'use rule' statement.".format(
+                                from_module
+                            )
                         )
-                    )
                 module.use_rules(
                     rules,
-                    name_modifier,
+                    modifier,
                     exclude_rules=exclude_rules,
                     ruleinfo=None if callable(maybe_ruleinfo) else maybe_ruleinfo,
                     skip_global_report_caption=self.report_text
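
Reviewer note: the fallback above reads variables through `inspect.currentframe().f_back.f_globals`, that is, the global namespace of the calling frame. The standalone sketch below uses hypothetical function names and is not Snakemake code. It illustrates what such a lookup can see: module-level variables of the caller are found, while variables that exist only as locals of the calling function are not.

```python
from inspect import currentframe

module_name = "module1"  # module-level variable, analogous to a top-level loop variable


def lookup_in_caller_globals(name):
    """Return the global variable `name` of the calling frame, or None if absent."""
    caller = currentframe().f_back
    return caller.f_globals.get(name)


def call_with_global():
    # `module_name` lives in this module's globals, so the lookup succeeds.
    return lookup_in_caller_globals("module_name")


def call_with_local_only():
    local_alias = "module1_"  # local variable: not part of f_globals
    return lookup_in_caller_globals("local_alias")


print(call_with_global())      # prints: module1
print(call_with_local_only())  # prints: None
```

If the same holds for Snakefiles, a `use rule` statement placed inside a function body would not see loop variables local to that function; the shown tests only cover loops at the top level of the Snakefile.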
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+shell.executable("bash")
+
+
+configfile: "config/config.yaml"
+
+for module_name, use_as in zip(["module1", "module2"], ['use_as_1', 'use_as_2']):
+    module:
+        name: module_name
+        snakefile:
+            f"{module_name}/Snakefile"
+        config:
+            config
+        replace_prefix:
+            {"results/": "results/testmodule/"}
+
+    use rule * from module_name as use_as*
+
+rule all:
+    input:
+        multiext(expand("results/testmodule/c1/{name}.", name="test")[0], "tsv", "txt"),
+        multiext(expand("results/testmodule/c2/{name}.", name="test")[0], "tsv", "txt"),
+
+
+assert module1.some_func() == 15
+assert module2.some_func() == 25
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+test: 1
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+1
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+1
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+configfile: "config.yaml" # does not exist, but this statement should be ignored on module import
+
+
+def some_func():
+    return 15
+
+
+rule a:
+    output:
+        temp("results/a1/{name}.out"),
+    shell:
+        "echo {config[test]} > {output}"
+
+
+rule b:
+    input:
+        expand(rules.a.output, name="test"),
+    output:
+        "results/b1/{name}.out",
+    shell:
+        "cat {input} > {output}"
+
+
+rule c_tsv:
+    input:
+        expand(rules.b.output, name="test"),
+    output:
+        "results/c1/{name}.tsv",
+    shell:
+        "cat {input} > {output}"
+
+
+use rule c_tsv as c_txt with:
+    output:
+        "results/c1/{name}.txt",
+
+
+rule all:
+    input:
+        expand(rules.c_tsv.output, name="test"),
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+configfile: "config.yaml" # does not exist, but this statement should be ignored on module import
+
+
+def some_func():
+    return 25
+
+
+rule a:
+    output:
+        temp("results/a2/{name}.out"),
+    shell:
+        "echo {config[test]} > {output}"
+
+
+rule b:
+    input:
+        expand(rules.a.output, name="test"),
+    output:
+        "results/b2/{name}.out",
+    shell:
+        "cat {input} > {output}"
+
+
+rule c_tsv:
+    input:
+        expand(rules.b.output, name="test"),
+    output:
+        "results/c2/{name}.tsv",
+    shell:
+        "cat {input} > {output}"
+
+
+use rule c_tsv as c_txt with:
+    output:
+        "results/c2/{name}.txt",
+
+
+rule all:
+    input:
+        expand(rules.c_tsv.output, name="test"),
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+shell.executable("bash")
+
+configfile: "config/config.yaml"
+
+for module_name in ["module1", "module2", "module3"]:
+    module:
+        name: module_name
+        snakefile:
+            f"{module_name}/Snakefile"
+        config:
+            config
+        replace_prefix:
+            {"results/": "results/testmodule/"}
+
+
+use rule a from module1 as rule_a with:
+    output:
+        "results/testmodule/a2/{name}.out"
+
+use rule b,c from module2
+
+use rule * from module3 exclude a,b,c
+
+
+rule all:
+    input:
+        expand("results/testmodule/b2/{name}.txt", name="test"),
+        expand("results/testmodule/b2/{name}.tsv", name="test"),
