Bug: edge case when there is a huge shell command with shadow: 'minimal' #3891

@Hocnonsense

Description

rule in snakefile:

rule collect_syldb:
    input:
        expand(rules.sylph_fq.output.sylphmpa, sample=samples),
    output:
        tsv="results/sylph-relative_abundance.tsv.gz",
    conda:
        "sylph"
    threads: 1
    shadow:
        "minimal"
    shell:
        """
        sylph-tax merge \
            {input} \
            --column relative_abundance \
            -o tmp.tsv
        if [ "{output.tsv}" == *.gz ]
        then
            gzip tmp.tsv
        fi
        mv tmp.tsv* {output.tsv}
        """

snakemake log:

host: node012.pi.sjtu.edu.cn
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Select jobs to execute...
Execute 1 jobs...

[Fri Dec 19 17:35:48 2025]
localrule collect_syldb_omd:
    input: results/sylph/samples/TARA_SAMEA2619625_METAG-sylph.taxprof.tsv, results/sylph/samples/TARA_SAMEA2619872_METAG-sylph.taxprof.tsv, results/>
    output: results/sylph-relative_abundance.tsv.gz
    jobid: 0
    reason: Forced execution
    resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=/tmp, partition='64c512g', nodes=1
Shell command: echo 1
        sylph-tax merge             results/sylph/samples/TARA_SAMEA2619625_METAG-sylph.taxprof.tsv, results/sylph/samples/TARA_SAMEA2619872_METAG-sy>
        if [ "results/sylph-relative_abundance.tsv.gz" == *.gz ]
        then
            gzip tmp.tsv
        fi
        mv tmp.tsv* results/sylph-relative_abundance.tsv.gz
        
Changing to shadow directory: .snakemake/shadow/tmpt9uw99ay
RuleException:
FileNotFoundError in file "workflow/Snakefile", line 493:
[Errno 2] No such file or directory: '.snakemake/shell_tmp.3w03dm25'
  File "conda/snakemake/lib/python3.12/tempfile.py", line 3>
[Fri Dec 19 17:36:16 2025]
Error in rule collect_syldb_omd:
    message: None
    jobid: 0
    input: results/sylph/samples/TARA_SAMEA2619625_METAG-sylph.taxprof.tsv, results/sylph/samples/TARA_SAMEA2619872_METAG-sylph.taxprof.tsv, results/>
    output: 02_magdb/omeerfsd/subdb/ref_bins-sylph/omd-sylph-relative_abundance.tsv.gz
    conda-env: sylph
    shell:
        echo 1
        sylph-tax merge             results/sylph/samples/TARA_SAMEA2619625_METAG-sylph.taxprof.tsv, results/sylph/samples/TARA_SAMEA2619872_METAG-sy>
        if [ "results/sylph-relative_abundance.tsv.gz" == *.gz ]
        then
            gzip tmp.tsv
        fi
        mv tmp.tsv* results/sylph-relative_abundance.tsv.gz
        
        (command exited with non-zero exit code)
Shutting down, this might take some time.
logs/collect_syldb/collect_syldb--51137316.err lines 1-46/67 67%

details:
I think it is potentially related to these lines:

if len(cmd.replace("'", r"'\''")) + 2 > MAX_ARG_LEN:
    tmpdir = tempfile.mkdtemp(dir=".snakemake", prefix="shell_tmp.")
    script = os.path.join(os.path.abspath(tmpdir), "script.sh")
    with open(script, "w") as script_fd:
        print(cmd, file=script_fd)
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR | stat.S_IRUSR)
    cmd = '"{}" "{}"'.format(cls.get_executable() or "/bin/sh", script)

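With a large sample list, the shell command can exceed MAX_ARG_LEN, in which case it is written to a script file inside the .snakemake folder.
However, the path ".snakemake" is relative, so after changing to the "minimal" shadow directory it no longer resolves there, and the shell script cannot be created or accessed (the traceback above points into tempfile.py).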
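
To illustrate the suspected mechanism outside of Snakemake, here is a minimal sketch. It assumes that the shadow directory contains no `.snakemake` folder of its own; the function name `demo_relative_mkdtemp` and the directory name `tmp_example` are made up for this example. It is not a proposed patch, it only shows that `tempfile.mkdtemp` with a relative `dir` fails after the working directory changes, while an absolute path resolved beforehand still works.

```python
import os
import tempfile


def demo_relative_mkdtemp():
    """Return (relative_failed, absolute_worked) for a mock shadow layout."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        # Absolute path to the state folder, resolved before any chdir.
        state_dir = os.path.join(root, ".snakemake")
        # Hypothetical shadow directory name, only for this sketch.
        shadow_dir = os.path.join(state_dir, "shadow", "tmp_example")
        os.makedirs(shadow_dir)
        try:
            # Mimics "Changing to shadow directory" from the log.
            os.chdir(shadow_dir)
            try:
                # Same call as in the quoted snippet: relative dir.
                tempfile.mkdtemp(dir=".snakemake", prefix="shell_tmp.")
                relative_failed = False
            except FileNotFoundError:
                relative_failed = True
            # An absolute path is unaffected by the chdir.
            tmpdir = tempfile.mkdtemp(dir=state_dir, prefix="shell_tmp.")
            absolute_worked = os.path.isdir(tmpdir)
        finally:
            # Leave the temporary tree before it is cleaned up.
            os.chdir(original_cwd)
    return relative_failed, absolute_worked


if __name__ == "__main__":
    print(demo_relative_mkdtemp())
```

If this matches what happens in the real code path, resolving the `.snakemake` path to an absolute one before entering the shadow directory might be one way to avoid the error, but I have not checked how that interacts with the rest of the shadow handling.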
