[Also from my colleague] Consider a workflow that looks like this (the code will not run; it just shows the idea):
[1]
input: for_each = ['files_sra'], group_by = 1, concurrent = True
output: files_sra
download:
    {url}/{_files_sra}
[2]
output: files_bam
run:
    sratools-dump ${_input} ${_output}
The problem is that the data are huge and barely fit on disk. So ideally we need to zap the output of step 1 immediately after step 2 is done. Is there a built-in mechanism for this?