Summary of new features for named input/output.

### Persistent grouping of `sos_targets`

```
input: 'a.txt', 'b.txt', group_by=1
```
is treated as equivalent to
```
input: sos_targets('a.txt', 'b.txt', group_by=1)
```
which creates a `sos_targets` object with two targets and two groups. The groups are accessible through the `groups` property, which is a list of `sos_targets` objects that have no subgroups of their own.

`sos_targets` keeps its grouping information when it is passed around. That is to say,

* `step_input` will have groups that are essentially `_input` for substeps.
* `step_output` will contain `_output` from each substep as its groups.
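
The grouping behaviour described above can be mimicked with plain Python lists. This is only a toy model, not SoS code: lists stand in for `sos_targets`, and the helper `group_by_size` and the `.out` file names are hypothetical names invented for illustration.

```python
# Toy model only: plain lists stand in for sos_targets.
def group_by_size(targets, size):
    """Split targets into consecutive groups of `size` items (hypothetical helper)."""
    return [targets[i:i + size] for i in range(0, len(targets), size)]

step_input = ['a.txt', 'b.txt']
input_groups = group_by_size(step_input, 1)

# Each group plays the role of `_input` in one substep, and the
# `_output` of each substep becomes one group of `step_output`.
output_groups = []
for _input in input_groups:
    _output = [name.replace('.txt', '.out') for name in _input]  # made-up outputs
    output_groups.append(_output)

print(input_groups)
print(output_groups)
```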

### Keyword arguments in input and output

Keyword arguments are used to specify the `sources` of targets.
```
input: name=targets
output: name=targets
```
Named input and output can be accessed by `_input['name']` and `_output['name']`.

Implementation-wise, 

```
input: name=targets
```
creates `step_input` as `sos_targets(name=targets)`, which assigns `sources` of `targets` to `name`.

### `output_from(steps, **kwargs)` to get output from other steps

`output_from` refers to the output of one or more steps. Each step can be given as a name or a number; a number refers to a step in the same workflow, so `output_from(10)` called from `step_20` is equivalent to `output_from('step_10')`.

```
input: output_from('step')
input: output_from(1)
input: output_from([1, 2])
```
With named input and output, the syntax can be extended to
```
input: ref=output_from('get_ref')['ref']
```
A special step name `-1` as in
```
input: output_from(-1)
```
is reserved for the output of the previous step, and is only valid in a numerically indexed step.
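
The way a numeric reference maps to a step name can be sketched in plain Python. This is a hypothetical helper, not part of SoS: the name `resolve_step`, its `indices` parameter, and the rule that `-1` picks the largest smaller index in the same workflow are all assumptions made for illustration. The sketch only models the case where the current step is named `<workflow>_<index>`.

```python
def resolve_step(ref, current_step, indices=()):
    """Map a step reference to a full step name (toy model, not SoS code)."""
    if isinstance(ref, str):
        return ref  # a name is used as given
    workflow, sep, index = current_step.rpartition('_')
    if not (sep and index.isdigit()):
        # This sketch does not model numeric references from unindexed steps.
        raise ValueError(f'{current_step!r} is not a numerically indexed step')
    if ref == -1:
        # Assumption: "previous" means the largest known index below the current one.
        earlier = [i for i in indices if i < int(index)]
        if not earlier:
            raise ValueError('no previous step')
        return f'{workflow}_{max(earlier)}'
    return f'{workflow}_{ref}'

print(resolve_step(10, 'step_20'))
print(resolve_step(-1, 'step_20', indices=[10, 20, 30]))
```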

Options `group_by`, `paired_with`, `pattern`, `group_with`, and `for_each` can be used to regroup or attach variables to the output. For example, `group_by` can be used to regroup the retrieved `sos_targets`,
```
input: output_from(10, group_by='all')
```

### `named_output('name', **kwargs)` for data flow without step name

`named_output('ref')` in the following example refers to any step with `ref` in named output,

```
[A]
output: ref=targets

[B]
input: named_output('ref')
```
which has the same effect as `output_from('A')['ref']` but does not require the step name to be specified.

Similar to `output_from`, the parameters `group_by`, `paired_with`, `pattern`, `group_with`, and `for_each` can be used to regroup the retrieved targets or attach variables to them.

### Merging of multiple `sos_targets`

Multiple `sos_targets` can be specified in the input statement, either explicitly with `sos_targets` or implicitly with `output_from` or `named_output`. In this case, the targets and groups of all `sos_targets` are merged. `sos_targets` objects with different numbers of groups can be merged only if one of them has no group information or has a single group containing all of its targets, in which case that single group is replicated across all groups of the other object before merging.

For example,

```
input: 'a.txt', 'b.txt', sos_targets('c.txt', 'd.txt', group_by=1)
```
will create a `sos_targets` with four targets `'a.txt', 'b.txt', 'c.txt', 'd.txt'`, and two groups

```
'a.txt', 'b.txt', 'c.txt'
'a.txt', 'b.txt', 'd.txt'
```
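
The merging rule can be expressed as a short plain-Python sketch. It is a toy model under stated assumptions, not the SoS implementation: each source is represented as a list of groups, an ungrouped source is written as a single group holding all of its targets, and `merge_groups` is a hypothetical name.

```python
def merge_groups(*sources):
    """Merge lists of groups, replicating single-group sources (toy model)."""
    # Sources with more than one group must all agree on the group count.
    sizes = {len(groups) for groups in sources if len(groups) > 1}
    if len(sizes) > 1:
        raise ValueError(f'cannot merge sources with group counts {sorted(sizes)}')
    n = sizes.pop() if sizes else 1
    merged = [[] for _ in range(n)]
    for groups in sources:
        # A single group is repeated so that it joins every merged group.
        expanded = groups * n if len(groups) == 1 else groups
        for out, group in zip(merged, expanded):
            out.extend(group)
    return merged

ungrouped = [['a.txt', 'b.txt']]      # 'a.txt', 'b.txt' with no grouping
grouped = [['c.txt'], ['d.txt']]      # sos_targets('c.txt', 'd.txt', group_by=1)
print(merge_groups(ungrouped, grouped))
```

Under these assumptions the result has the same two groups as listed above, while two sources with, say, two and three groups are rejected.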

The same rule applies to `sos_targets` created by `output_from()` or `output_from(group_by)`. However, if a global `group_by` option is present, all individual `groups` will be overridden. That is to say, 

```
input: 'a.txt', 'b.txt', output_from(10), group_by=1
```
will regroup all targets by `1`, regardless of original grouping information from `output_from(10)`.


### `set` and `get` of attributes to sos targets

Two new functions, `BaseTarget.set()` and `BaseTarget.get()`, are added.

A dictionary is now associated with each `BaseTarget` and can be accessed with the `.set()` and `.get()` functions, or as attributes of the target. `.set()` is usually called automatically by the parameters `paired_with` and `group_with`, but it can also be called directly. With

```
a = file_target('a.txt')
a.set('name', 'a')
```
it is usually easier to use
```
a.name
```
instead of
```
a.get('name')
```
but `a.get('name', default=None)` returns the default value instead of raising an `AttributeError` if `name` does not exist, which can be the safer choice when an attribute may be missing.
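
The relationship between `.set()`, `.get()`, and attribute access can be illustrated with a small stand-in class. This is a sketch of the behaviour described above, not the actual `BaseTarget` code: the class name `ToyTarget` and its internal `_dict` attribute are hypothetical.

```python
class ToyTarget:
    """Stand-in for a target with an attached dictionary (not SoS code)."""

    def __init__(self, name):
        self._name = name
        self._dict = {}

    def set(self, key, value):
        self._dict[key] = value
        return self

    def get(self, key, default=None):
        # Returns `default` instead of raising when the key is missing.
        return self._dict.get(key, default)

    def __getattr__(self, key):
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__['_dict'][key]
        except KeyError:
            raise AttributeError(key) from None

a = ToyTarget('a.txt')
a.set('name', 'a')
print(a.name, a.get('name'), a.get('missing'))
```

In this sketch `a.missing` raises `AttributeError`, whereas `a.get('missing')` quietly returns `None`.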

### Changes to parameters `paired_with`, `group_with` and `for_each`

In addition to variables set to the global namespace, the paired values are written to `_input` as target or group properties. That is to say, with

```
sample = ['A',  'B']
files = ['a1', 'a2', 'a3', 'a4']
input: 'a1.txt', 'a2.txt', 'b1.txt', 'b2.txt', group_by=2, 
    paired_with='files', group_with='sample', for_each=dict(i=range(5))
```
you can access `_sample`, `_files`, and `i` both directly and as

```
_input[0]._files
_input._sample
_input.i
```
so that
```
sample = ['A',  'B']
files = ['a1', 'a2', 'a3', 'a4']
input: 'a1.txt', 'a2.txt', 'b1.txt', 'b2.txt', group_by=2, 
    paired_with='files', group_with='sample', for_each=dict(i=range(5))

print(f'_input={_input}, _files={_files}, _sample={_sample}, i={i}')
print(f'_input[0]._files={_input[0]._files}, _input._sample={_input._sample}, _input.i={_input.i}')
```
would produce:
```
_input=a1.txt a2.txt, _files=['a1', 'a2'], _sample=A, i=0
_input[0]._files=a1, _input._sample=A, _input.i=0
_input=b1.txt b2.txt, _files=['a3', 'a4'], _sample=B, i=0
_input[0]._files=a3, _input._sample=B, _input.i=0
...
```
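
How the values in this output line up with the groups can be reproduced in plain Python. This is a toy model, not SoS code: dictionaries stand in for the per-substep variables, and the loop order (all groups for one value of `i` before moving to the next) is an assumption chosen to match the output listed above.

```python
targets = ['a1.txt', 'a2.txt', 'b1.txt', 'b2.txt']
files = ['a1', 'a2', 'a3', 'a4']
sample = ['A', 'B']

# group_by=2: consecutive pairs of targets
groups = [targets[k:k + 2] for k in range(0, len(targets), 2)]
# paired_with='files': one value per target
paired = dict(zip(targets, files))

substeps = []
for i in range(5):                              # for_each=dict(i=range(5))
    for group, _sample in zip(groups, sample):  # group_with='sample': one value per group
        substeps.append({
            '_input': group,
            '_files': [paired[t] for t in group],
            '_sample': _sample,
            'i': i,
        })

for s in substeps[:2]:
    print(f"_input={' '.join(s['_input'])}, _files={s['_files']}, _sample={s['_sample']}, i={s['i']}")
```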

