[Feature request] Let DistributedSampler take a Sampler as input #23430
Description
🚀 Feature
Motivation
Currently, DistributedSampler assumes it takes a Dataset as argument, but the only information it actually uses from it is its length (len(dataset)).
We sometimes want a custom Sampler to be used in distributed mode, so it might be desirable to also let DistributedSampler accept a Sampler as argument.
Potential implementation
The only difference is that in
pytorch/torch/utils/data/distributed.py
Lines 57 to 61 in 46224ef
# subsample
indices = indices[self.rank:self.total_size:self.num_replicas]
assert len(indices) == self.num_samples
return iter(indices)
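For reference, the slicing step quoted above partitions indices round-robin across ranks. A minimal, self-contained illustration (the variable names mirror DistributedSampler's attributes but this is not the library code):

```python
# Illustrative only: how indices[rank:total_size:num_replicas]
# splits a shared index list across replicas.
total_size = 8
num_replicas = 2
indices = list(range(total_size))

# Each rank takes every num_replicas-th index, starting at its own rank.
per_rank = [indices[rank:total_size:num_replicas] for rank in range(num_replicas)]
print(per_rank)  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```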
We would additionally have something like
if isinstance(self.dataset, Sampler):
orig_indices = list(iter(self.dataset))
indices = [orig_indices[i] for i in indices]
return iter(indices)
Pitch
More modularity and code reuse
sampler = MyNiceSampler(dataset)
if distributed:
sampler = DistributedSampler(sampler)
Additionally, it makes the code clearer, in my view. Instead of
if distributed:
sampler = DistributedSampler(dataset)
else:
sampler = RandomSampler(dataset)
we can always have
sampler = RandomSampler(dataset)
if distributed:
sampler = DistributedSampler(sampler, shuffle=False)
The two versions may look very similar at first sight, but they imply different things.
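The proposal can be sketched as a standalone wrapper. The class name DistributedSamplerWrapper and its exact signature are hypothetical, not part of PyTorch; the sketch only illustrates the behavior described above (padding the wrapped sampler's output and subsampling it per rank):

```python
import math

class DistributedSamplerWrapper:
    """Hypothetical sketch: wrap any Sampler and subsample its indices per rank.

    Not the actual DistributedSampler implementation; it only illustrates
    the proposed behavior for Sampler inputs.
    """

    def __init__(self, sampler, num_replicas, rank):
        self.sampler = sampler
        self.num_replicas = num_replicas
        self.rank = rank
        # Each rank draws the same number of samples, rounding up.
        self.num_samples = math.ceil(len(sampler) / num_replicas)
        self.total_size = self.num_samples * num_replicas

    def __iter__(self):
        # Materialize the wrapped sampler's index order.
        orig_indices = list(iter(self.sampler))
        # Pad so the list divides evenly across replicas
        # (mirrors what DistributedSampler does with dataset indices).
        orig_indices += orig_indices[: self.total_size - len(orig_indices)]
        # Round-robin subsample, preserving the wrapped sampler's ordering.
        return iter(orig_indices[self.rank:self.total_size:self.num_replicas])

    def __len__(self):
        return self.num_samples
```

Any object with __iter__ and __len__ works as the wrapped sampler here, e.g. a plain list of indices standing in for MyNiceSampler's output.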
Alternatives
We can integrate the functionality of DistributedSampler inside our custom sampler, but this seems redundant.
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @ssnl @VitalyFedyunin @ejguan @NivekT @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang