
Conversation

@dehuacheng

Summary:
In some cases, for example when training on CTR data, we would like to start training from old samples and finish on the most recent ones.

This diff adds an option to disable shuffling in DistributedSampler to accommodate this use case.

Differential Revision: D16100388
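
A minimal usage sketch of the new flag (the dataset and training loop below are illustrative placeholders, not part of the diff):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Illustrative stand-in for a CTR dataset ordered oldest -> newest.
dataset = TensorDataset(torch.arange(1000).float())

# shuffle=False is the option added by this diff: each rank walks its shard
# of the dataset in the original (chronological) order.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=False)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for epoch in range(3):
    sampler.set_epoch(epoch)  # affects ordering only when shuffle=True
    for (batch,) in loader:
        pass  # training step goes here
```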

@pytorchbot pytorchbot added the module: dataloader Related to torch.utils.data.DataLoader and Sampler label Jul 2, 2019
Contributor

break this out into a separate if/else block, instead of inlining the if/else inside `indices = (`
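
A hedged sketch of what the suggested refactor might look like, pulled out as a standalone helper (the function name and arguments are hypothetical, not the actual diff):

```python
import torch

def build_indices(dataset_len, epoch, shuffle):
    # Explicit if/else block, rather than an inline conditional
    # inside `indices = (...)`.
    if shuffle:
        g = torch.Generator()
        g.manual_seed(epoch)
        indices = torch.randperm(dataset_len, generator=g).tolist()
    else:
        indices = list(range(dataset_len))
    return indices
```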

@soumith
Contributor

soumith commented Jul 3, 2019

I'm also really confused. If you don't need random shuffling, you don't need a DistributedSampler at all, right?
It will just be a pure iterator with no shuffling.

Summary:
Pull Request resolved: #22479

In some cases, for example when training on CTR data, we would like to start training from old samples and finish on the most recent ones.

This diff adds an option to disable shuffling in DistributedSampler to accommodate this use case.

Differential Revision: D16100388

fbshipit-source-id: e0d31328920a4aa2315820683242cc44959c5d82
@soumith
Contributor

soumith commented Jul 6, 2019

okay, never mind, this actually makes sense: it effectively does sharding without shuffling. It took me a second to wrap my head around it.
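
A small illustration of that reading: with shuffle=False the sampler keeps the dataset order, and each rank takes a strided slice of it (simplified sketch; the real sampler also pads the index list so every rank gets the same number of samples):

```python
def shard(dataset_len, rank, num_replicas):
    indices = list(range(dataset_len))  # original order preserved
    return indices[rank::num_replicas]  # this rank's slice

print(shard(10, rank=0, num_replicas=2))  # [0, 2, 4, 6, 8]
print(shard(10, rank=1, num_replicas=2))  # [1, 3, 5, 7, 9]
```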

@facebook-github-bot
Contributor

This pull request has been merged in 7730346.

@ssnl
Collaborator

ssnl commented Jul 6, 2019

test?

@soumith
Contributor

soumith commented Jul 9, 2019

my bad on the code review, I should've requested a test
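
For the record, a hedged sketch of the kind of test that could have been requested here (this is not the test actually added to PyTorch, just an illustration):

```python
import unittest
from torch.utils.data.distributed import DistributedSampler

class TestDistributedSamplerNoShuffle(unittest.TestCase):
    def test_shuffle_false_preserves_order(self):
        dataset = list(range(10))  # the sampler only needs the length
        shards = []
        for rank in range(2):
            sampler = DistributedSampler(
                dataset, num_replicas=2, rank=rank, shuffle=False)
            shards.append(list(iter(sampler)))
        # Each rank's indices stay in the original order...
        for indices in shards:
            self.assertEqual(indices, sorted(indices))
        # ...and together the shards cover the whole dataset.
        self.assertEqual(sorted(shards[0] + shards[1]), list(range(10)))

if __name__ == "__main__":
    unittest.main()
```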
