Skip to content

Conversation

@vfdev-5
Copy link
Contributor

@vfdev-5 vfdev-5 commented May 3, 2018

Addresses the issue #7209

Copy link
Contributor

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no randomness involved. Every time the workers will start, the state of their PRNGs will always be fully determined by 7054 + worker_nr. Let's just move the base_seed out of the if statement. I can't see any issue with that.

@vfdev-5
Copy link
Contributor Author

vfdev-5 commented May 3, 2018

There's no randomness involved. Every time the workers will start, the state of their PRNGs will always be fully determined by 7054 + worker_nr

@apaszke sorry I don't get your comment on "no randomness involved". Worker's PRNG is setup by a seed randomly picked between 0 and 7054 + worker_id, so yes workers seed is determined by current rng state and the interval [0, 7054 + i].

Similarly if base_seed were picked randomly (determined by current rng) and then worker's RNG would be setup by base_seed + worker_id.

Okay, I can change as you ask...

@apaszke
Copy link
Contributor

apaszke commented May 3, 2018

Hmmm ok I think that could have worked, my bad. However, I still think that my solution is better, because yours assumes that you've forked the workers, and inherited the PRNG state from the main process, while in reality someone might be using the spawn start method, and will get a completely different behavior.

@vfdev-5
Copy link
Contributor Author

vfdev-5 commented May 3, 2018

I see, thanks for pointing out on spawn !

I changed the commit

@vfdev-5
Copy link
Contributor Author

vfdev-5 commented May 3, 2018

@apaszke I rechecked my scripts with mp.set_start_method('spawn') and I observe similar behavior as with fork on randomness of the workers.
Actually, even if parent RNG is not inherited in the worker it is not worse, it is even better for randomness of the worker and it keeps parent RNG state before calling _put_indices() and thus keeps batch indices reproducible. But I understand that this leads to undeterministic RNG setup of workers.

@vfdev-5
Copy link
Contributor Author

vfdev-5 commented May 4, 2018

@apaszke it is OK that it fails only on macosx ?

@apaszke apaszke merged commit 6363faf into pytorch:master May 4, 2018
@apaszke
Copy link
Contributor

apaszke commented May 4, 2018

Thanks @vfdev-5!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants