Description
Proposed new feature or change:
While discussing with @betatim, it came up that sklearn would sometimes like to reuse the same random number state (e.g. to split data the same way, potentially many times).
I am not sure whether that is a good idea, but I assume there is a need for it. This issue is to track and gather feedback on whether we should add a .clone() or .copy() method to the rng/bit_generator.
Right now, the best pattern I could think of is to basically do:
import copy

def __init__(self, rng=None):
    # maybe spawn a new one to have one for ourselves
    # (not sure exactly what this would look like; new_rng is a placeholder)
    self._blueprint_rng, = new_rng(rng) or rng.spawn(1)

def method(self):
    # work on a copy so the blueprint itself is never advanced
    rng = copy.deepcopy(self._blueprint_rng)
    # use rng
We have to work with a copy because sharing a single mutable rng across threads would break threading. deepcopy works, and it might be nice to implement __deepcopy__ to reduce the overhead a bit (I think this could easily be <1us rather than >10us).
But the other question is whether copy.deepcopy() is an obvious enough solution to begin with, or whether it wouldn't be better to have an explicit method to make this easy?
Or maybe I am missing a nicer pattern for getting such a "rewinding" rng?
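One candidate for such a pattern, sketched here only as a point of comparison, is to snapshot the bit generator's state dict and build a fresh Generator from it on each use. The helper names snapshot and restore are hypothetical, and looking the class up by name on np.random assumes one of the built-in bit generators (PCG64, MT19937, Philox, SFC64).

```python
import numpy as np


def snapshot(rng):
    # The bit generator exposes its full state as a plain dict.
    return rng.bit_generator.state


def restore(state):
    # Assumption: the state came from a bit generator that lives in
    # np.random, so its class can be looked up by the stored name.
    bit_generator_cls = getattr(np.random, state["bit_generator"])
    bit_generator = bit_generator_cls()
    # Overwrite the freshly seeded state with the saved one.
    bit_generator.state = state
    return np.random.Generator(bit_generator)


base = np.random.default_rng(2024)
saved = snapshot(base)

# Each restored generator replays the stream from the saved point.
a = restore(saved).random(5)
b = restore(saved).random(5)
# The original generator, still at the saved point, yields the same values.
c = base.random(5)
```

This avoids deepcopy but is arguably no more obvious, which is part of why an explicit method may still be worth having.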