Improve data prefetching efficiency #1535
Closed
Description
The current implementation prefetches only a single batch, and only after the previous batch has been consumed by the computation. The initial motivation for this design was probably to avoid thread synchronization, but it means computation and data IO may not overlap maximally. This becomes a serious bottleneck when multiple devices train on the data simultaneously (#1148).
IO efficiency can be improved by continuously prefetching multiple batches, without waiting for the computation thread, and storing them in a thread-safe buffer of limited capacity.