Description
Where is the batch normalization implementation for multi-GPU scenarios? How does one keep track of the mean, variance, offset, and scale in the context of the multi-GPU example given in the CIFAR-10 tutorial?
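For concreteness, here is a rough sketch of the kind of thing I mean. This is my own assumption about how it could work, not an official implementation: the function name `batch_norm` and its parameters are hypothetical, and I'm assuming variable-scope sharing in the style the CIFAR-10 tutorial uses for its other variables.

```python
import tensorflow as tf

def batch_norm(x, is_training, decay=0.999, epsilon=1e-3, scope='bn'):
    """Hypothetical batch norm whose offset/scale and moving statistics
    live in the enclosing variable scope, so towers built with reuse
    enabled share a single set of them."""
    with tf.variable_scope(scope):
        depth = x.get_shape()[-1]
        # Learned offset (beta) and scale (gamma), shared across towers.
        beta = tf.get_variable('beta', [depth],
                               initializer=tf.constant_initializer(0.0))
        gamma = tf.get_variable('gamma', [depth],
                                initializer=tf.constant_initializer(1.0))
        # Moving averages of mean/variance for use at inference time.
        moving_mean = tf.get_variable('moving_mean', [depth],
                                      initializer=tf.constant_initializer(0.0),
                                      trainable=False)
        moving_var = tf.get_variable('moving_variance', [depth],
                                     initializer=tf.constant_initializer(1.0),
                                     trainable=False)
        if is_training:
            # Per-tower batch statistics over all but the channel axis.
            axes = list(range(len(x.get_shape()) - 1))
            mean, var = tf.nn.moments(x, axes=axes)
            # Update the shared moving averages; collect the update ops
            # so the training loop can run them with the train step.
            update_mean = tf.assign(
                moving_mean, moving_mean * decay + mean * (1 - decay))
            update_var = tf.assign(
                moving_var, moving_var * decay + var * (1 - decay))
            tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_mean)
            tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_var)
            return tf.nn.batch_normalization(x, mean, var,
                                             beta, gamma, epsilon)
        else:
            return tf.nn.batch_normalization(x, moving_mean, moving_var,
                                             beta, gamma, epsilon)
```

Presumably, inside the tower loop one would call this under `tf.variable_scope(tf.get_variable_scope(), reuse=(tower_index > 0))`, as the CIFAR-10 tutorial does for its other shared variables. But is this actually the intended pattern, and if so, should the moving averages be updated from one tower's statistics or from all of them?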
Why has this question been left unanswered on Stack Overflow for so long?
For all the beauty it brings with TensorBoard etc., it's kind of appalling to see TensorFlow so far behind Torch in terms of modeling capability. I'd be really glad if someone took responsibility for this and came up with a decent batch normalization implementation covering all cases. And even if one already exists, could someone please document it properly?
There are so many open issues pertaining to batch normalization in TensorFlow. It's important that you straighten this out: batch normalization enables much faster convergence for very deep networks, and it is REALLY important for modern-day deep learning research.
PS: Please forgive my outburst. I've been a Torch user for more than a year, and I had very high hopes for TensorFlow.