
Batch Normalization for Multi-GPU / Data Parallelism #7439

@kiranvaidhya

Description


Where is the batch normalization implementation for multi-GPU scenarios? How does one keep track of the mean, variance, offset, and scale in the context of the multi-GPU example given in the CIFAR-10 tutorial?
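For concreteness, here is a rough sketch of the kind of pattern I am asking about, adapted from the CIFAR-10 multi-GPU tutorial. The names (`tower_model`, `build_towers`) are placeholders, and the choice of `tf.layers.batch_normalization` with the moving-average update ops taken from a single tower is only my guess at the intended approach, not an official recipe:

```python
# Rough sketch (my own guess, not an official TensorFlow recipe) of batch
# normalization in a multi-tower, data-parallel setup, adapted from the
# CIFAR-10 multi-GPU tutorial. `tower_model` is a stand-in for the
# tutorial's per-tower model function.
import tensorflow as tf

def tower_model(images, is_training):
    """Tiny placeholder model; gamma/beta and the moving mean/variance are
    created by tf.layers.batch_normalization inside the shared scope."""
    net = tf.layers.conv2d(images, 64, 3, padding='same')
    net = tf.layers.batch_normalization(net, training=is_training)
    net = tf.nn.relu(net)
    net = tf.reduce_mean(net, axis=[1, 2])  # global average pooling
    return tf.layers.dense(net, 10)

def build_towers(image_splits, label_splits, is_training, num_gpus):
    tower_losses = []
    with tf.variable_scope(tf.get_variable_scope()):
        for i in range(num_gpus):
            with tf.device('/gpu:%d' % i), tf.name_scope('tower_%d' % i):
                logits = tower_model(image_splits[i], is_training)
                tower_losses.append(
                    tf.losses.sparse_softmax_cross_entropy(
                        labels=label_splits[i], logits=logits))
                # Share offset/scale and the moving averages across towers.
                tf.get_variable_scope().reuse_variables()
    # Each tower computes batch statistics from its own shard, so run the
    # moving-average update ops from one tower only (tower_0 here) to avoid
    # racy double updates of the shared moving mean/variance.
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='tower_0')
    total_loss = tf.add_n(tower_losses) / num_gpus
    with tf.control_dependencies(update_ops):
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(total_loss)
    return train_op, total_loss
```

Whether updating the moving averages from a single tower (versus somehow averaging the statistics across GPUs) is the intended approach is exactly the kind of thing I would like the documentation to spell out.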

Why has the question on StackOverflow been left unanswered for so long?

For all the beauty it brings with TensorBoard and the like, it's kind of appalling to see TensorFlow so far behind Torch in terms of modeling capability. I'd be really glad if someone took responsibility and came up with a decent batch normalization implementation for all cases. Even if it already exists, could someone please write proper documentation for it?

There are so many open issues pertaining to batch normalization in TensorFlow. It's important that you straighten this out: batch normalization enables very fast convergence for very deep networks, and it is REALLY important for modern-day deep learning research.

PS: Please forgive my outburst. I've been a Torch user for more than a year, and I had very high hopes for TensorFlow.
