
Synchronize mean/variance of batch normalization when training with multiple GPUs  #1071


Description

@bearpaw

Hi everyone,

Recently, I have found that when training with multiple GPUs using DataParallelTable, the performance is slightly worse than when training with a single GPU. It seems the problem is that the mean and variance of batch normalization are not synchronized across the GPUs, so each replica keeps its own running statistics. Does anybody know a solution to this issue? Thank you very much.
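
One possible workaround (not from the original post; a hedged sketch only) is to periodically average the running mean/variance buffers of the per-GPU batch-normalization replicas so that they agree before evaluation or checkpointing. The sketch below assumes Lua/Torch with `nn`/`cunn`, that you can obtain a table `replicas` holding the per-GPU copies of the network (how to get them depends on the DataParallelTable implementation/version you use), and that the BN layers expose `running_mean`/`running_var` buffers (older `nn` versions used `running_std`). Note this only reconciles the statistics used at test time; true synchronized batch normalization would require cross-GPU communication inside the BN forward pass itself.

```lua
require 'nn'
require 'cunn'

-- Average the running statistics of every batch-normalization layer
-- across all per-GPU replicas, then write the average back into each
-- replica. `replicas` is assumed to be a Lua table of the per-GPU
-- network copies (obtaining it is implementation-dependent).
local function syncBatchNormStats(replicas)
   -- Collect the BN modules of each replica; findModules returns them
   -- in a deterministic order, so index `layer` matches across replicas.
   local bnPerReplica = {}
   for i, replica in ipairs(replicas) do
      bnPerReplica[i] = replica:findModules('nn.SpatialBatchNormalization')
   end

   local nBN = #bnPerReplica[1]
   for layer = 1, nBN do
      -- Accumulate running_mean / running_var on the CPU to avoid
      -- device-to-device copies inside the loop.
      local meanSum = bnPerReplica[1][layer].running_mean:float():clone()
      local varSum  = bnPerReplica[1][layer].running_var:float():clone()
      for i = 2, #replicas do
         meanSum:add(bnPerReplica[i][layer].running_mean:float())
         varSum:add(bnPerReplica[i][layer].running_var:float())
      end
      meanSum:div(#replicas)
      varSum:div(#replicas)

      -- Copy the averaged statistics back into every replica.
      for i = 1, #replicas do
         bnPerReplica[i][layer].running_mean:copy(meanSum)
         bnPerReplica[i][layer].running_var:copy(varSum)
      end
   end
end
```

Calling `syncBatchNormStats(replicas)` once per epoch, or right before switching the model to `evaluate()`, should be sufficient in this sketch, since the running statistics only affect inference; the per-batch statistics used during training would still be computed independently on each GPU.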
