When using gluon.nn.BatchNorm(scale=False) on GPU, the computed gradient for beta is incorrect: it appears to be accumulated across iterations instead of being reset. With scale=True, or when running on CPU, the gradient is correct.
This bug can make a network hard to converge during training.
Environment info (Required)
CentOS Linux release 7.2.1511 (Core)
GTX 1080Ti
Driver Version: 384.69
CUDA Version 9.0.176
installed with pip:
numpy 1.17.2
mxnet-cu90 1.5.0
Code
In this example, the grad of beta should be [1, 1, 1] at every iteration.
import mxnet as mx
from mxnet import gluon, autograd

ctx = mx.gpu()
x = mx.nd.ones((1, 3, 1, 1), ctx=ctx)
net = gluon.nn.BatchNorm(scale=False, epsilon=2e-5, momentum=0.0)
net.initialize(ctx=ctx)
trainer = gluon.Trainer(params=net.collect_params(),
                        optimizer='sgd',
                        optimizer_params={'learning_rate': 0.01, 'wd': 0.0005, 'momentum': 0.9})
net.hybridize()
for i in range(10):
    with autograd.record():
        out = net(x)
    out.backward()
    trainer.step(x.shape[0])
    for name, param in net.collect_params().items():
        if 'beta' in name:
            print(name, param.grad(ctx).asnumpy())
Output:
batchnorm0_beta [1. 1. 1.]
batchnorm0_beta [2. 2. 2.]
batchnorm0_beta [3. 3. 3.]
batchnorm0_beta [4. 4. 4.]
batchnorm0_beta [5. 5. 5.]
batchnorm0_beta [6. 6. 6.]
batchnorm0_beta [7. 7. 7.]
batchnorm0_beta [8. 8. 8.]
batchnorm0_beta [9. 9. 9.]
batchnorm0_beta [10. 10. 10.]