report an error if num_channels is not divisible by num_groups for nn.GroupNorm #74293
XiaobingSuper wants to merge 3 commits into pytorch:master
jbschlosser
left a comment
Thanks for the fix! I'm good with this.
Do you mind rebasing on master? That should hopefully fix the failing checks.
@XiaobingSuper Looks like the entry for |
@pytorchbot merge this please |
|
Hey @XiaobingSuper. |
….GroupNorm (#74293)

Summary: For a GroupNorm module, if num_channels is not divisible by num_groups, the error should be reported when the module is defined rather than when it is first run.

example:
```
import torch
m = torch.nn.GroupNorm(5, 6)
x = torch.randn(1, 6, 4, 4)
y = m(x)
```

before:
```
Traceback (most recent call last):
  File "group_norm_test.py", line 8, in <module>
    y = m(x)
  File "/home/xiaobinz/miniconda3/envs/pytorch_mater/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xiaobinz/miniconda3/envs/pytorch_mater/lib/python3.7/site-packages/torch/nn/modules/normalization.py", line 271, in forward
    input, self.num_groups, self.weight, self.bias, self.eps)
  File "/home/xiaobinz/miniconda3/envs/pytorch_mater/lib/python3.7/site-packages/torch/nn/functional.py", line 2500, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Expected number of channels in input to be divisible by num_groups, but got input of shape [1, 6, 4, 4] and num_groups=5
```

after:
```
Traceback (most recent call last):
  File "group_norm_test.py", line 6, in <module>
    m = torch.nn.GroupNorm(5, 6)
  File "/home/xiaobinz/miniconda3/envs/pytorch_test/lib/python3.7/site-packages/torch/nn/modules/normalization.py", line 251, in __init__
    raise ValueError('num_channels must be divisible by num_groups')
```

This PR also updates the doc of num_groups.

Pull Request resolved: #74293
Approved by: https://github.com/jbschlosser

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/4a0f6e6c530b624bb5a4fbcaebe3fd43b1ff66c3

Reviewed By: malfet, osalpekar

Differential Revision: D34968521

fbshipit-source-id: dde19a3f642924cbd851ee59c6f52aa4ee37cf0c
|
Hey @XiaobingSuper, thanks for making this change! It seems like this PR may have broken some tests in the PyTorch Opacus project, however (see tests here: https://github.com/pytorch/opacus/tree/main/opacus/tests). We'd love it if you could help address some of these test failures. Looping in @JohnlNguyen from the Opacus team who could provide some context. Going forward, we'd love to find a way to provide this signal to you in advance. |
|
Hey @XiaobingSuper, Do you mind sending in the fix for Opacus? I think it should be a quick fix. |
Yes, there is a fix for Opacus: meta-pytorch/opacus#390 |
The main root cause was that torch.nn.GroupNorm changed its behavior in newer PyTorch versions (pytorch/pytorch#74293). This PR fixes RetinaNetHead to match the new behavior.

Also, add_export_config was not exposed when caffe2 was not compiled along with PyTorch, which prevented users with legacy scripts from calling it. This PR exposes it and adds a warning message informing users that it will be deprecated in future versions.

This PR also adds torch_export_onnx.py with several unit tests for detectron2 models. ONNX is added as a dependency to the CI so that these tests can run.

Lastly, because detectron2 CI still uses old PyTorch versions (1.8, 1.9 and 1.10), this PR adds custom symbolic functions to fix bugs on these versions.
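
For downstream code that passes a fixed group count to torch.nn.GroupNorm, one possible way to stay compatible with the stricter constructor is to fall back to the largest group count that divides the channel count. The sketch below is only an illustration: `largest_valid_num_groups` is a hypothetical helper, and nothing here is taken from the actual RetinaNetHead fix.

```python
def largest_valid_num_groups(num_channels: int, max_groups: int) -> int:
    """Hypothetical helper: largest g <= max_groups with num_channels % g == 0."""
    if num_channels <= 0 or max_groups <= 0:
        raise ValueError("num_channels and max_groups must be positive")
    # Walk downwards from the preferred group count; 1 always divides,
    # so the loop is guaranteed to return.
    for groups in range(min(max_groups, num_channels), 0, -1):
        if num_channels % groups == 0:
            return groups
    return 1  # not reached; kept so every code path returns an int


if __name__ == "__main__":
    # 6 channels cannot be split into 5 groups, so fall back to 3.
    print(largest_valid_num_groups(6, 5))
```

The returned value could then be passed as num_groups when building the layer. Whether silently changing the group count is acceptable depends on the model, so raising an explicit error may be the better choice in some code bases.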
For a GroupNorm module, if num_channels is not divisible by num_groups, the error should be reported when the module is defined rather than when it is first run.
example:
before:
after:
This PR also updates the doc of num_groups.
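
As a rough illustration of the behavior change, the check can be sketched in plain Python without importing torch. `GroupNormConfig` and `try_build` are hypothetical names standing in for the argument validation in nn.GroupNorm's `__init__`; only the error message is taken from the traceback in the merged commit message.

```python
class GroupNormConfig:
    """Hypothetical stand-in for the argument check in nn.GroupNorm.__init__."""

    def __init__(self, num_groups: int, num_channels: int) -> None:
        # Fail at definition time, before any input tensor is seen.
        if num_channels % num_groups != 0:
            raise ValueError('num_channels must be divisible by num_groups')
        self.num_groups = num_groups
        self.num_channels = num_channels


def try_build(num_groups: int, num_channels: int) -> str:
    """Return 'ok', or the error message if construction is rejected."""
    try:
        GroupNormConfig(num_groups, num_channels)
    except ValueError as exc:
        return str(exc)
    return 'ok'


if __name__ == "__main__":
    print(try_build(3, 6))  # valid: 6 channels split into 3 groups
    print(try_build(5, 6))  # rejected at construction, no forward pass needed
```

The point of the sketch is the timing: the invalid (5, 6) combination is rejected as soon as the object is built, so a misconfigured layer no longer survives until the first forward call.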