Conversation

@Kaixhin (Contributor) commented Nov 9, 2018

Fixes #12259. Needs to make sure the tests (see #13766) don't break due to numerical precision issues; not sure what would need to be adjusted here...
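For context, the change under review (per the PR title and the linked issue) switches the BatchNorm affine scale ("multiplier") initialization from random uniform to ones. A minimal sketch of the resulting behavior, assuming current torch.nn semantics — the before/after comments paraphrase the change, they are not the actual diff:

```python
# Sketch (assumption: paraphrased, not the actual diff). Before this PR,
# _BatchNorm.reset_parameters drew the affine scale from U(0, 1); after it,
# the scale starts at 1, so a freshly constructed BatchNorm layer initially
# applies an identity affine transform on top of the normalization.
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=4)
print(bn.weight.data)  # all ones after the fix (previously random uniform)
print(bn.bias.data)    # all zeros; the bias init is unchanged by this PR
```

This is why the PR is BC-breaking: any model relying on the old random-uniform scale at init will now start from different weights.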

@yinghai yinghai requested a review from bddppq November 9, 2018 18:40
ezyang previously approved these changes Nov 15, 2018

@ezyang (Contributor) commented Nov 15, 2018

But it seems the tests are still failing:

Nov 09 17:50:06 
Nov 09 17:50:06 =================================== FAILURES ===================================
Nov 09 17:50:06 ___________________________ TestModels.test_densenet ___________________________
Nov 09 17:50:06 
Nov 09 17:50:06 self = <test_models.TestModels testMethod=test_densenet>
Nov 09 17:50:06 
Nov 09 17:50:06     def test_densenet(self):
Nov 09 17:50:06         # Densenet-121 model
Nov 09 17:50:06         x = Variable(torch.randn(BATCH_SIZE, 3, 224, 224).fill_(1.0))
Nov 09 17:50:06 >       self.exportTest(toC(densenet121()), toC(x))
Nov 09 17:50:06 
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/test_models.py:149: 
Nov 09 17:50:06 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/test_models.py:51: in exportTest
Nov 09 17:50:06     verify(model, inputs, backend, rtol=rtol, atol=atol)
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:445: in verify
Nov 09 17:50:06     run(randomize_args(args))
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:425: in run
Nov 09 17:50:06     run_helper(torch_out, args)
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:439: in run_helper
Nov 09 17:50:06     errs.checkAlmostEqual(x.data.cpu().numpy(), y, "In output {}".format(i))
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:60: in checkAlmostEqual
Nov 09 17:50:06     self.almostEqualAndThen(x, y, msg, self.addErr)
Nov 09 17:50:06 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
Nov 09 17:50:06 
Nov 09 17:50:06 self = <verify.Errors object at 0x7f2ae2da5c50>
Nov 09 17:50:06 x = array([[ 0.11339295,  0.52664566, -0.5282817 , ...,  0.60019535,
Nov 09 17:50:06         -0.70....5219744 , ...,  0.604505  ,
Nov 09 17:50:06         -0.7020387 ,  0.06566978]], dtype=float32)
Nov 09 17:50:06 y = array([[ 0.11339286,  0.5266453 , -0.5282816 , ...,  0.6001954 ,
Nov 09 17:50:06         -0.70....5219741 , ...,  0.60450494,
Nov 09 17:50:06         -0.70203865,  0.06566991]], dtype=float32)
Nov 09 17:50:06 msg = 'In output 0'
Nov 09 17:50:06 k = <bound method Errors.addErr of <verify.Errors object at 0x7f2ae2da5c50>>
Nov 09 17:50:06 
Nov 09 17:50:06     def almostEqualAndThen(self, x, y, msg, k):
Nov 09 17:50:06         """
Nov 09 17:50:06             Helper for implementing 'requireAlmostEqual' and 'checkAlmostEqual'.
Nov 09 17:50:06             Upon failure, invokes continuation 'k' with the error message.
Nov 09 17:50:06     
Nov 09 17:50:06             At the moment, only tests on 'numpy.ndarray' are supported.
Nov 09 17:50:06             """
Nov 09 17:50:06         if isinstance(x, np.ndarray) and isinstance(y, np.ndarray):
Nov 09 17:50:06             try:
Nov 09 17:50:06 >               np.testing.assert_allclose(x, y, rtol=self.rtol, atol=self.atol, equal_nan=False, verbose=True)
Nov 09 17:50:06 E               AssertionError: 
Nov 09 17:50:06 E               Not equal to tolerance rtol=0.01, atol=1e-07
Nov 09 17:50:06 E               
Nov 09 17:50:06 E               (mismatch 0.05%)
Nov 09 17:50:06 E                x: array([ 0.113393,  0.526646, -0.528282, ...,  0.604505, -0.702039,
Nov 09 17:50:06 E                       0.06567 ], dtype=float32)
Nov 09 17:50:06 E                y: array([ 0.113393,  0.526645, -0.528282, ...,  0.604505, -0.702039,
Nov 09 17:50:06 E                       0.06567 ], dtype=float32)
Nov 09 17:50:06 
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:71: AssertionError
Nov 09 17:50:06 ______________________ TestCaffe2BackendEmbed.test_resnet ______________________
Nov 09 17:50:06 
Nov 09 17:50:06 self = <test_pytorch_onnx_caffe2.TestCaffe2BackendEmbed testMethod=test_resnet>
Nov 09 17:50:06 
Nov 09 17:50:06     def test_resnet(self):
Nov 09 17:50:06         state_dict = model_zoo.load_url(model_urls['resnet50'], progress=False)
Nov 09 17:50:06         self.run_model_test(resnet50(), train=False, batch_size=BATCH_SIZE,
Nov 09 17:50:06 >                           state_dict=state_dict, atol=1e-6)
Nov 09 17:50:06 
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/test_pytorch_onnx_caffe2.py:405: 
Nov 09 17:50:06 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/test_pytorch_onnx_caffe2.py:186: in run_model_test
Nov 09 17:50:06     example_outputs=example_outputs)
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/test_pytorch_onnx_caffe2.py:177: in run_actual_test
Nov 09 17:50:06     verify.verify(model, input, c2, rtol=rtol, atol=atol)
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:445: in verify
Nov 09 17:50:06     run(randomize_args(args))
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:425: in run
Nov 09 17:50:06     run_helper(torch_out, args)
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:439: in run_helper
Nov 09 17:50:06     errs.checkAlmostEqual(x.data.cpu().numpy(), y, "In output {}".format(i))
Nov 09 17:50:06 /var/lib/jenkins/workspace/test/onnx/verify.py:60: in checkAlmostEqual
Nov 09 17:50:06     self.almostEqualAndThen(x, y, msg, self.addErr)
Nov 09 17:50:06 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
Nov 09 17:50:06 
Nov 09 17:50:06 self = <verify.Errors object at 0x7f2a8e023ed0>
Nov 09 17:50:06 x = array([[-2.312549  , -0.6950716 , -0.26810262, ..., -1.5320209 ,
Nov 09 17:50:06         -1.64....25414407, ..., -1.3733028 ,
Nov 09 17:50:06         -1.3013324 ,  2.2900457 ]], dtype=float32)
Nov 09 17:50:06 y = array([[-2.3125494 , -0.69507074, -0.26810348, ..., -1.5320204 ,
Nov 09 17:50:06         -1.64....254143  , ..., -1.3733065 ,
Nov 09 17:50:06         -1.301334  ,  2.290045  ]], dtype=float32)
Nov 09 17:50:06 msg = 'In output 0'
Nov 09 17:50:06 k = <bound method Errors.addErr of <verify.Errors object at 0x7f2a8e023ed0>>
Nov 09 17:50:06 
Nov 09 17:50:06     def almostEqualAndThen(self, x, y, msg, k):
Nov 09 17:50:06         """
Nov 09 17:50:06             Helper for implementing 'requireAlmostEqual' and 'checkAlmostEqual'.
Nov 09 17:50:06             Upon failure, invokes continuation 'k' with the error message.
Nov 09 17:50:06     
Nov 09 17:50:06             At the moment, only tests on 'numpy.ndarray' are supported.
Nov 09 17:50:06             """
Nov 09 17:50:06         if isinstance(x, np.ndarray) and isinstance(y, np.ndarray):
Nov 09 17:50:06             try:
Nov 09 17:50:06 >               np.testing.assert_allclose(x, y, rtol=self.rtol, atol=self.atol, equal_nan=False, verbose=True)
Nov 09 17:50:06 E               AssertionError: 
Nov 09 17:50:06 E               Not equal to tolerance rtol=0.001, atol=1e-06
Nov 09 17:50:06 E               
Nov 09 17:50:06 E               (mismatch 0.05%)
Nov 09 17:50:06 E                x: array([-2.312549, -0.695072, -0.268103, ..., -1.373303, -1.301332,
Nov 09 17:50:06 E                       2.290046], dtype=float32)
Nov 09 17:50:06 E                y: array([-2.312549, -0.695071, -0.268103, ..., -1.373307, -1.301334,
Nov 09 17:50:06 E                       2.290045], dtype=float32)
Nov 09 17:50:06 

@Kaixhin (Contributor, Author) commented Nov 15, 2018

Yep, it seems like numerical precision issues, but I'm not entirely sure where these are creeping in, or what the fix for the tests should be. Was the tolerance set too high previously, or is there a genuine problem somewhere in the backend?
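One way to gauge whether this is float32 noise rather than a backend bug is to compute the relative error of the sample values printed in the logs above. float32 carries roughly 7 significant decimal digits, so cross-backend relative errors on the order of 1e-7 to 1e-6 are expected rounding noise. A small check (values copied from the logs; this doesn't settle which elements tripped the 0.05% mismatch):

```python
# Relative error of (PyTorch, Caffe2) output pairs copied from the CI logs.
# Errors in the 1e-7..1e-6 range are consistent with float32 rounding
# differences between backends rather than an algorithmic bug.
pairs = [
    (0.11339295, 0.11339286),   # densenet output
    (-2.312549, -2.3125494),    # resnet output
    (2.2900457, 2.290045),      # resnet output
]
for x, y in pairs:
    rel = abs(x - y) / abs(y)
    print(f"{x:+.7f} vs {y:+.7f}: relative error {rel:.2e}")
```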

fmassa previously approved these changes Nov 20, 2018

@ezyang (Contributor) commented Dec 6, 2018

I don't know. Some investigation will be needed.

@zou3519 zou3519 added the "awaiting response" (deprecated) label Dec 11, 2018
@ezyang ezyang dismissed stale reviews from fmassa and themself June 6, 2019 15:21, with the reason: "numerical precision problems"

@ezyang ezyang removed the "awaiting response" (deprecated) label Jun 6, 2019
@ezyang (Contributor) commented Jun 6, 2019

Well, it looks like densenet was disabled on master, so we might be able to land this :>

@pytorchbot pytorchbot added the "module: nn" (Related to torch.nn) and "module: onnx" (Related to torch.onnx) labels Jun 6, 2019
Signed-off-by: Edward Z. Yang <[email protected]>
@ezyang (Contributor) commented Jun 7, 2019

@pytorchbot retest this please

@ezyang ezyang changed the title Fix batch norm multiplier init [BC-BREAKING] Fix batch norm multiplier init Jun 7, 2019
@facebook-github-bot (Contributor) left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented

@ezyang merged this pull request in c604658.

@gchanan gchanan added the "module: bc-breaking" (Related to a BC-breaking change) label Aug 2, 2019
Development

Successfully merging this pull request may close these issues.

Set Batchnorm weight scalar initialization to unit (not random uniform)

7 participants