@qingyunqu

Related #55070

@facebook-github-bot
Contributor

facebook-github-bot commented Jun 24, 2021

💊 CI failures summary and remediations

As of commit 75eb171 (more details on the Dr. CI page):


  • 8/8 failures possibly* introduced in this PR
    • 1/8 non-scanned failure(s)

🕵️ 7 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_6_clang9_noarch_test (1/7)

Step: "Run tests"

Jul 01 13:56:46 RuntimeError: test_ops failed!
Jul 01 13:56:44 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestCommonMETA-20210701134549.xml
Jul 01 13:56:44 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsCPU-20210701134549.xml
Jul 01 13:56:44 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitCPU-20210701134549.xml
Jul 01 13:56:44 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsMETA-20210701134549.xml
Jul 01 13:56:44 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitMETA-20210701134549.xml
Jul 01 13:56:46 Traceback (most recent call last):
Jul 01 13:56:46   File "test/run_test.py", line 1308, in <module>
Jul 01 13:56:46     main()
Jul 01 13:56:46   File "test/run_test.py", line 1287, in main
Jul 01 13:56:46     raise RuntimeError(err_message)
Jul 01 13:56:46 RuntimeError: test_ops failed!
Jul 01 13:56:47 
Jul 01 13:56:47 real	49m15.370s
Jul 01 13:56:47 user	54m51.805s
Jul 01 13:56:47 sys	8m12.952s
Jul 01 13:56:47 + cleanup
Jul 01 13:56:47 + retcode=1
Jul 01 13:56:47 + set +x
Jul 01 13:56:47 =================== sccache compilation log ===================
Jul 01 13:56:47 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 01 13:56:47 Compile requests                      87

See CircleCI build pytorch_macos_10_13_py3_test (2/7)

Step: "Test"

Jul 01 13:48:33 RuntimeError: test_ops failed!
Jul 01 13:48:28 
Jul 01 13:48:28 Generating XML reports...
Jul 01 13:48:29 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestCommonCPU-20210701133054.xml
Jul 01 13:48:30 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsCPU-20210701133054.xml
Jul 01 13:48:30 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitCPU-20210701133054.xml
Jul 01 13:48:33 Traceback (most recent call last):
Jul 01 13:48:33   File "test/run_test.py", line 1308, in <module>
Jul 01 13:48:33     main()
Jul 01 13:48:33   File "test/run_test.py", line 1287, in main
Jul 01 13:48:33     raise RuntimeError(err_message)
Jul 01 13:48:33 RuntimeError: test_ops failed!
Jul 01 13:48:34 + cleanup
Jul 01 13:48:34 + retcode=1
Jul 01 13:48:34 + set +x


Exited with code exit status 1

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_test2 (3/7)

Step: "Run tests"

Jul 01 16:32:56 RuntimeError: test_ops failed!
Jul 01 16:32:53 
Jul 01 16:32:53 Generating XML reports...
Jul 01 16:32:53 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestCommonCUDA-20210701161553.xml
Jul 01 16:32:53 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsCUDA-20210701161553.xml
Jul 01 16:32:53 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitCUDA-20210701161553.xml
Jul 01 16:32:56 Traceback (most recent call last):
Jul 01 16:32:56   File "test/run_test.py", line 1308, in <module>
Jul 01 16:32:56     main()
Jul 01 16:32:56   File "test/run_test.py", line 1287, in main
Jul 01 16:32:56     raise RuntimeError(err_message)
Jul 01 16:32:56 RuntimeError: test_ops failed!
Jul 01 16:32:56 + cleanup
Jul 01 16:32:56 + retcode=1
Jul 01 16:32:56 + set +x
Jul 01 16:32:56 =================== sccache compilation log ===================
Jul 01 16:32:57 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 01 16:32:57 Compile requests                      0
Jul 01 16:32:57 Compile requests executed             0
Jul 01 16:32:57 Cache hits                            0
Jul 01 16:32:57 Cache misses                          0
Jul 01 16:32:57 Cache timeouts                        0

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test1 (4/7)

Step: "Run tests"

Jul 01 14:42:39 RuntimeError: test_ops failed!
Jul 01 14:42:31 
Jul 01 14:42:31 Generating XML reports...
Jul 01 14:42:31 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCommonCPU-20210701134747.xml
Jul 01 14:42:31 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestGradientsCPU-20210701134747.xml
Jul 01 14:42:31 Generated XML report: test-reports/python-unittest/test_ops/TEST-TestJitCPU-20210701134747.xml
Jul 01 14:42:39 Traceback (most recent call last):
Jul 01 14:42:39   File "test/run_test.py", line 1308, in <module>
Jul 01 14:42:39     main()
Jul 01 14:42:39   File "test/run_test.py", line 1287, in main
Jul 01 14:42:39     raise RuntimeError(err_message)
Jul 01 14:42:39 RuntimeError: test_ops failed!
Jul 01 14:42:39 + cleanup
Jul 01 14:42:39 + retcode=1
Jul 01 14:42:39 + set +x
Jul 01 14:42:39 =================== sccache compilation log ===================
Jul 01 14:42:39 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 01 14:42:40 Compile requests                      0
Jul 01 14:42:40 Compile requests executed             0
Jul 01 14:42:40 Cache hits                            0
Jul 01 14:42:40 Cache misses                          0
Jul 01 14:42:40 Cache timeouts                        0

See CircleCI build pytorch_linux_bionic_cuda10_2_cudnn7_py3_9_gcc7_test2 (5/7)

Step: "Run tests"

Jul 01 15:33:44 RuntimeError: test_ops failed!
Jul 01 15:33:41 
Jul 01 15:33:41 Generating XML reports...
Jul 01 15:33:42 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestCommonCUDA-20210701151722.xml
Jul 01 15:33:42 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsCUDA-20210701151722.xml
Jul 01 15:33:42 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitCUDA-20210701151722.xml
Jul 01 15:33:44 Traceback (most recent call last):
Jul 01 15:33:44   File "/var/lib/jenkins/workspace/test/run_test.py", line 1308, in <module>
Jul 01 15:33:44     main()
Jul 01 15:33:44   File "/var/lib/jenkins/workspace/test/run_test.py", line 1287, in main
Jul 01 15:33:44     raise RuntimeError(err_message)
Jul 01 15:33:44 RuntimeError: test_ops failed!
Jul 01 15:33:45 
Jul 01 15:33:45 real	46m20.230s
Jul 01 15:33:45 user	47m2.085s
Jul 01 15:33:45 sys	17m7.770s
Jul 01 15:33:45 + cleanup
Jul 01 15:33:45 + retcode=1
Jul 01 15:33:45 + set +x
Jul 01 15:33:45 =================== sccache compilation log ===================
Jul 01 15:33:45 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 01 15:33:45 Compile requests                      0

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_test1 (6/7)

Step: "Run tests"

Jul 01 17:12:32 AssertionError: False is not true : Scalars failed to compare as equal! 0 != 1
Jul 01 17:12:32   test_math_autogradcpu (__main__.TestPythonDispatcher) ... ok (0.001s)
Jul 01 17:12:32 
Jul 01 17:12:32 ======================================================================
Jul 01 17:12:32 FAIL [0.001s]: test_find_dangling_impls (__main__.TestPythonDispatcher)
Jul 01 17:12:32 ----------------------------------------------------------------------
Jul 01 17:12:32 Traceback (most recent call last):
Jul 01 17:12:32   File "test_dispatch.py", line 891, in test_find_dangling_impls
Jul 01 17:12:32     self.assertEqual(0, len(C._dispatch_find_dangling_impls()))
Jul 01 17:12:32   File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1498, in assertEqual
Jul 01 17:12:32     super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
Jul 01 17:12:32 AssertionError: False is not true : Scalars failed to compare as equal! 0 != 1
Jul 01 17:12:32 
Jul 01 17:12:32 ----------------------------------------------------------------------
Jul 01 17:12:32 Ran 29 tests in 23.670s
Jul 01 17:12:32 
Jul 01 17:12:32 FAILED (failures=1)
Jul 01 17:12:32 
Jul 01 17:12:32 Generating XML reports...
Jul 01 17:12:32 Generated XML report: test-reports/dist-gloo/test_dispatch/TEST-TestDispatch-20210701171209.xml
Jul 01 17:12:32 Generated XML report: test-reports/dist-gloo/test_dispatch/TEST-TestPythonDispatcher-20210701171209.xml
Jul 01 17:12:32 Traceback (most recent call last):

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (7/7)

Step: "Run tests"

Jul 01 14:14:50 RuntimeError: test_ops failed!
Jul 01 14:14:48 
Jul 01 14:14:48 Generating XML reports...
Jul 01 14:14:48 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestCommonCPU-20210701140109.xml
Jul 01 14:14:48 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestGradientsCPU-20210701140109.xml
Jul 01 14:14:48 Generated XML report: test-reports/dist-gloo/test_ops/TEST-TestJitCPU-20210701140109.xml
Jul 01 14:14:50 Traceback (most recent call last):
Jul 01 14:14:50   File "test/run_test.py", line 1308, in <module>
Jul 01 14:14:50     main()
Jul 01 14:14:50   File "test/run_test.py", line 1287, in main
Jul 01 14:14:50     raise RuntimeError(err_message)
Jul 01 14:14:50 RuntimeError: test_ops failed!
Jul 01 14:14:51 =================== sccache compilation log ===================
Jul 01 14:14:51 + cleanup
Jul 01 14:14:51 + retcode=1
Jul 01 14:14:51 + set +x
Jul 01 14:14:51 =========== If your build fails, please take a look at the log above for possible reasons ===========
Jul 01 14:14:51 Compile requests                      0
Jul 01 14:14:51 Compile requests executed             0
Jul 01 14:14:51 Cache hits                            0
Jul 01 14:14:51 Cache misses                          0
Jul 01 14:14:51 Cache timeouts                        0

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@qingyunqu
Author

@ezyang The behavior differs before and after the structured-kernel port. Here is an example:

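import torch

M = torch.randn(1)               # shape (1,), not the (3, 5) result shape
batch1 = torch.randn(10, 3, 4)
batch2 = torch.randn(10, 4, 5)
M.addbmm_(batch1, batch2)        # in-place: M would have to be broadcast to (3, 5)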

Before the port, M was first broadcast to the result shape. After the port, this call throws a runtime error. The same thing happened with at::addmm(); see #57417.

How should I deal with this?

@ezyang
Contributor

ezyang commented Jun 25, 2021

@qingyunqu Based on how you stated it, I think the new behavior is preferred. When we do inplace operations we are not supposed to broadcast inplace. I'm a little puzzled by the test failures though, do you know what's going on there?
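The shape rule behind this can be sketched in plain Python, without PyTorch. This is only an illustration of the reasoning in this thread, not the actual kernel code: the helper names (`broadcast_shape`, `addbmm_result_shape`, `out_of_place_shape`, `in_place_shape`) are made up for this sketch, and the error message is not the one PyTorch emits. The out-of-place form may expand the input to the result shape, while the in-place form has to write into the existing tensor, so its shape must already match.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Broadcast two shapes, aligning dimensions from the right."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # A dimension of size 1 stretches to match the other one.
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def addbmm_result_shape(batch1, batch2):
    """(b, n, m) and (b, m, p) are multiplied per batch and summed over b."""
    b1, n, m1 = batch1
    b2, m2, p = batch2
    if b1 != b2 or m1 != m2:
        raise ValueError("batch1 and batch2 have incompatible shapes")
    return (n, p)


def out_of_place_shape(self_shape, batch1, batch2):
    """Out-of-place: the input only needs to be broadcastable to (n, p)."""
    result = addbmm_result_shape(batch1, batch2)
    if broadcast_shape(tuple(self_shape), result) != result:
        raise ValueError("input cannot be broadcast to the result shape")
    return result


def in_place_shape(self_shape, batch1, batch2):
    """In-place: the input is also the output, so it cannot be resized."""
    result = addbmm_result_shape(batch1, batch2)
    if tuple(self_shape) != result:
        raise RuntimeError(
            f"in-place input has shape {tuple(self_shape)}, expected {result}"
        )
    return result
```

With the shapes from the example above, `out_of_place_shape((1,), (10, 3, 4), (10, 4, 5))` returns `(3, 5)`, while `in_place_shape` with the same arguments raises `RuntimeError`.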

@ngimel added the triaged label Jun 25, 2021
@qingyunqu
Author

> @qingyunqu Based on how you stated it, I think the new behavior is preferred. When we do inplace operations we are not supposed to broadcast inplace. I'm a little puzzled by the test failures though, do you know what's going on there?

I'm just worried about breaking backward compatibility. I will look into the test failures tomorrow.

@facebook-github-bot
Contributor

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ezyang
Contributor

ezyang commented Jul 1, 2021

yeah ok looks like we broke tests

Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 378, in instantiated_test
    raise rte
  File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 373, in instantiated_test
    result = test(self, **param_kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 780, in test_wrapper
    return test(*args, **kwargs)
  File "test_ops.py", line 427, in test_conj_view
    inplace_forward = inplace_variant(cloned2, *sample.args, **sample.kwargs)
RuntimeError: The input tensor must be a matrix with size 5x10, but got a 1-D tensor with size 1x0
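One plausible reading of this traceback is that the test passes a sample whose input would need broadcasting to the in-place variant, which the new shape check rejects. A minimal sketch of skipping such samples for in-place calls follows. The `Sample` class and `inplace_safe` helper are hypothetical stand-ins written for this illustration, not PyTorch's test utilities, and the shapes are chosen only to mirror the 5x10 expectation in the error message.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """Hypothetical stand-in for a test sample: input shape and result shape."""
    self_shape: Tuple[int, ...]
    result_shape: Tuple[int, ...]

    @property
    def broadcasts_self(self) -> bool:
        # The input needs broadcasting whenever it is not already result-shaped.
        return self.self_shape != self.result_shape


def inplace_safe(samples: List[Sample]) -> List[Sample]:
    """Keep only the samples that an in-place variant can accept."""
    return [s for s in samples if not s.broadcasts_self]


samples = [
    Sample((5, 10), (5, 10)),  # already result-shaped: fine for in-place
    Sample((1,), (5, 10)),     # would need broadcasting: skip for in-place
    Sample((), (5, 10)),       # scalar input: also needs broadcasting
]

for s in inplace_safe(samples):
    print("in-place ok:", s.self_shape)
```

The out-of-place variant can still be exercised on every sample; only the in-place call needs the filter.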

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Apr 13, 2022
@github-actions github-actions bot closed this May 13, 2022

Labels

cla signed, open source, Stale, triaged
