Conversation

@dagitses (Collaborator) commented on May 21, 2022

Stack from ghstack (oldest at bottom):

This was never intended to be supported.

Differential Revision: D36567054

@facebook-github-bot (Contributor) commented on May 21, 2022

❌ 4 New Failures

As of commit 1ff1089 (more details on the Dr. CI page):

  • 4/4 failures introduced in this PR

🕵️ 4 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 2, 4, linux.4xlarge.nvidia.gpu) (1/4)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-06-16T11:07:53.4450178Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1184, in set_rng_seed
2022-06-16T11:07:53.4450594Z     torch.manual_seed(seed)
2022-06-16T11:07:53.4451070Z   File "/opt/conda/lib/python3.7/site-packages/torch/random.py", line 40, in manual_seed
2022-06-16T11:07:53.4451437Z     torch.cuda.manual_seed_all(seed)
2022-06-16T11:07:53.4451945Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 113, in manual_seed_all
2022-06-16T11:07:53.4452705Z     _lazy_call(cb, seed_all=True)
2022-06-16T11:07:53.4453354Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 156, in _lazy_call
2022-06-16T11:07:53.4453718Z     callable()
2022-06-16T11:07:53.4454166Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 111, in cb
2022-06-16T11:07:53.4454536Z     default_generator.manual_seed(seed)
2022-06-16T11:07:53.4454925Z RuntimeError: CUDA error: an illegal memory access was encountered
2022-06-16T11:07:53.4455421Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2022-06-16T11:07:53.4455904Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2022-06-16T11:07:53.4456108Z 
2022-06-16T11:07:53.4456392Z ----------------------------------------------------------------------
2022-06-16T11:07:53.4456742Z Ran 150 tests in 37.407s
2022-06-16T11:07:53.4456915Z 
2022-06-16T11:07:53.4457066Z FAILED (errors=1, expected failures=3)
2022-06-16T11:07:53.4457271Z 
2022-06-16T11:07:53.4457379Z Generating XML reports...
2022-06-16T11:07:53.4647227Z Generated XML report: test-reports/python-unittest/test_decomp/TEST-TestDecompCUDA-20220616110715.xml

See GitHub Actions build pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 1, 4, linux.4xlarge.nvidia.gpu) (2/4)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-06-16T11:07:35.7112265Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1184, in set_rng_seed
2022-06-16T11:07:35.7112656Z     torch.manual_seed(seed)
2022-06-16T11:07:35.7113145Z   File "/opt/conda/lib/python3.7/site-packages/torch/random.py", line 40, in manual_seed
2022-06-16T11:07:35.7113826Z     torch.cuda.manual_seed_all(seed)
2022-06-16T11:07:35.7114481Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 113, in manual_seed_all
2022-06-16T11:07:35.7114856Z     _lazy_call(cb, seed_all=True)
2022-06-16T11:07:35.7115346Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 156, in _lazy_call
2022-06-16T11:07:35.7115709Z     callable()
2022-06-16T11:07:35.7116135Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 111, in cb
2022-06-16T11:07:35.7116538Z     default_generator.manual_seed(seed)
2022-06-16T11:07:35.7116923Z RuntimeError: CUDA error: an illegal memory access was encountered
2022-06-16T11:07:35.7117401Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2022-06-16T11:07:35.7117887Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2022-06-16T11:07:35.7118116Z 
2022-06-16T11:07:35.7118414Z ----------------------------------------------------------------------
2022-06-16T11:07:35.7118764Z Ran 125 tests in 27.892s
2022-06-16T11:07:35.7118941Z 
2022-06-16T11:07:35.7119088Z FAILED (errors=1, skipped=8, expected failures=3)
2022-06-16T11:07:35.7119300Z 
2022-06-16T11:07:35.7119434Z Generating XML reports...
2022-06-16T11:07:35.7281939Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCommonCUDA-20220616110707.xml

See GitHub Actions build pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 3, 4, linux.4xlarge.nvidia.gpu) (3/4)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-06-16T11:05:17.1342007Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1184, in set_rng_seed
2022-06-16T11:05:17.1342400Z     torch.manual_seed(seed)
2022-06-16T11:05:17.1342866Z   File "/opt/conda/lib/python3.7/site-packages/torch/random.py", line 40, in manual_seed
2022-06-16T11:05:17.1343242Z     torch.cuda.manual_seed_all(seed)
2022-06-16T11:05:17.1343720Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 113, in manual_seed_all
2022-06-16T11:05:17.1344281Z     _lazy_call(cb, seed_all=True)
2022-06-16T11:05:17.1344752Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 156, in _lazy_call
2022-06-16T11:05:17.1345076Z     callable()
2022-06-16T11:05:17.1345505Z   File "/opt/conda/lib/python3.7/site-packages/torch/cuda/random.py", line 111, in cb
2022-06-16T11:05:17.1345878Z     default_generator.manual_seed(seed)
2022-06-16T11:05:17.1346222Z RuntimeError: CUDA error: an illegal memory access was encountered
2022-06-16T11:05:17.1346700Z CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
2022-06-16T11:05:17.1347154Z For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
2022-06-16T11:05:17.1347370Z 
2022-06-16T11:05:17.1347712Z ----------------------------------------------------------------------
2022-06-16T11:05:17.1348039Z Ran 150 tests in 18.236s
2022-06-16T11:05:17.1348207Z 
2022-06-16T11:05:17.1348346Z FAILED (errors=1, expected failures=3)
2022-06-16T11:05:17.1348534Z 
2022-06-16T11:05:17.1348657Z Generating XML reports...
2022-06-16T11:05:17.1529664Z Generated XML report: test-reports/python-unittest/test_meta/TEST-TestMetaCUDA-20220616110458.xml

See GitHub Actions build pull / pytorch-xla-linux-bionic-py3.7-clang8 / test (xla, 1, 1, linux.2xlarge) (4/4)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-06-16T11:02:55.2914320Z INFO: 0 processes.
2022-06-16T11:02:55.2914808Z Loading: 0 packages loaded
2022-06-16T11:02:55.2920246Z 
2022-06-16T11:02:55.2920942Z FAILED: Build did NOT complete successfully (0 packages loaded)
2022-06-16T11:02:55.2943945Z 
2022-06-16T11:02:55.2948254Z FAILED: Build did NOT complete successfully (0 packages loaded)
2022-06-16T11:02:55.3074593Z Failed to build external libraries: ['/var/lib/jenkins/workspace/xla/build_torch_xla_libs.sh', '-O', '-D_GLIBCXX_USE_CXX11_ABI=1', 'install']
2022-06-16T11:02:55.5195285Z + cleanup
2022-06-16T11:02:55.5195598Z + retcode=1
2022-06-16T11:02:55.5195838Z + set +x
2022-06-16T11:02:55.5236582Z ##[error]Process completed with exit code 1.
2022-06-16T11:02:55.5281685Z Prepare all required actions
2022-06-16T11:02:55.5282012Z Getting action download info
2022-06-16T11:02:55.6998119Z ##[group]Run ./.github/actions/get-workflow-job-id
2022-06-16T11:02:55.6998337Z with:
2022-06-16T11:02:55.6998661Z   github-token: ***
2022-06-16T11:02:55.6998814Z env:
2022-06-16T11:02:55.6998983Z   GIT_DEFAULT_BRANCH: master
2022-06-16T11:02:55.6999165Z ##[endgroup]
2022-06-16T11:02:55.7025441Z ##[group]Run nick-fields/retry@71062288b76e2b6214ebde0e673ce0de1755740a
2022-06-16T11:02:55.7025677Z with:

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.
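
The tracebacks above all surface inside the RNG seeding path (`torch.manual_seed` → `torch.cuda.manual_seed_all` → `default_generator.manual_seed`), but, as the logs themselves note, CUDA errors are reported asynchronously, so the kernel that actually performed the illegal access most likely ran earlier in the test process. Below is a minimal sketch of the log's own debugging hint (`CUDA_LAUNCH_BLOCKING=1`); the seed value and follow-up ops are illustrative placeholders, not taken from this PR:

```python
import os

# Force synchronous kernel launches so a CUDA error is raised at the call that
# actually faulted, rather than at a later, unrelated call such as manual_seed.
# Must be set before CUDA is initialized (sketch only, not part of this PR).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

if torch.cuda.is_available():
    torch.manual_seed(1234)            # the call where the error surfaced in CI
    x = torch.randn(4, device="cuda")  # any subsequent CUDA work
    torch.cuda.synchronize()           # flush pending work and surface errors
```

With blocking launches the failure points at the real culprit, at the cost of slower execution, which is why the log suggests the flag only for debugging.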

dagitses pushed a commit that referenced this pull request May 21, 2022
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 156982636

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)
dagitses pushed a commit that referenced this pull request May 22, 2022
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 156982636

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)
dagitses pushed a commit that referenced this pull request Jun 15, 2022
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 157009727

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)
dagitses pushed a commit that referenced this pull request Jun 16, 2022
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 157009727

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)
dagitses pushed a commit that referenced this pull request Jun 16, 2022
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 159124122


@override-unit-failures
(Note: this ignores all push blocking failures!)

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)
@facebook-github-bot (Contributor) commented:

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot (Collaborator) commented:

@pytorchbot successfully started a merge job. Check the current status here

@github-actions (Contributor) commented:

Hey @dagitses.
You've committed this PR, but it does not have both a 'release notes: ...' and a 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc.), and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc.). The lists of valid labels can be found here for 'release notes: ...' and here for 'topics: ...'.
For changes labeled 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 16, 2022
Summary:
Pull Request resolved: #78033

This was never intended to be supported.
ghstack-source-id: 159124122

(Note: this ignores all push blocking failures!)

Test Plan: Rely on CI.

Reviewed By: kit1980

Differential Revision: D36567054

fbshipit-source-id: 9141d189fb532c2732dc54cfcfa3817eb006b37b
@facebook-github-bot facebook-github-bot deleted the gh/dagitses/349/head branch June 20, 2022 14:17
justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Pull Request resolved: pytorch#78033

This was never intended to be supported.

@override-unit-failures
(Note: this ignores all push blocking failures!)

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)

Approved by: https://github.com/kit1980
jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request Oct 29, 2022
Pull Request resolved: pytorch/pytorch#78033

This was never intended to be supported.

@override-unit-failures
(Note: this ignores all push blocking failures!)

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)

Approved by: https://github.com/kit1980
jjsjann123 pushed a commit to jjsjann123/nvfuser that referenced this pull request Nov 10, 2022
Pull Request resolved: pytorch/pytorch#78033

This was never intended to be supported.

@override-unit-failures
(Note: this ignores all push blocking failures!)

Differential Revision: [D36567054](https://our.internmc.facebook.com/intern/diff/D36567054/)

Approved by: https://github.com/kit1980

Labels

cla signed · Merged · oncall: distributed · oncall: jit
