
Conversation

@ezyang
Contributor

ezyang commented Mar 5, 2022

Stack from ghstack:

This code has traveled a long and winding road (doo doo). First,
some back story: originally, all of this code was written in terms of
Backend, ScalarType and Device, and things were, well, reasonable.
But then Backend became DispatchKey, and then it became TensorOptions,
but someone (ahem, me) was too lazy to migrate the separate ScalarType
and Device into the TensorOptions, because it would involve editing
a lot of code. Well, I have FINALLY made good on that promise
by absorbing ScalarType and Device into the TensorOptions. We rely
heavily on the optional fields in TensorOptions; the idea is that if
a field is not set, we are in a context where we aren't expecting any
particular scalar type / device; otherwise we require it to be set
explicitly.

Signed-off-by: Edward Z. Yang [email protected]
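To make the optional-fields idea above concrete, here is a minimal C++ sketch (not code from this PR) of the pattern: c10::TensorOptions leaves dtype and device unset by default, and a context that does expect a particular scalar type / device checks for them explicitly. The helper `require_dtype_and_device` is a made-up name, for illustration only.

```cpp
// Minimal sketch, assuming libtorch/c10 headers are available;
// `require_dtype_and_device` is hypothetical, not part of this PR or of ATen.
#include <iostream>
#include <c10/core/TensorOptions.h>
#include <c10/util/Exception.h>

void require_dtype_and_device(const c10::TensorOptions& options) {
  // Contexts with no particular expectation simply tolerate unset fields
  // (options.dtype_opt() / options.device_opt() return nullopt); contexts
  // that do expect a scalar type / device insist both are present.
  TORCH_CHECK(options.has_dtype(), "expected a scalar type to be set");
  TORCH_CHECK(options.has_device(), "expected a device to be set");
}

int main() {
  auto opts = c10::TensorOptions().dtype(c10::kFloat).device(c10::kCPU);
  require_dtype_and_device(opts);          // ok: both fields are set
  std::cout << opts.dtype() << " on " << opts.device() << "\n";

  c10::TensorOptions unset;                // neither field specified
  std::cout << std::boolalpha << unset.has_dtype() << " "
            << unset.has_device() << "\n"; // false false
  // require_dtype_and_device(unset);      // would throw via TORCH_CHECK
}
```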

@pytorch-bot

pytorch-bot bot commented Mar 5, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/b1cfcc558374c13f941a12903cf6d0acd9a628cb/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot
Contributor

facebook-github-bot commented Mar 5, 2022

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 8c8ecc2 (more details on the Dr. CI page):


  • 8/8 failures introduced in this PR

🕵️ 8 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 1, 2, linux.4xlarge.nvidia.gpu) (1/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T06:20:40.6770665Z FAIL [0.002s]: tes...e_inference_cuda (__main__.TestTensorCreationCUDA)
2022-03-06T06:20:40.6597354Z   test_zeros_dtype_layout_device_match_cuda_complex64 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6611090Z   test_zeros_dtype_layout_device_match_cuda_float16 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6624831Z   test_zeros_dtype_layout_device_match_cuda_float32 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6638414Z   test_zeros_dtype_layout_device_match_cuda_int16 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6652016Z   test_zeros_dtype_layout_device_match_cuda_int64 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6666089Z   test_zeros_dtype_layout_device_match_cuda_uint8 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:20:40.6696037Z   test_zeros_dtype_out_match_cuda (__main__.TestTensorCreationCUDA) ... ok (0.003s)
2022-03-06T06:20:40.6766191Z   test_zeros_out_cuda (__main__.TestTensorCreationCUDA) ... ok (0.007s)
2022-03-06T06:20:40.6766792Z 
2022-03-06T06:20:40.6766981Z ======================================================================
2022-03-06T06:20:40.6770665Z FAIL [0.002s]: test_tensor_factory_gpu_type_inference_cuda (__main__.TestTensorCreationCUDA)
2022-03-06T06:20:40.6771583Z ----------------------------------------------------------------------
2022-03-06T06:20:40.6772317Z Traceback (most recent call last):
2022-03-06T06:20:40.6772904Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1754, in wrapper
2022-03-06T06:20:40.6773556Z     method(*args, **kwargs)
2022-03-06T06:20:40.6774774Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-03-06T06:20:40.6775174Z     result = test(self, **param_kwargs)
2022-03-06T06:20:40.6775995Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 918, in only_fn
2022-03-06T06:20:40.6776529Z     return fn(slf, *args, **kwargs)
2022-03-06T06:20:40.6776907Z   File "test_tensor_creation_ops.py", line 2896, in test_tensor_factory_gpu_type_inference
2022-03-06T06:20:40.6777323Z     self.assertEqual(torch.device(device), torch.tensor(0.).device)

See GitHub Actions build win-vs2019-cuda11.3-py3 / test (default, 1, 2, windows.8xlarge.nvidia.gpu) (2/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T08:40:40.0685456Z RuntimeError: test_linalg failed!
2022-03-06T08:40:39.3668291Z FAILED (errors=4, skipped=109, expected failures=12)
2022-03-06T08:40:39.3668687Z 
2022-03-06T08:40:39.3669071Z Generating XML reports...
2022-03-06T08:40:39.3670091Z Generated XML report: test-reports\python-unittest\test_linalg\TEST-TestLinalgCPU-20220306083548.xml
2022-03-06T08:40:39.3671614Z Generated XML report: test-reports\python-unittest\test_linalg\TEST-TestLinalgCUDA-20220306083548.xml
2022-03-06T08:40:40.0682060Z Traceback (most recent call last):
2022-03-06T08:40:40.0683215Z   File "run_test.py", line 1047, in <module>
2022-03-06T08:40:40.0683684Z     main()
2022-03-06T08:40:40.0684280Z   File "run_test.py", line 1025, in main
2022-03-06T08:40:40.0684855Z     raise RuntimeError(err_message)
2022-03-06T08:40:40.0685456Z RuntimeError: test_linalg failed!
2022-03-06T08:40:40.4424729Z 
2022-03-06T08:40:40.4425791Z (base) C:\actions-runner\_work\pytorch\pytorch\test>if ERRORLEVEL 1 goto fail 
2022-03-06T08:40:40.4427839Z 
2022-03-06T08:40:40.4428725Z (base) C:\actions-runner\_work\pytorch\pytorch\test>exit /b 1 
2022-03-06T08:40:40.4463128Z + cleanup
2022-03-06T08:40:40.4463651Z + retcode=1
2022-03-06T08:40:40.4463986Z + set +x
2022-03-06T08:40:40.4501420Z ##[error]Process completed with exit code 1.
2022-03-06T08:40:40.4686967Z ##[group]Run # -ir => recursive include all files in pattern
2022-03-06T08:40:40.4687796Z �[36;1m# -ir => recursive include all files in pattern�[0m

See GitHub Actions build linux-bionic-rocm4.5-py3.7 / test (default, 2, 2, linux.rocm.gpu) (3/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T06:20:07.5870528Z RuntimeError: test_autograd failed!
2022-03-06T06:20:04.8821868Z Generated XML report: test-reports/python-unittest/test_autograd/TEST-TestAutogradForwardMode-20220306061937.xml
2022-03-06T06:20:04.8823686Z Generated XML report: test-reports/python-unittest/test_autograd/TEST-TestAutogradForwardModeBatchedGrad-20220306061937.xml
2022-03-06T06:20:04.8880676Z Generated XML report: test-reports/python-unittest/test_autograd/TEST-TestAutogradFunctional-20220306061937.xml
2022-03-06T06:20:04.8896211Z Generated XML report: test-reports/python-unittest/test_autograd/TEST-TestAutogradInferenceMode-20220306061937.xml
2022-03-06T06:20:04.8899664Z Generated XML report: test-reports/python-unittest/test_autograd/TEST-TestMultithreadAutograd-20220306061937.xml
2022-03-06T06:20:07.5857268Z Traceback (most recent call last):
2022-03-06T06:20:07.5858074Z   File "test/run_test.py", line 1047, in <module>
2022-03-06T06:20:07.5868115Z     main()
2022-03-06T06:20:07.5869084Z   File "test/run_test.py", line 1025, in main
2022-03-06T06:20:07.5869824Z     raise RuntimeError(err_message)
2022-03-06T06:20:07.5870528Z RuntimeError: test_autograd failed!
2022-03-06T06:20:08.6391238Z 
2022-03-06T06:20:08.6392252Z real	2m12.995s
2022-03-06T06:20:08.6393142Z user	10m16.193s
2022-03-06T06:20:08.6393684Z sys	1m4.003s
2022-03-06T06:20:08.6394176Z + cleanup
2022-03-06T06:20:08.6394661Z + retcode=1
2022-03-06T06:20:08.6395145Z + set +x
2022-03-06T06:20:08.6515032Z ##[error]Process completed with exit code 1.
2022-03-06T06:20:08.6588484Z ##[group]Run python3 -m pip install junitparser==2.1.1 rich==10.9.0
2022-03-06T06:20:08.6589045Z �[36;1mpython3 -m pip install junitparser==2.1.1 rich==10.9.0�[0m

See GitHub Actions build win-vs2019-cuda11.3-py3 / test (default, 2, 2, windows.8xlarge.nvidia.gpu) (4/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T08:55:52.7603365Z RuntimeError: test_sparse failed!
2022-03-06T08:55:52.1458650Z Generated XML report: test-reports\dist-gloo\test_sparse\TEST-TestSparseCPU-20220306085426.xml
2022-03-06T08:55:52.1460310Z Generated XML report: test-reports\dist-gloo\test_sparse\TEST-TestSparseCUDA-20220306085426.xml
2022-03-06T08:55:52.1461891Z Generated XML report: test-reports\dist-gloo\test_sparse\TEST-TestSparseOneOff-20220306085426.xml
2022-03-06T08:55:52.1463696Z Generated XML report: test-reports\dist-gloo\test_sparse\TEST-TestSparseUnaryUfuncsCPU-20220306085426.xml
2022-03-06T08:55:52.1465677Z Generated XML report: test-reports\dist-gloo\test_sparse\TEST-TestSparseUnaryUfuncsCUDA-20220306085426.xml
2022-03-06T08:55:52.7599869Z Traceback (most recent call last):
2022-03-06T08:55:52.7601018Z   File "run_test.py", line 1047, in <module>
2022-03-06T08:55:52.7601543Z     main()
2022-03-06T08:55:52.7602173Z   File "run_test.py", line 1025, in main
2022-03-06T08:55:52.7602766Z     raise RuntimeError(err_message)
2022-03-06T08:55:52.7603365Z RuntimeError: test_sparse failed!
2022-03-06T08:55:53.1124629Z 
2022-03-06T08:55:53.1125666Z (base) C:\actions-runner\_work\pytorch\pytorch\test>popd
2022-03-06T08:55:53.1131756Z 
2022-03-06T08:55:53.1132578Z (base) C:\actions-runner\_work\pytorch\pytorch>if ERRORLEVEL 1 exit /b 1 
2022-03-06T08:55:53.1162701Z + cleanup
2022-03-06T08:55:53.1163385Z + retcode=1
2022-03-06T08:55:53.1163971Z + set +x
2022-03-06T08:55:53.1200294Z ##[error]Process completed with exit code 1.
2022-03-06T08:55:53.1508473Z ##[group]Run # -ir => recursive include all files in pattern
2022-03-06T08:55:53.1509327Z �[36;1m# -ir => recursive include all files in pattern�[0m

See GitHub Actions build linux-bionic-py3.7-clang9 / test (noarch, 1, 1, linux.2xlarge) (5/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T07:01:38.4824490Z RuntimeError: test_sparse_csr failed!
2022-03-06T07:01:38.0595676Z 
2022-03-06T07:01:38.0595759Z Generating XML reports...
2022-03-06T07:01:38.1716107Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCSRCPU-20220306070128.xml
2022-03-06T07:01:38.2713236Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCSRMETA-20220306070128.xml
2022-03-06T07:01:38.2716492Z Generated XML report: test-reports/python-unittest/test_sparse_csr/TEST-TestSparseCSRSampler-20220306070128.xml
2022-03-06T07:01:38.4818847Z Traceback (most recent call last):
2022-03-06T07:01:38.4819306Z   File "test/run_test.py", line 1047, in <module>
2022-03-06T07:01:38.4821985Z     main()
2022-03-06T07:01:38.4822359Z   File "test/run_test.py", line 1025, in main
2022-03-06T07:01:38.4824114Z     raise RuntimeError(err_message)
2022-03-06T07:01:38.4824490Z RuntimeError: test_sparse_csr failed!
2022-03-06T07:01:38.6718359Z 
2022-03-06T07:01:38.6718678Z real	57m41.136s
2022-03-06T07:01:38.6719000Z user	143m46.899s
2022-03-06T07:01:38.6719322Z sys	12m28.496s
2022-03-06T07:01:38.6719515Z + cleanup
2022-03-06T07:01:38.6719674Z + retcode=1
2022-03-06T07:01:38.6719827Z + set +x
2022-03-06T07:01:38.6761774Z ##[error]Process completed with exit code 1.
2022-03-06T07:01:38.6822972Z ##[group]Run # Ensure the working directory gets chowned back to the current user
2022-03-06T07:01:38.6823390Z �[36;1m# Ensure the working directory gets chowned back to the current user�[0m

See GitHub Actions build linux-xenial-py3.7-clang7-asan / test (default, 1, 3, linux.2xlarge) (6/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T06:05:39.9777406Z SUMMARY: Undefined.../jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in
2022-03-06T06:05:39.9271621Z     #10 0x55cad4a85801 in run_mod /tmp/build/80754af9/python_1627392990942/work/Python/pythonrun.c:1037
2022-03-06T06:05:39.9272325Z     #11 0x55cad4a907a9 in PyRun_StringFlags /tmp/build/80754af9/python_1627392990942/work/Python/pythonrun.c:961
2022-03-06T06:05:39.9273155Z     #12 0x55cad4a9080b in PyRun_SimpleStringFlags /tmp/build/80754af9/python_1627392990942/work/Python/pythonrun.c:455
2022-03-06T06:05:39.9275778Z     #13 0x55cad4a90908 in pymain_run_command /tmp/build/80754af9/python_1627392990942/work/Modules/main.c:420
2022-03-06T06:05:39.9276368Z     #14 0x55cad4a90908 in pymain_run_python /tmp/build/80754af9/python_1627392990942/work/Modules/main.c:2907
2022-03-06T06:05:39.9276747Z     #15 0x55cad4a90908 in pymain_main /tmp/build/80754af9/python_1627392990942/work/Modules/main.c:3460
2022-03-06T06:05:39.9277291Z     #16 0x55cad4a90ccb in _Py_UnixMain /tmp/build/80754af9/python_1627392990942/work/Modules/main.c:3495
2022-03-06T06:05:39.9776401Z     #17 0x7fe91bca783f in __libc_start_main /build/glibc-S7Ft5T/glibc-2.23/csu/../csu/libc-start.c:291
2022-03-06T06:05:39.9776943Z     #18 0x55cad4a35554 in _start (/opt/conda/bin/python3.7+0x1d7554)
2022-03-06T06:05:39.9777100Z 
2022-03-06T06:05:39.9777406Z SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in 
2022-03-06T06:05:40.0069193Z + retcode=1
2022-03-06T06:05:40.0069627Z + set -e
2022-03-06T06:05:40.0069834Z + return 1
2022-03-06T06:05:40.0070308Z + [[ linux-xenial-py3.7-clang7-asan-default == *-NO_AVX-* ]]
2022-03-06T06:05:40.0070719Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X ]]
2022-03-06T06:05:40.0071191Z + [[ linux-xenial-py3.7-clang7-asan-default == *-NO_AVX2-* ]]
2022-03-06T06:05:40.0071574Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]]
2022-03-06T06:05:40.0072037Z + [[ linux-xenial-py3.7-clang7-asan-default == *-NO_AVX512-* ]]
2022-03-06T06:05:40.0072441Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\5\1\2 ]]
2022-03-06T06:05:40.0072912Z + [[ linux-xenial-py3.7-clang7-asan-default == *tbb* ]]

See GitHub Actions build linux-bionic-rocm4.5-py3.7 / test (default, 1, 2, linux.rocm.gpu) (7/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T06:19:53.3877958Z FAIL [0.002s]: tes...e_inference_cuda (__main__.TestTensorCreationCUDA)
2022-03-06T06:19:53.3727537Z   test_zeros_dtype_layout_device_match_cuda_complex64 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3738749Z   test_zeros_dtype_layout_device_match_cuda_float16 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3750544Z   test_zeros_dtype_layout_device_match_cuda_float32 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3761939Z   test_zeros_dtype_layout_device_match_cuda_int16 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3773552Z   test_zeros_dtype_layout_device_match_cuda_int64 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3785053Z   test_zeros_dtype_layout_device_match_cuda_uint8 (__main__.TestTensorCreationCUDA) ... ok (0.001s)
2022-03-06T06:19:53.3812110Z   test_zeros_dtype_out_match_cuda (__main__.TestTensorCreationCUDA) ... ok (0.002s)
2022-03-06T06:19:53.3875841Z   test_zeros_out_cuda (__main__.TestTensorCreationCUDA) ... ok (0.006s)
2022-03-06T06:19:53.3876674Z 
2022-03-06T06:19:53.3876998Z ======================================================================
2022-03-06T06:19:53.3877958Z FAIL [0.002s]: test_tensor_factory_gpu_type_inference_cuda (__main__.TestTensorCreationCUDA)
2022-03-06T06:19:53.3879612Z ----------------------------------------------------------------------
2022-03-06T06:19:53.3880545Z Traceback (most recent call last):
2022-03-06T06:19:53.3881895Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1754, in wrapper
2022-03-06T06:19:53.3882884Z     method(*args, **kwargs)
2022-03-06T06:19:53.3884264Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-03-06T06:19:53.3885526Z     result = test(self, **param_kwargs)
2022-03-06T06:19:53.3887016Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 918, in only_fn
2022-03-06T06:19:53.3888023Z     return fn(slf, *args, **kwargs)
2022-03-06T06:19:53.3888985Z   File "test_tensor_creation_ops.py", line 2896, in test_tensor_factory_gpu_type_inference
2022-03-06T06:19:53.3890083Z     self.assertEqual(torch.device(device), torch.tensor(0.).device)

See GitHub Actions build linux-xenial-cuda11.3-py3.7-gcc7 / test (default, 2, 2, linux.4xlarge.nvidia.gpu) (8/8)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-06T07:28:49.0535627Z RuntimeError: test_linalg failed!
2022-03-06T07:28:48.6207668Z 
2022-03-06T07:28:48.6207813Z FAILED (errors=4, skipped=44, expected failures=12)
2022-03-06T07:28:48.6208029Z 
2022-03-06T07:28:48.6208160Z Generating XML reports...
2022-03-06T07:28:48.7002360Z Generated XML report: test-reports/python-unittest/test_linalg/TEST-TestLinalgCUDA-20220306072305.xml
2022-03-06T07:28:49.0529304Z Traceback (most recent call last):
2022-03-06T07:28:49.0529728Z   File "test/run_test.py", line 1047, in <module>
2022-03-06T07:28:49.0532236Z     main()
2022-03-06T07:28:49.0532512Z   File "test/run_test.py", line 1025, in main
2022-03-06T07:28:49.0534987Z     raise RuntimeError(err_message)
2022-03-06T07:28:49.0535627Z RuntimeError: test_linalg failed!
2022-03-06T07:28:49.7465093Z + cleanup
2022-03-06T07:28:49.7465511Z + retcode=1
2022-03-06T07:28:49.7465763Z + set +x
2022-03-06T07:28:49.7532892Z ##[error]Process completed with exit code 1.
2022-03-06T07:28:49.7579192Z ##[group]Run # Ensure the working directory gets chowned back to the current user
2022-03-06T07:28:49.7579685Z �[36;1m# Ensure the working directory gets chowned back to the current user�[0m
2022-03-06T07:28:49.7580111Z �[36;1mdocker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .�[0m
2022-03-06T07:28:49.7594863Z shell: /usr/bin/bash -e {0}
2022-03-06T07:28:49.7595102Z env:
2022-03-06T07:28:49.7595421Z   BUILD_ENVIRONMENT: linux-xenial-cuda11.3-py3.7-gcc7

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


ezyang requested review from bdhirsh, gchanan and ngimel March 5, 2022 16:35
ezyang added a commit that referenced this pull request Mar 5, 2022
ghstack-source-id: 61b565b
Pull Request resolved: #73824
ezyang added a commit that referenced this pull request Mar 6, 2022
ghstack-source-id: 1d148e2
Pull Request resolved: #73824
@ezyang
Contributor Author

ezyang commented Mar 7, 2022

This PR has gotten out of control; I may need to shelve it.

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

github-actions bot added the Stale label May 22, 2022
github-actions bot closed this Jun 21, 2022
facebook-github-bot deleted the gh/ezyang/1093/head branch July 21, 2022 14:21