
[ONNX] Update onnxruntime to 1.9 for CI #65029

Merged
garymm merged 1 commit into pytorch:onnx_ms_1 from garymm:ort-1.9rc
Oct 5, 2021

Conversation

@garymm
Collaborator

@garymm garymm commented Sep 15, 2021

No description provided.

@facebook-github-bot
Contributor

facebook-github-bot commented Sep 15, 2021

💊 CI failures summary and remediations

As of commit f49c862 (more details on the Dr. CI page):


  • 8/8 failures introduced in this PR

🕵️ 6 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge) (1/6)

Step: "Run test scripts"

2021-09-28T22:13:29.5104854Z ERROR [0.000s]: test_poisson_sample (__main__.TestDistributions)
2021-09-28T22:13:29.5099135Z   File "distributions/test_distributions.py", line 812, in _check_sampler_discrete
2021-09-28T22:13:29.5099794Z     chisq, p = scipy.stats.chisquare(counts[msk], pmf[msk] * num_samples)
2021-09-28T22:13:29.5100563Z   File "c:\jenkins\miniconda3\lib\site-packages\scipy\stats\stats.py", line 6852, in chisquare
2021-09-28T22:13:29.5101186Z     return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis,
2021-09-28T22:13:29.5101980Z   File "c:\jenkins\miniconda3\lib\site-packages\scipy\stats\stats.py", line 6694, in power_divergence
2021-09-28T22:13:29.5102538Z     raise ValueError(msg)
2021-09-28T22:13:29.5103307Z ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
2021-09-28T22:13:29.5104008Z 0.008265582255680495
2021-09-28T22:13:29.5104181Z 
2021-09-28T22:13:29.5104429Z ======================================================================
2021-09-28T22:13:29.5104854Z ERROR [0.000s]: test_poisson_sample (__main__.TestDistributions)
2021-09-28T22:13:29.5105366Z ----------------------------------------------------------------------
2021-09-28T22:13:29.5105919Z Traceback (most recent call last):
2021-09-28T22:13:29.5106592Z   File "distributions/test_distributions.py", line 1352, in test_poisson_sample
2021-09-28T22:13:29.5107139Z     self._check_sampler_discrete(Poisson(rate),
2021-09-28T22:13:29.5107767Z   File "distributions/test_distributions.py", line 812, in _check_sampler_discrete
2021-09-28T22:13:29.5108404Z     chisq, p = scipy.stats.chisquare(counts[msk], pmf[msk] * num_samples)
2021-09-28T22:13:29.5109134Z   File "c:\jenkins\miniconda3\lib\site-packages\scipy\stats\stats.py", line 6852, in chisquare
2021-09-28T22:13:29.5109767Z     return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis,
2021-09-28T22:13:29.5110474Z   File "c:\jenkins\miniconda3\lib\site-packages\scipy\stats\stats.py", line 6694, in power_divergence
2021-09-28T22:13:29.5111040Z     raise ValueError(msg)
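
The `ValueError` above is raised when the totals of the observed and expected counts disagree by more than a relative tolerance of 1e-08. A minimal pure-Python sketch of that kind of check, together with one possible workaround (rescaling the expected counts so the totals agree), might look like the following. The helper names, the sample counts, and the exact relative-difference formula are assumptions made for illustration; this is neither SciPy's implementation nor the fix used in PyTorch's test suite.

```python
def relative_sum_difference(observed, expected):
    """Relative gap between the two totals (assumed formula, for illustration only)."""
    obs_total = sum(observed)
    exp_total = sum(expected)
    return abs(obs_total - exp_total) / min(obs_total, exp_total)


def rescale_expected(observed, expected):
    """Scale the expected counts so their total matches the observed total."""
    factor = sum(observed) / sum(expected)
    return [e * factor for e in expected]


# Hypothetical counts: the expected total (99.0) is about 1% below the observed total (100).
observed = [18, 32, 30, 20]
expected = [17.5, 31.5, 30.5, 19.5]

RTOL = 1e-8  # tolerance quoted in the error message above

gap_before = relative_sum_difference(observed, expected)
gap_after = relative_sum_difference(observed, rescale_expected(observed, expected))

# The first flag reports a violation before rescaling, the second after rescaling.
print(gap_before > RTOL, gap_after > RTOL)
```

Whether rescaling is statistically appropriate depends on the test; here it only shows why totals that differ by a fraction of a percent are enough to trip the check.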

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (2/6)

Step: "Run tests"

Sep 28 20:30:48 [ FAILED ] TensorTest.TestBatchNorm1D
Sep 28 20:30:48 [ RUN      ] XlaUtilCacheTest.BasicTest
Sep 28 20:30:48 [       OK ] XlaUtilCacheTest.BasicTest (1 ms)
Sep 28 20:30:48 [----------] 1 test from XlaUtilCacheTest (1 ms total)
Sep 28 20:30:48 
Sep 28 20:30:48 [----------] Global test environment tear-down
Sep 28 20:30:48 [==========] 613 tests from 8 test suites ran. (460577 ms total)
Sep 28 20:30:48 [  PASSED  ] 611 tests.
Sep 28 20:30:48 [  SKIPPED ] 1 test, listed below:
Sep 28 20:30:48 [  SKIPPED ] AtenXlaTensorTest.TestGroupNormBackward
Sep 28 20:30:48 [  FAILED  ] 1 test, listed below:
Sep 28 20:30:48 [  FAILED  ] TensorTest.TestBatchNorm1D
Sep 28 20:30:48 
Sep 28 20:30:48  1 FAILED TEST
Sep 28 20:30:48 + cleanup
Sep 28 20:30:48 + retcode=1
Sep 28 20:30:48 + set +x
Sep 28 20:30:48 =================== sccache compilation log ===================
Sep 28 20:30:48 =========== If your build fails, please take a look at the log above for possible reasons ===========
Sep 28 20:30:48 Compile requests                      0
Sep 28 20:30:48 Compile requests executed             0
Sep 28 20:30:48 Cache hits                            0

See CircleCI build pytorch_linux_bionic_cuda10_2_cudnn7_py3_9_gcc7_build (3/6)

Step: "Build"

Sep 28 18:21:20 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Sep 28 18:21:20 ++++ extract_trap_cmd
Sep 28 18:21:20 ++++ printf '%s\n' ''
Sep 28 18:21:20 +++ printf '%s\n' cleanup
Sep 28 18:21:20 ++ trap -- '
Sep 28 18:21:20 cleanup' EXIT
Sep 28 18:21:20 ++ [[ pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7-build != *win-* ]]
Sep 28 18:21:20 ++ which sccache
Sep 28 18:21:20 ++ sccache --stop-server
Sep 28 18:21:20 ++ true
Sep 28 18:21:20 ++ rm /var/lib/jenkins/sccache_error.log
Sep 28 18:21:20 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Sep 28 18:21:20 ++ true
Sep 28 18:21:20 ++ [[ -n '' ]]
Sep 28 18:21:20 ++ [[ pytorch-linux-bionic-cuda10.2-cudnn7-py3.9-gcc7-build == *rocm* ]]
Sep 28 18:21:20 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Sep 28 18:21:20 ++ SCCACHE_IDLE_TIMEOUT=1200
Sep 28 18:21:20 ++ RUST_LOG=sccache::server=error
Sep 28 18:21:20 ++ sccache --start-server
Sep 28 18:21:20 sccache: Starting the server...
Sep 28 18:21:20 ++ sccache --zero-stats
Sep 28 18:21:20 Compile requests                      0

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (4/6)

Step: "Build"

Sep 28 18:30:56 *** WARNING: renaming "_hashlib...open shared object file: No such file or directory
Sep 28 18:30:56 gcc -pthread -shared build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_ssl.o -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib -lssl -lcrypto -o build/lib.linux-x86_64-3.8/_ssl.cpython-38-x86_64-linux-gnu.so
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/2x4c8-minmax-gemmlowp-xop-ld64.c.o
Sep 28 18:30:56 building '_hashlib' extension
Sep 28 18:30:56 gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -fPIC -fPIC -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I./Include -I/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/include -I. -I/usr/include/x86_64-linux-gnu -I/usr/local/include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Include -I/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython -c /var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.c -o build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/2x4c8-minmax-gemmlowp-xop-ld128.c.o
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/2x4c8-xw-minmax-gemmlowp-xop.c.o
Sep 28 18:30:56 gcc -pthread -shared build/temp.linux-x86_64-3.8/var/lib/jenkins/workspace/build/torch/csrc/deploy/interpreter/cpython/src/cpython/Modules/_hashopenssl.o -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/var/lib/jenkins/workspace/torch/csrc/deploy/interpreter/cpython/lib -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib -lssl -lcrypto -o build/lib.linux-x86_64-3.8/_hashlib.cpython-38-x86_64-linux-gnu.so
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/3x4c2-minmax-fp32-xop-ld64.c.o
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/3x4c2-minmax-fp32-xop-ld128.c.o
Sep 28 18:30:56 *** WARNING: renaming "_ssl" since importing it failed: libssl.so.1.1: cannot open shared object file: No such file or directory
Sep 28 18:30:56 *** WARNING: renaming "_hashlib" since importing it failed: libcrypto.so.1.1: cannot open shared object file: No such file or directory
Sep 28 18:30:56 [ 51%] Building C object confu-deps/XNNPACK/CMakeFiles/XNNPACK.dir/src/qs8-gemm/gen/3x4c2-minmax-gemmlowp-xop-ld64.c.o
Sep 28 18:30:56 
Sep 28 18:30:56 The following modules found by detect_modules() in setup.py, have been
Sep 28 18:30:56 built by the Makefile instead, as configured by the Setup files:
Sep 28 18:30:56 _abc                  atexit                pwd                
Sep 28 18:30:56 time                                                           
Sep 28 18:30:56 
Sep 28 18:30:56 
Sep 28 18:30:56 Following modules built successfully but were removed because they could not be imported:
Sep 28 18:30:56 _hashlib              _ssl                                     

See CircleCI build pytorch_linux_backward_compatibility_check_test (5/6)

Step: "Run tests"

Sep 28 18:51:00 The PR is introducing backward ...m to confirm whether this change is wanted or not.
Sep 28 18:51:00 processing existing schema:  alltoall_base(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor _1, Tensor _2, int[] _3, int[] _4) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  alltoall(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, Tensor[] _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  send(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  recv(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  recv_anysource(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  barrier(__torch__.torch.classes.dist_c10d.ProcessGroup _0) -> (__torch__.torch.classes.dist_c10d.Work _0)
Sep 28 18:51:00 processing existing schema:  __init__(__torch__.torch.classes.dist_c10d.frontend _0) -> (NoneType _0)
Sep 28 18:51:00 processing existing schema:  new_process_group_helper(__torch__.torch.classes.dist_c10d.frontend _0, int _1, int _2, int[] _3, str _4, __torch__.torch.classes.dist_c10d.Store _5, str? _6, int _7) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
Sep 28 18:51:00 processing existing schema:  get_process_group_by_name(__torch__.torch.classes.dist_c10d.frontend _0, str _1) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
Sep 28 18:51:00 processing existing schema:  get_name_of_process_group(__torch__.torch.classes.dist_c10d.frontend _0, __torch__.torch.classes.dist_c10d.ProcessGroup _1) -> (str _0)
Sep 28 18:51:00 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
Sep 28 18:51:00 
Sep 28 18:51:00 Broken ops: [
Sep 28 18:51:00 	aten::meshgrid.indexing(Tensor[] tensors, *, str indexing) -> (Tensor[])
Sep 28 18:51:00 	aten::special_gammaincc(Tensor self, Tensor other) -> (Tensor)
Sep 28 18:51:00 	aten::special_gammaincc.out(Tensor self, Tensor other, *, Tensor(a!) out) -> (Tensor(a!))
Sep 28 18:51:00 	aten::special_gammainc(Tensor self, Tensor other) -> (Tensor)
Sep 28 18:51:00 	aten::special_gammainc.out(Tensor self, Tensor other, *, Tensor(a!) out) -> (Tensor(a!))
Sep 28 18:51:00 	aten::nanmean(Tensor self, int[1] dim=[], bool keepdim=False, *, int? dtype=None) -> (Tensor)
Sep 28 18:51:00 	aten::nanmean.out(Tensor self, int[1] dim=[], bool keepdim=False, *, int? dtype=None, Tensor(a!) out) -> (Tensor(a!))
Sep 28 18:51:00 	aten::cross_entropy_loss(Tensor self, Tensor target, Tensor? weight=None, int reduction=1, int ignore_index=-100, float label_smoothing=0.) -> (Tensor)
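
The log above walks through existing operator schemas and then lists the ones it reports as broken. As a deliberately simplified sketch of that idea, the snippet below treats a schema as broken whenever its exact string is missing from the new set. `find_broken_ops` is a hypothetical helper, and exact string matching is an assumption; the actual CI check may apply more permissive compatibility rules.

```python
def find_broken_ops(existing_schemas, new_schemas):
    """Return existing schemas with no exact match among the new ones.

    Simplifying assumption: compatibility means string equality.
    """
    new_set = set(new_schemas)
    return [schema for schema in existing_schemas if schema not in new_set]


# Schema strings copied from the log above; the split into "existing" and
# "new" is hypothetical and only serves the illustration.
existing = [
    "aten::nanmean(Tensor self, int[1] dim=[], bool keepdim=False, *, int? dtype=None) -> (Tensor)",
    "aten::special_gammainc(Tensor self, Tensor other) -> (Tensor)",
]
new = [
    "aten::special_gammainc(Tensor self, Tensor other) -> (Tensor)",
]

broken = find_broken_ops(existing, new)
for schema in broken:
    print(schema)
```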

See CircleCI build pytorch_linux_xenial_py3_clang7_asan_test2 (6/6)

Step: "Run tests"

Sep 28 21:49:40 SUMMARY: UndefinedBehaviorSanit.../jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in
Sep 28 21:49:40     #9 0x5580dceb08f2 in PyEval_EvalCode /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval.c:731
Sep 28 21:49:40     #10 0x5580dcf18cd5 in run_mod /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:1025
Sep 28 21:49:40     #11 0x5580dcf1ad5d in PyRun_StringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:949
Sep 28 21:49:40     #12 0x5580dcf1adbb in PyRun_SimpleStringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:445
Sep 28 21:49:40     #13 0x5580dcf1b926 in run_command /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:301
Sep 28 21:49:40     #14 0x5580dcf1b926 in Py_Main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:749
Sep 28 21:49:40     #15 0x5580dce55196 in main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Programs/python.c:69
Sep 28 21:49:40     #16 0x7f6118f4283f in __libc_start_main /build/glibc-S7Ft5T/glibc-2.23/csu/../csu/libc-start.c:291
Sep 28 21:49:40     #17 0x5580dcee533d in _start (/opt/conda/bin/python3.6+0x1a733d)
Sep 28 21:49:40 
Sep 28 21:49:40 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in 
Sep 28 21:49:40 + retcode=1
Sep 28 21:49:40 + set -e
Sep 28 21:49:40 + return 1
Sep 28 21:49:40 + [[ pytorch-linux-xenial-py3-clang7-asan-test2 == *-NO_AVX-* ]]
Sep 28 21:49:40 + [[ '' == \n\o\g\p\u\_\N\O\_\A\V\X ]]
Sep 28 21:49:40 + [[ pytorch-linux-xenial-py3-clang7-asan-test2 == *-NO_AVX2-* ]]
Sep 28 21:49:40 + [[ '' == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]]
Sep 28 21:49:40 + [[ pytorch-linux-xenial-py3-clang7-asan-test2 == *-NO_AVX512-* ]]
Sep 28 21:49:40 + [[ '' == \n\o\g\p\u\_\N\O\_\A\V\X\5\1\2 ]]
Sep 28 21:49:40 + '[' -n '' ']'

2 failures not recognized by patterns:

| Job | Step | Action |
| --- | --- | --- |
| CircleCI pytorch_linux_xenial_py3_6_gcc5_4_jit_legacy_test | Report results | 🔁 rerun |
| CircleCI pytorch_linux_xenial_py3_6_gcc5_4_test | Report results | 🔁 rerun |

3 jobs timed out:

  • pytorch_linux_bionic_cuda10_2_cudnn7_py3_9_gcc7_build
  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build
  • pytorch_linux_xenial_py3_clang7_asan_test2

This comment was automatically generated by Dr. CI.

@garymm garymm marked this pull request as draft September 15, 2021 00:08
@pytorch-probot

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/garymm/pytorch/blob/f49c86236d9417986e508d577732d801900b38d2/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

| Workflows | Labels (bold enabled) | Status |
| --- | --- | --- |
| Triggered Workflows | | |
| linux-bionic-py3.8-gcc9-coverage | ciflow/all, ciflow/coverage, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/win | ✅ triggered |
| win-vs2019-cuda10.1-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/win | ✅ triggered |
| Skipped Workflows | | |
| libtorch-linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| libtorch-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/win | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow
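
The two commands above follow one pattern: a fixed `@pytorchbot ciflow rerun` prefix followed by an optional `-l <label>` pair per extra label. A small sketch of assembling such a comment is shown below; `build_ciflow_rerun_comment` is a hypothetical helper written for illustration and is not part of any PyTorch tooling.

```python
def build_ciflow_rerun_comment(labels=()):
    """Build the PR comment text for a ciflow rerun.

    Each extra label is appended as "-l <label>", mirroring the examples above.
    """
    parts = ["@pytorchbot", "ciflow", "rerun"]
    for label in labels:
        parts += ["-l", label]
    return " ".join(parts)


# Plain rerun, then a rerun with the two additional labels from the example above.
print(build_ciflow_rerun_comment())
print(build_ciflow_rerun_comment(["ciflow/scheduled", "ciflow/slow"]))
```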

For more information, please take a look at the CI Flow Wiki.

@garymm garymm changed the title from "[ONNX] test ORT 1.9 RC" to "[ONNX] Update onnxruntime to 1.9 for CI" Sep 28, 2021
@garymm garymm marked this pull request as ready for review September 29, 2021 17:02
@garymm garymm merged commit 098d777 into pytorch:onnx_ms_1 Oct 5, 2021
@garymm garymm deleted the ort-1.9rc branch October 5, 2021 18:33
BowenBao pushed a commit to BowenBao/pytorch that referenced this pull request Oct 19, 2021
BowenBao pushed a commit to BowenBao/pytorch that referenced this pull request Oct 26, 2021
BowenBao pushed a commit that referenced this pull request Oct 26, 2021
facebook-github-bot pushed a commit that referenced this pull request Oct 27, 2021
Summary: Pull Request resolved: #67269

Test Plan: Imported from OSS

Reviewed By: ngimel, msaroufim

Differential Revision: D31962516

Pulled By: malfet

fbshipit-source-id: 39b3c6a4a05d7b769f0ef5ce7ea597209516cde2
malfet pushed a commit that referenced this pull request Dec 8, 2021
malfet added a commit that referenced this pull request Dec 9, 2021

Co-authored-by: Gary Miguel <[email protected]>
4 participants