add quantized layer norm implementation #35329

vkuzo · 2020-03-24T22:51:13Z

Stack from ghstack:

#35329 add quantized layer norm implementation
#35693 Add fast sums and sums of squares over quantized ranges to QuantizedOpKernels.cpp

Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b
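
As an illustration of what "numerics match the floating point implementation" can mean, here is a minimal pure-Python sketch of a reference path: dequantize, run layer norm in floating point, then requantize. It assumes per-tensor affine quantization to an unsigned 8-bit range. The function names and the output scale and zero point are hypothetical and are not taken from the PR's test code.

```python
import math


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Affine per-tensor quantization: round(x / scale) + zero_point, clamped."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in x]


def dequantize(q, scale, zero_point):
    """Inverse mapping from integer values back to floats."""
    return [(v - zero_point) * scale for v in q]


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Floating point layer norm over a single row (biased variance)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    rstd = 1.0 / math.sqrt(var + eps)
    out = [(v - mean) * rstd for v in x]
    if gamma is not None:
        out = [o * g for o, g in zip(out, gamma)]
    if beta is not None:
        out = [o + b for o, b in zip(out, beta)]
    return out


def quantized_layer_norm_reference(qx, x_scale, x_zp, y_scale, y_zp):
    """Hypothetical reference: dequantize -> float layer norm -> requantize."""
    x = dequantize(qx, x_scale, x_zp)
    y = layer_norm(x)
    return quantize(y, y_scale, y_zp)


print(quantized_layer_norm_reference([10, 20, 30, 40], 0.1, 0, 0.02, 128))
```

A quantized kernel can then be compared element by element against this reference, allowing for small rounding differences.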

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D20768930

Summary: Adds a quantized implementation of LayerNorm for server.

Relevant PRs:
* #20345 (floating point LN)
* #33080 (quantized BN)

A future PR will add the Python wrapper.

Test Plan: numerics match the floating point implementation

TODO: benchmarks

Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

dr-ci · 2020-03-24T23:05:20Z

💊 CircleCI build failures summary and remediations

As of commit 56b9575 (more details on the Dr. CI page):

2/3 failures introduced in this PR
1/3 tentatively recognized as flaky ❄️

🕵️ 1 new failure recognized by patterns

The following build failures do not appear to be due to upstream breakages:

pytorch_linux_xenial_py3_clang5_asan_test (1/1)

Step: "Test"

Apr 09 08:13:19 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in

Apr 09 08:13:19     #168 0x555ed10df74b in PyEval_EvalCode /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:731 
Apr 09 08:13:19     #169 0x555ed115f633 in run_mod /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:1025 
Apr 09 08:13:19     #170 0x555ed115fa30 in PyRun_FileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:978 
Apr 09 08:13:19     #171 0x555ed115fc32 in PyRun_SimpleFileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:419 
Apr 09 08:13:19     #172 0x555ed1163722 in run_file /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:340 
Apr 09 08:13:19     #173 0x555ed1163722 in Py_Main /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:811 
Apr 09 08:13:19     #174 0x555ed102e1fd in main /tmp/build/80754af9/python_1585002248360/work/Programs/python.c:69 
Apr 09 08:13:19     #175 0x7fd1a663582f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 
Apr 09 08:13:19     #176 0x555ed110cc29 in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Apr 09 08:13:19  
Apr 09 08:13:19 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in  
Apr 09 08:13:19 Traceback (most recent call last): 
Apr 09 08:13:19   File "test/run_test.py", line 695, in <module> 
Apr 09 08:13:19     main() 
Apr 09 08:13:19   File "test/run_test.py", line 688, in main 
Apr 09 08:13:19     raise RuntimeError(message) 
Apr 09 08:13:19 RuntimeError: quantization/test_quantized failed! 
Apr 09 08:13:19 + cleanup 
Apr 09 08:13:19 + retcode=1 
Apr 09 08:13:19 + set +x 
Apr 09 08:13:19 =================== sccache compilation log ===================

1 job timed out:

pytorch_linux_xenial_py3_6_gcc5_4_test

❄️ 1 tentatively flaky failure

1 failure tentatively classified as flaky, but reruns have not been triggered to confirm:

pytorch_xla_linux_xenial_py3_6_clang7_build (1/1)

Step: "Build" ❄️

Apr 09 05:04:43 E: Unable to correct problems, you have held broken packages.

Apr 09 05:04:33 2020-04-09 05:04:33 (94.6 MB/s) - written to stdout [3145/3145] 
Apr 09 05:04:33  
Apr 09 05:04:33 OK 
Apr 09 05:04:33 + sudo apt-get -qq update 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease  Could not resolve 'security.ubuntu.com' 
Apr 09 05:04:43 W: Some index files failed to download. They have been ignored, or old ones used instead. 
Apr 09 05:04:43 + sudo apt-get -qq install clang-8 clang++-8 
Apr 09 05:04:43 E: Unable to correct problems, you have held broken packages. 
Apr 09 05:04:43 + cleanup 
Apr 09 05:04:43 =================== sccache compilation log =================== 
Apr 09 05:04:43 + retcode=100 
Apr 09 05:04:43 + set +x 
Apr 09 05:04:43 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Apr 09 05:04:43 Compile requests              5491 
Apr 09 05:04:43 Compile requests executed     3306 
Apr 09 05:04:43 Cache hits                    3294 
Apr 09 05:04:43 Cache misses                     0 
Apr 09 05:04:43 Cache timeouts                   0

This comment was automatically generated by Dr. CI.

This comment has been revised 76 times.

Summary: Adds a quantized implementation of LayerNorm for server. Relevant PRs: * #20345 (floating point LN) * #33080 (quantized BN) A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation TODO: benchmarks Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 3c3721f Pull Request resolved: #35329

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation quantized benchmark: https://gist.github.com/vkuzo/a0487403200cb3d4a641b8aa4371069c Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation TODO: benchmarks Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: d3b95b3 Pull Request resolved: #35329

vkuzo · 2020-03-26T03:35:37Z

actually, let me see if we can vectorize the mean+var calculation better

jamesr66a

Looks fine overall, but yeah vectorizing mean+var is probably the right thing to do. Right now the perf numbers don't look great

jamesr66a · 2020-03-26T04:26:42Z

aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp

 REGISTER_DISPATCH(fake_quant_grad_tensor_stub, &fake_quantize_grad_tensor_kernel);
 REGISTER_DISPATCH(fake_quant_per_channel_stub, &fake_quant_per_channel_cpu);
 REGISTER_DISPATCH(fake_quant_grad_per_channel_stub, &fake_quant_grad_per_channel_cpu);
+REGISTER_DISPATCH(LayerNormKernelQuantized, &LayerNormKernelQuantizedImpl);


nit: adjective before noun reads better IMO, so QuantizedLayerNormKernel. Also it would be better to match the style of the surrounding code and name it in snake_case

jamesr66a · 2020-03-26T04:28:29Z

aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp

+
+  DCHECK_EQ(X.numel(), M * N);
+  DCHECK(!gamma.defined() || gamma.numel() == N);
+  DCHECK(!beta.defined() || beta.numel() == N);


Can we make these TORCH_INTERNAL_ASSERT unless we see a performance issue? That way we'll get better error messages and reports from users if there's a bug. Do we expect this kernel to be called in a tight loop?

z-a-f · 2020-03-26T09:16:30Z

cc @mingzhe09088 for the benchmarks

mingzhe09088 · 2020-03-26T17:34:58Z

The benchmarks look good to me. Could you add layernorm_test to benchmarks/operator_benchmark/benchmark_all_other_test.py?

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmark by input size: https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation TODO: benchmarks Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 9c53f13 Pull Request resolved: #35329

vkuzo · 2020-03-26T20:50:37Z

> Looks fine overall, but yeah vectorizing mean+var is probably the right thing to do. Right now the perf numbers don't look great

Vectorizing it the easier way (all addition in FP) gives around a 2x speedup vs before: https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2

I'm going to see if I can ramp up enough to AVX to implement the additions to happen in integers, would love your thoughts on if that needs to land for this PR or if we can save it as a later optimization.

vkuzo · 2020-03-27T00:35:12Z

> I'm going to see if I can ramp up enough to AVX to implement the additions to happen in integers, would love your thoughts on if that needs to land for this PR or if we can save it as a later optimization.

will have this shortly, feel free to wait on review until then (sorry for the churn)

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1: https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2(current): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation TODO: benchmarks Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: bcbf146 Pull Request resolved: #35329

vkuzo · 2020-03-30T18:17:27Z

converted the mean+var calculation to ints, which speeds it up a bit more
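
For readers following along, a rough sketch of why integer sums are enough to recover the mean and variance: if the dequantized value is `scale * (q - zero_point)`, then the mean and variance of a row follow from the integer sum and integer sum of squares of `q`. The snippet below is a scalar Python illustration under that assumption. The function name is hypothetical, and the real kernel is vectorized C++ whose details may differ.

```python
import math


def mean_var_from_int_sums(q, scale, zero_point):
    """Recover float mean/variance of dequantized values from integer sums.

    Assumes x_i = scale * (q_i - zero_point) for every element.
    """
    n = len(q)
    s = sum(q)                   # exact integer sum
    ss = sum(v * v for v in q)   # exact integer sum of squares
    # Var[q] = E[q^2] - E[q]^2, kept as an exact integer numerator
    # until the final division to avoid cancellation error.
    q_var = (n * ss - s * s) / (n * n)
    q_mean = s / n
    mean = scale * (q_mean - zero_point)
    var = scale * scale * q_var  # a shift by zero_point does not change variance
    return mean, var


q = [10, 20, 30, 40]
mean, var = mean_var_from_int_sums(q, scale=0.1, zero_point=5)

# Cross-check against the straightforward float computation.
x = [0.1 * (v - 5) for v in q]
ref_mean = sum(x) / len(x)
ref_var = sum((v - ref_mean) ** 2 for v in x) / len(x)
print(math.isclose(mean, ref_mean), math.isclose(var, ref_var))
```

Only the final scaling touches floating point, so the accumulation loop itself can stay in integer arithmetic.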

Summary: Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation TODO: benchmarks Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 7f70433 Pull Request resolved: #35329

ezyang · 2020-04-09T21:22:56Z

This test is consistently failing in CI:

Apr 09 21:14:42 ======================================================================
Apr 09 21:14:42 FAIL [19.222s]: test_qlayer_norm (__main__.TestQuantizedOps)
Apr 09 21:14:42 ----------------------------------------------------------------------
Apr 09 21:14:42 Traceback (most recent call last):
Apr 09 21:14:42   File "quantization/test_quantized.py", line 276, in test_qlayer_norm
Apr 09 21:14:42     qparams=hu.qparams(scale_max=1e3),
Apr 09 21:14:42   File "/var/lib/jenkins/.local/lib/python3.6/site-packages/hypothesis/core.py", line 1116, in wrapped_test
Apr 09 21:14:42     raise the_error_hypothesis_found
Apr 09 21:14:42   File "quantization/test_quantized.py", line 346, in test_qlayer_norm
Apr 09 21:14:42     self.assertTrue(pct_diff < 0.001)
Apr 09 21:14:42 AssertionError: False is not true
Apr 09 21:14:42

https://app.circleci.com/pipelines/github/pytorch/pytorch/153008/workflows/09822835-2d4a-416d-8599-8679d95e0dda/jobs/5102005

ezyang · 2020-04-09T21:23:58Z

ASAN suggests you have some sort of division by zero

Apr 09 20:51:53   test_qlayer_norm (__main__.TestQuantizedOps) ... /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7: runtime error: division by zero
Apr 09 20:51:54     #0 0x7f63e4486a76 in THFloatTensor_var_all (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc859a76)
Apr 09 20:51:54     #1 0x7f63e4486c05 in THFloatTensor_std_all (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc859c05)
Apr 09 20:51:54     #2 0x7f63e41032db in at::native::legacy::cpu::_th_std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc4d62db)
Apr 09 20:51:54     #3 0x7f63e3fd9111 in at::CPUType::_std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc3ac111)
Apr 09 20:51:54     #4 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #5 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #6 0x7f63e2f211f9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f41f9)
Apr 09 20:51:54     #7 0x7f63e2f210ad in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f40ad)
Apr 09 20:51:54     #8 0x7f63e2f20dd9 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3dd9)
Apr 09 20:51:54     #9 0x7f63e2f20bbb in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3bbb)
Apr 09 20:51:54     #10 0x7f63e34c5c4e in at::native::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb898c4e)
Apr 09 20:51:54     #11 0x7f63e41bc811 in at::TypeDefault::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc58f811)
Apr 09 20:51:54     #12 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #13 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #14 0x7f63e2f211f9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f41f9)
Apr 09 20:51:54     #15 0x7f63e2f210ad in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f40ad)
Apr 09 20:51:54     #16 0x7f63e2f20dd9 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3dd9)
Apr 09 20:51:54     #17 0x7f63e2f20bbb in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3bbb)
Apr 09 20:51:54     #18 0x7f63e937b1b9 in torch::autograd::VariableType::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0x1174e1b9)
Apr 09 20:51:54     #19 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #20 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #21 0x7f63fd7357b9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d147b9)
Apr 09 20:51:54     #22 0x7f63fd73566d in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d1466d)
Apr 09 20:51:54     #23 0x7f63fd735389 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d14389)
Apr 09 20:51:54     #24 0x7f63fd73516b in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d1416b)
Apr 09 20:51:54     #25 0x7f63fd7c3790 in at::Tensor::std(bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1da2790)
Apr 09 20:51:54     #26 0x7f63fd64ed3d in torch::autograd::THPVariable_std(_object*, _object*, _object*) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1c2dd3d)
Apr 09 20:51:55     #27 0x5653d55e3533 in _PyCFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/methodobject.c:231
Apr 09 20:51:55     #28 0x5653d566aaab in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4851
Apr 09 20:51:55     #29 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #30 0x5653d566af65 in gen_send_ex /tmp/build/80754af9/python_1585002248360/work/Objects/genobject.c:189
Apr 09 20:51:55     #31 0x5653d5624bdd in PyIter_Next /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:3162
Apr 09 20:51:55     #32 0x5653d5627bf0 in builtin_sum_impl.isra.1 /tmp/build/80754af9/python_1585002248360/work/Python/bltinmodule.c:2280
Apr 09 20:51:55     #33 0x5653d5627bf0 in builtin_sum /tmp/build/80754af9/python_1585002248360/work/Python/clinic/bltinmodule.c.h:604
Apr 09 20:51:55     #34 0x5653d55e3470 in _PyCFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/methodobject.c:234
Apr 09 20:51:55     #35 0x5653d566aaab in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4851
Apr 09 20:51:55     #36 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #37 0x5653d5665ff5 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #38 0x5653d5665ff5 in PyEval_EvalCodeEx /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4187
Apr 09 20:51:55     #39 0x5653d56668d5 in function_call /tmp/build/80754af9/python_1585002248360/work/Objects/funcobject.c:604
Apr 09 20:51:55     #40 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #41 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #42 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #43 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #44 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #45 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #46 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #47 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #48 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #49 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #50 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #51 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #52 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #53 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #54 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #55 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #56 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #57 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #58 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #59 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #60 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #61 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #62 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #63 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #64 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #65 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #66 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #67 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #68 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #69 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #70 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #71 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #72 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #73 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #74 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #75 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #76 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #77 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #78 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #79 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #80 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #81 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #82 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #83 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #84 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #85 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #86 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #87 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #88 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #89 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #90 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #91 0x5653d5664195 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #92 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #93 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #94 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #95 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #96 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #97 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #98 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #99 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #100 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #101 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #102 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #103 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #104 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #105 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #106 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #107 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #108 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #109 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #110 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #111 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #112 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #113 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #114 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #115 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #116 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #117 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #118 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #119 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #120 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #121 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #122 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #123 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #124 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #125 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #126 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #127 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #128 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #129 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #130 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #131 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #132 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #133 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #134 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #135 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #136 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #137 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #138 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #139 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #140 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #141 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #142 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #143 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #144 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #145 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #146 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #147 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #148 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #149 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #150 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #151 0x5653d566447a in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #152 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #153 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #154 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #155 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #156 0x5653d563b79a in slot_tp_init /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6420
Apr 09 20:51:55     #157 0x5653d566ade6 in type_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:915
Apr 09 20:51:55     #158 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #159 0x5653d5665189 in _PyObject_FastCallKeywords /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2496
Apr 09 20:51:55     #160 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #161 0x5653d568df59 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3351
Apr 09 20:51:55     #162 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #163 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #164 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #165 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #166 0x5653d56659b8 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #167 0x5653d56659b8 in PyEval_EvalCodeEx /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4187
Apr 09 20:51:55     #168 0x5653d566674b in PyEval_EvalCode /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:731
Apr 09 20:51:55     #169 0x5653d56e6633 in run_mod /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:1025
Apr 09 20:51:55     #170 0x5653d56e6a30 in PyRun_FileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:978
Apr 09 20:51:55     #171 0x5653d56e6c32 in PyRun_SimpleFileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:419
Apr 09 20:51:55     #172 0x5653d56ea722 in run_file /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:340
Apr 09 20:51:55     #173 0x5653d56ea722 in Py_Main /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:811
Apr 09 20:51:55     #174 0x5653d55b51fd in main /tmp/build/80754af9/python_1585002248360/work/Programs/python.c:69
Apr 09 20:51:55     #175 0x7f6411eb582f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
Apr 09 20:51:55     #176 0x5653d5693c29 in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103
Apr 09 20:51:55 
Apr 09 20:51:55 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in 
Apr 09 20:51:55 Traceback (most recent call last):
Apr 09 20:51:55   File "test/run_test.py", line 696, in <module>
Apr 09 20:51:55     main()
Apr 09 20:51:55   File "test/run_test.py", line 689, in main
Apr 09 20:51:55     raise RuntimeError(message)
Apr 09 20:51:55 RuntimeError: quantization/test_quantized failed!

vkuzo · 2020-04-09T21:43:58Z

@ezyang, sorry about that, it was passing on my last run but I guess this is still flaky. Thanks for the info and I'll work on a fix.

facebook-github-bot · 2020-04-10T00:24:44Z

This pull request has been merged in f813e71.

Summary:

Pull Request resolved: pytorch#35329

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Imported from OSS

Differential Revision: D20768930

fbshipit-source-id: ddf8727e9840c65ead3b890220af0638c5637028

Summary: Pull Request resolved: #35329 Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b Imported from OSS Differential Revision: D20768930 fbshipit-source-id: ddf8727e9840c65ead3b890220af0638c5637028 ghstack-source-id: db9c649

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b [ghstack-poisoned]

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b ghstack-source-id: 4149278 Pull Request resolved: #36593

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b ghstack-source-id: b9c2140 Pull Request resolved: #36593

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268) [ghstack-poisoned]

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b ghstack-source-id: 0d681b6 Pull Request resolved: #36593

Summary: This is a redo of #35329 with a better test. Adds a quantized implementation of LayerNorm for server. A future PR will add the Python wrapper. Test Plan: numerics match the floating point implementation benchmarks by input size: v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13 v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2 v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b ghstack-source-id: 9bb87ea Pull Request resolved: #36593

Summary:

Pull Request resolved: #36593

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: D21030268

Pulled By: vkuzo

fbshipit-source-id: b3594c3393cfce37a881319e2e0560620d51080f
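
For readers who have not seen the kernel, here is a rough sketch of what "mean+var vectorized in int" could look like. It is not the code from this PR. It assumes per-tensor affine quantization, where `x = scale * (q - zero_point)`, uses the biased variance as floating point LayerNorm does, and leaves out the optional weight and bias. The function name `quantized_layer_norm_sketch` and all of its parameters are made up for illustration. The point is that the sum and the sum of squares can be accumulated exactly over the integer values, then converted to a float mean and variance once per normalized row.

```python
import math


def quantized_layer_norm_sketch(q_in, scale_in, zp_in, scale_out, zp_out,
                                eps=1e-5, qmin=0, qmax=255):
    """Hypothetical reference for layer norm over one row of quantized values.

    Assumes x = scale_in * (q - zp_in); no affine weight/bias is applied.
    """
    n = len(q_in)

    # Integer accumulations: exact, and the part a kernel could vectorize.
    q_sum = sum(q_in)
    q_sum_sq = sum(q * q for q in q_in)

    # Convert the integer statistics to float statistics of the real values.
    q_mean = q_sum / n
    mean = scale_in * (q_mean - zp_in)
    # The zero point cancels out of the variance; clamp tiny negative error.
    var = scale_in ** 2 * max(q_sum_sq / n - q_mean ** 2, 0.0)
    inv_std = 1.0 / math.sqrt(var + eps)

    out = []
    for q in q_in:
        x = scale_in * (q - zp_in)           # dequantize
        y = (x - mean) * inv_std             # normalize
        q_y = round(y / scale_out) + zp_out  # requantize
        out.append(min(max(q_y, qmin), qmax))
    return out


row = quantized_layer_norm_sketch([10, 20, 30, 40], 0.1, 5, 0.02, 128)
```

A test in the spirit of the Test Plan would dequantize `row` and compare it with a floating point layer norm of the dequantized input, up to the output quantization step.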

vkuzo requested review from jamesr66a, raghuramank100, supriyar and z-a-f March 25, 2020 21:39

vkuzo self-assigned this Mar 25, 2020

vkuzo added the oncall: quantization Quantization support in PyTorch label Mar 25, 2020

jamesr66a reviewed Mar 26, 2020

z-a-f requested a review from mingzhe09088 March 26, 2020 09:16

vkuzo mentioned this pull request Mar 30, 2020

Add fast sums and sums of squares over quantized ranges to QuantizedOpKernels.cpp #35693

Closed

vkuzo requested a review from jamesr66a March 30, 2020 18:16

facebook-github-bot closed this in f813e71 Apr 9, 2020

facebook-github-bot added the merged label Apr 10, 2020

facebook-github-bot deleted the gh/vkuzo/15/head branch April 13, 2020 14:16

vkuzo mentioned this pull request Apr 14, 2020

redo of add quantized layer norm implementation #36593

Closed

mruberry added the Merged label Oct 28, 2020
