@vkuzo
Contributor

@vkuzo vkuzo commented Mar 24, 2020

Stack from ghstack:

Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b
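
To make the "numerics match the floating point implementation" claim concrete, here is a minimal pure-Python sketch of the kind of reference such a check could compare against: dequantize, run LayerNorm in float, requantize. This is an illustration only, not the kernel or the test in this PR. The function names, the quint8-like `[0, 255]` range, and the default `eps` are all assumptions.

```python
import math


def dequantize(q_values, scale, zero_point):
    """Map quantized integers back to floats: x = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Affine-quantize floats and clamp to [qmin, qmax] (quint8-like range assumed)."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


def layer_norm(values, gamma=None, beta=None, eps=1e-5):
    """Float LayerNorm over one row, using the biased (divide-by-N) variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    rstd = 1.0 / math.sqrt(var + eps)
    out = [(v - mean) * rstd for v in values]
    if gamma is not None:
        out = [o * g for o, g in zip(out, gamma)]
    if beta is not None:
        out = [o + b for o, b in zip(out, beta)]
    return out


def reference_qlayer_norm(q_in, in_scale, in_zp, out_scale, out_zp,
                          gamma=None, beta=None, eps=1e-5):
    """Hypothetical reference path: dequantize -> float LayerNorm -> requantize."""
    x = dequantize(q_in, in_scale, in_zp)
    y = layer_norm(x, gamma, beta, eps)
    return quantize(y, out_scale, out_zp)
```

A quantized kernel can then be compared against `reference_qlayer_norm` element by element. How much mismatch to tolerate (for example, off-by-one differences from rounding) is a test-design choice that this sketch does not settle.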

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D20768930

Summary:

Adds a quantized implementation of LayerNorm for server.

Relevant PRs:
* #20345 (floating point LN)
* #33080 (quantized BN)

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
@dr-ci

dr-ci bot commented Mar 24, 2020

💊 CircleCI build failures summary and remediations

As of commit 56b9575 (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following build failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test (1/1)

Step: "Test" (full log | pattern match details | 🔁 rerun)

Apr 09 08:13:19 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in
Apr 09 08:13:19     #168 0x555ed10df74b in PyEval_EvalCode /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:731 
Apr 09 08:13:19     #169 0x555ed115f633 in run_mod /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:1025 
Apr 09 08:13:19     #170 0x555ed115fa30 in PyRun_FileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:978 
Apr 09 08:13:19     #171 0x555ed115fc32 in PyRun_SimpleFileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:419 
Apr 09 08:13:19     #172 0x555ed1163722 in run_file /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:340 
Apr 09 08:13:19     #173 0x555ed1163722 in Py_Main /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:811 
Apr 09 08:13:19     #174 0x555ed102e1fd in main /tmp/build/80754af9/python_1585002248360/work/Programs/python.c:69 
Apr 09 08:13:19     #175 0x7fd1a663582f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 
Apr 09 08:13:19     #176 0x555ed110cc29 in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Apr 09 08:13:19  
Apr 09 08:13:19 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in  
Apr 09 08:13:19 Traceback (most recent call last): 
Apr 09 08:13:19   File "test/run_test.py", line 695, in <module> 
Apr 09 08:13:19     main() 
Apr 09 08:13:19   File "test/run_test.py", line 688, in main 
Apr 09 08:13:19     raise RuntimeError(message) 
Apr 09 08:13:19 RuntimeError: quantization/test_quantized failed! 
Apr 09 08:13:19 + cleanup 
Apr 09 08:13:19 + retcode=1 
Apr 09 08:13:19 + set +x 
Apr 09 08:13:19 =================== sccache compilation log =================== 

1 job timed out:

  • pytorch_linux_xenial_py3_6_gcc5_4_test

❄️ 1 tentatively flaky failure

1 failure was tentatively classified as flaky but has not triggered reruns to confirm:

See CircleCI build pytorch_xla_linux_xenial_py3_6_clang7_build (1/1)

Step: "Build" (full log | pattern match details | 🔁 rerun) ❄️

Apr 09 05:04:43 E: Unable to correct problems, you have held broken packages.
Apr 09 05:04:33 2020-04-09 05:04:33 (94.6 MB/s) - written to stdout [3145/3145] 
Apr 09 05:04:33  
Apr 09 05:04:33 OK 
Apr 09 05:04:33 + sudo apt-get -qq update 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease  Could not resolve 'archive.ubuntu.com' 
Apr 09 05:04:43 W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease  Could not resolve 'security.ubuntu.com' 
Apr 09 05:04:43 W: Some index files failed to download. They have been ignored, or old ones used instead. 
Apr 09 05:04:43 + sudo apt-get -qq install clang-8 clang++-8 
Apr 09 05:04:43 E: Unable to correct problems, you have held broken packages. 
Apr 09 05:04:43 + cleanup 
Apr 09 05:04:43 =================== sccache compilation log =================== 
Apr 09 05:04:43 + retcode=100 
Apr 09 05:04:43 + set +x 
Apr 09 05:04:43 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Apr 09 05:04:43 Compile requests              5491 
Apr 09 05:04:43 Compile requests executed     3306 
Apr 09 05:04:43 Cache hits                    3294 
Apr 09 05:04:43 Cache misses                     0 
Apr 09 05:04:43 Cache timeouts                   0 


vkuzo added a commit that referenced this pull request Mar 24, 2020
Summary:

Adds a quantized implementation of LayerNorm for server.

Relevant PRs:
* #20345 (floating point LN)
* #33080 (quantized BN)

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 3c3721f
Pull Request resolved: #35329
Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
quantized benchmark: https://gist.github.com/vkuzo/a0487403200cb3d4a641b8aa4371069c

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Mar 25, 2020
Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: d3b95b3
Pull Request resolved: #35329
@vkuzo vkuzo self-assigned this Mar 25, 2020
@vkuzo vkuzo added the "oncall: quantization" label (Quantization support in PyTorch) Mar 25, 2020
@vkuzo
Contributor Author

vkuzo commented Mar 26, 2020

actually, let me see if we can vectorize the mean+var calculation better

Collaborator

@jamesr66a jamesr66a left a comment

Looks fine overall, but yeah vectorizing mean+var is probably the right thing to do. Right now the perf numbers don't look great

REGISTER_DISPATCH(fake_quant_grad_tensor_stub, &fake_quantize_grad_tensor_kernel);
REGISTER_DISPATCH(fake_quant_per_channel_stub, &fake_quant_per_channel_cpu);
REGISTER_DISPATCH(fake_quant_grad_per_channel_stub, &fake_quant_grad_per_channel_cpu);
REGISTER_DISPATCH(LayerNormKernelQuantized, &LayerNormKernelQuantizedImpl);
Collaborator

nit: adjective before noun reads better IMO, so QuantizedLayerNormKernel. Also it would be better to match the style of the surrounding code and name it in snake_case


DCHECK_EQ(X.numel(), M * N);
DCHECK(!gamma.defined() || gamma.numel() == N);
DCHECK(!beta.defined() || beta.numel() == N);
Collaborator

Can we make these TORCH_INTERNAL_ASSERT unless we see a performance issue? That way we'll get better error messages and reports from users if there's a bug. Do we expect this kernel to be called in a tight loop?

@z-a-f z-a-f requested a review from mingzhe09088 March 26, 2020 09:16
@z-a-f

z-a-f commented Mar 26, 2020

cc @mingzhe09088 for the benchmarks

@mingzhe09088
Contributor

The benchmarks look good to me. Could you add layernorm_test to benchmarks/operator_benchmark/benchmark_all_other_test.py?

Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmark by input size: https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Mar 26, 2020
Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 9c53f13
Pull Request resolved: #35329
@vkuzo
Contributor Author

vkuzo commented Mar 26, 2020

> Looks fine overall, but yeah vectorizing mean+var is probably the right thing to do. Right now the perf numbers don't look great

Vectorizing it the easier way (all additions in FP) gives around a 2x speedup over the previous version: https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2

I'm going to see if I can ramp up enough on AVX to do the additions in integers. I would love your thoughts on whether that needs to land in this PR or can be saved for a later optimization.

@vkuzo
Contributor Author

vkuzo commented Mar 27, 2020

> I'm going to see if I can ramp up enough on AVX to do the additions in integers. I would love your thoughts on whether that needs to land in this PR or can be saved for a later optimization.

will have this shortly, feel free to wait on review until then (sorry for the churn)

Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size: 
v1: https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2(current): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Mar 30, 2020
Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: bcbf146
Pull Request resolved: #35329
@vkuzo vkuzo requested a review from jamesr66a March 30, 2020 18:16
@vkuzo
Contributor Author

vkuzo commented Mar 30, 2020

converted the mean+var calculation to ints, which speeds it up a bit more
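
For context on why integer accumulation is possible at all, here is a scalar pure-Python sketch of the underlying arithmetic. It is not the AVX code in this PR; the function names are hypothetical. With affine quantization `x = scale * (q - zero_point)`, the mean and (biased) variance of `x` follow from the integer sums of `q` and `q * q`, so the per-element work stays in integers and the scale and zero point are applied once at the end.

```python
def int_mean_var(q_values, scale, zero_point):
    """Mean and biased variance of the dequantized row, from integer sums."""
    n = len(q_values)
    q_sum = 0      # integer accumulator for sum(q)
    q_sum_sq = 0   # integer accumulator for sum(q * q)
    for q in q_values:
        q_sum += q
        q_sum_sq += q * q
    q_mean = q_sum / n
    # E[q^2] - E[q]^2; clamp tiny negatives caused by float rounding.
    q_var = max(q_sum_sq / n - q_mean * q_mean, 0.0)
    # Shifting by zero_point moves the mean but not the variance;
    # scaling by `scale` multiplies the variance by scale^2.
    mean = scale * (q_mean - zero_point)
    var = scale * scale * q_var
    return mean, var


def float_mean_var(q_values, scale, zero_point):
    """Same statistics computed the direct way: dequantize first, then reduce."""
    x = [scale * (q - zero_point) for q in q_values]
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return mean, var
```

Python integers do not overflow, so this sketch sidesteps accumulator width. A real kernel has to choose accumulator types wide enough for `sum(q * q)` over the normalized dimension, which is an assumption this example does not address.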

Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation

benchmarks by input size: 
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Apr 9, 2020
Summary:

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:

numerics match the floating point implementation
TODO: benchmarks

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 7f70433
Pull Request resolved: #35329
@ezyang
Contributor

ezyang commented Apr 9, 2020

This test is consistently failing in CI:

Apr 09 21:14:42 ======================================================================
Apr 09 21:14:42 FAIL [19.222s]: test_qlayer_norm (__main__.TestQuantizedOps)
Apr 09 21:14:42 ----------------------------------------------------------------------
Apr 09 21:14:42 Traceback (most recent call last):
Apr 09 21:14:42   File "quantization/test_quantized.py", line 276, in test_qlayer_norm
Apr 09 21:14:42     qparams=hu.qparams(scale_max=1e3),
Apr 09 21:14:42   File "/var/lib/jenkins/.local/lib/python3.6/site-packages/hypothesis/core.py", line 1116, in wrapped_test
Apr 09 21:14:42     raise the_error_hypothesis_found
Apr 09 21:14:42   File "quantization/test_quantized.py", line 346, in test_qlayer_norm
Apr 09 21:14:42     self.assertTrue(pct_diff < 0.001)
Apr 09 21:14:42 AssertionError: False is not true
Apr 09 21:14:42 

https://app.circleci.com/pipelines/github/pytorch/pytorch/153008/workflows/09822835-2d4a-416d-8599-8679d95e0dda/jobs/5102005

@ezyang
Contributor

ezyang commented Apr 9, 2020

ASAN suggests you have some sort of division by zero

Apr 09 20:51:53   test_qlayer_norm (__main__.TestQuantizedOps) ... /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7: runtime error: division by zero
Apr 09 20:51:54     #0 0x7f63e4486a76 in THFloatTensor_var_all (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc859a76)
Apr 09 20:51:54     #1 0x7f63e4486c05 in THFloatTensor_std_all (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc859c05)
Apr 09 20:51:54     #2 0x7f63e41032db in at::native::legacy::cpu::_th_std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc4d62db)
Apr 09 20:51:54     #3 0x7f63e3fd9111 in at::CPUType::_std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc3ac111)
Apr 09 20:51:54     #4 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #5 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #6 0x7f63e2f211f9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f41f9)
Apr 09 20:51:54     #7 0x7f63e2f210ad in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f40ad)
Apr 09 20:51:54     #8 0x7f63e2f20dd9 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3dd9)
Apr 09 20:51:54     #9 0x7f63e2f20bbb in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3bbb)
Apr 09 20:51:54     #10 0x7f63e34c5c4e in at::native::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb898c4e)
Apr 09 20:51:54     #11 0x7f63e41bc811 in at::TypeDefault::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xc58f811)
Apr 09 20:51:54     #12 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #13 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #14 0x7f63e2f211f9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f41f9)
Apr 09 20:51:54     #15 0x7f63e2f210ad in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f40ad)
Apr 09 20:51:54     #16 0x7f63e2f20dd9 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3dd9)
Apr 09 20:51:54     #17 0x7f63e2f20bbb in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3bbb)
Apr 09 20:51:54     #18 0x7f63e937b1b9 in torch::autograd::VariableType::std(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0x1174e1b9)
Apr 09 20:51:54     #19 0x7f63e2f20a88 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >::operator()(at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3a88)
Apr 09 20:51:54     #20 0x7f63e2f20911 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor const&, bool), at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, bool> >, at::Tensor (at::Tensor const&, bool)>::call(c10::OperatorKernel*, at::Tensor const&, bool) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_cpu.so+0xb2f3911)
Apr 09 20:51:54     #21 0x7f63fd7357b9 in at::Tensor c10::KernelFunction::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d147b9)
Apr 09 20:51:54     #22 0x7f63fd73566d in at::Tensor c10::Dispatcher::callUnboxedWithDispatchKey<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, c10::DispatchKey, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d1466d)
Apr 09 20:51:54     #23 0x7f63fd735389 in at::Tensor c10::Dispatcher::callUnboxed<at::Tensor, at::Tensor const&, bool>(c10::OperatorHandle const&, at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d14389)
Apr 09 20:51:54     #24 0x7f63fd73516b in at::Tensor c10::OperatorHandle::callUnboxed<at::Tensor, at::Tensor const&, bool>(at::Tensor const&, bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1d1416b)
Apr 09 20:51:54     #25 0x7f63fd7c3790 in at::Tensor::std(bool) const (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1da2790)
Apr 09 20:51:54     #26 0x7f63fd64ed3d in torch::autograd::THPVariable_std(_object*, _object*, _object*) (/opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so+0x1c2dd3d)
Apr 09 20:51:55     #27 0x5653d55e3533 in _PyCFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/methodobject.c:231
Apr 09 20:51:55     #28 0x5653d566aaab in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4851
Apr 09 20:51:55     #29 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #30 0x5653d566af65 in gen_send_ex /tmp/build/80754af9/python_1585002248360/work/Objects/genobject.c:189
Apr 09 20:51:55     #31 0x5653d5624bdd in PyIter_Next /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:3162
Apr 09 20:51:55     #32 0x5653d5627bf0 in builtin_sum_impl.isra.1 /tmp/build/80754af9/python_1585002248360/work/Python/bltinmodule.c:2280
Apr 09 20:51:55     #33 0x5653d5627bf0 in builtin_sum /tmp/build/80754af9/python_1585002248360/work/Python/clinic/bltinmodule.c.h:604
Apr 09 20:51:55     #34 0x5653d55e3470 in _PyCFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/methodobject.c:234
Apr 09 20:51:55     #35 0x5653d566aaab in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4851
Apr 09 20:51:55     #36 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #37 0x5653d5665ff5 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #38 0x5653d5665ff5 in PyEval_EvalCodeEx /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4187
Apr 09 20:51:55     #39 0x5653d56668d5 in function_call /tmp/build/80754af9/python_1585002248360/work/Objects/funcobject.c:604
Apr 09 20:51:55     #40 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #41 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #42 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #43 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #44 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #45 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #46 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #47 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #48 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #49 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #50 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #51 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #52 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #53 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #54 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #55 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #56 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #57 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #58 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #59 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #60 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #61 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #62 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #63 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #64 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #65 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #66 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #67 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #68 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #69 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #70 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #71 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #72 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #73 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #74 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #75 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #76 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #77 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #78 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #79 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #80 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #81 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #82 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #83 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #84 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #85 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #86 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #87 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #88 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #89 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #90 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #91 0x5653d5664195 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #92 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #93 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #94 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #95 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #96 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #97 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #98 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #99 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #100 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #101 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #102 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #103 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #104 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #105 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #106 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #107 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #108 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #109 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #110 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #111 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #112 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #113 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #114 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #115 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #116 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #117 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #118 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #119 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #120 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #121 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #122 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #123 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #124 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #125 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #126 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #127 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #128 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #129 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #130 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #131 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #132 0x5653d568e7fc in do_call_core /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5120
Apr 09 20:51:55     #133 0x5653d568e7fc in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3404
Apr 09 20:51:55     #134 0x5653d5663ff3 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #135 0x5653d566537b in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #136 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #137 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #138 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #139 0x5653d563c120 in slot_tp_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6207
Apr 09 20:51:55     #140 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #141 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #142 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #143 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #144 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #145 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #146 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #147 0x5653d5664c6a in _PyFunction_FastCall /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4933
Apr 09 20:51:55     #148 0x5653d5664c6a in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4968
Apr 09 20:51:55     #149 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #150 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #151 0x5653d566447a in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #152 0x5653d5665599 in _PyFunction_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:5084
Apr 09 20:51:55     #153 0x5653d55e38fe in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2310
Apr 09 20:51:55     #154 0x5653d55e8362 in _PyObject_Call_Prepend /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2373
Apr 09 20:51:55     #155 0x5653d55e333d in PyObject_Call /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2261
Apr 09 20:51:55     #156 0x5653d563b79a in slot_tp_init /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:6420
Apr 09 20:51:55     #157 0x5653d566ade6 in type_call /tmp/build/80754af9/python_1585002248360/work/Objects/typeobject.c:915
Apr 09 20:51:55     #158 0x5653d55e371a in _PyObject_FastCallDict /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2331
Apr 09 20:51:55     #159 0x5653d5665189 in _PyObject_FastCallKeywords /tmp/build/80754af9/python_1585002248360/work/Objects/abstract.c:2496
Apr 09 20:51:55     #160 0x5653d566abfd in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4875
Apr 09 20:51:55     #161 0x5653d568df59 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3351
Apr 09 20:51:55     #162 0x5653d5664332 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #163 0x5653d5664ea0 in fast_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4992
Apr 09 20:51:55     #164 0x5653d566ab84 in call_function /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4872
Apr 09 20:51:55     #165 0x5653d568d199 in _PyEval_EvalFrameDefault /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:3335
Apr 09 20:51:55     #166 0x5653d56659b8 in _PyEval_EvalCodeWithName /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4166
Apr 09 20:51:55     #167 0x5653d56659b8 in PyEval_EvalCodeEx /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:4187
Apr 09 20:51:55     #168 0x5653d566674b in PyEval_EvalCode /tmp/build/80754af9/python_1585002248360/work/Python/ceval.c:731
Apr 09 20:51:55     #169 0x5653d56e6633 in run_mod /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:1025
Apr 09 20:51:55     #170 0x5653d56e6a30 in PyRun_FileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:978
Apr 09 20:51:55     #171 0x5653d56e6c32 in PyRun_SimpleFileExFlags /tmp/build/80754af9/python_1585002248360/work/Python/pythonrun.c:419
Apr 09 20:51:55     #172 0x5653d56ea722 in run_file /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:340
Apr 09 20:51:55     #173 0x5653d56ea722 in Py_Main /tmp/build/80754af9/python_1585002248360/work/Modules/main.c:811
Apr 09 20:51:55     #174 0x5653d55b51fd in main /tmp/build/80754af9/python_1585002248360/work/Programs/python.c:69
Apr 09 20:51:55     #175 0x7f6411eb582f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
Apr 09 20:51:55     #176 0x5653d5693c29 in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103
Apr 09 20:51:55 
Apr 09 20:51:55 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/TH/generic/THTensorMoreMath.cpp:720:7 in 
Apr 09 20:51:55 Traceback (most recent call last):
Apr 09 20:51:55   File "test/run_test.py", line 696, in <module>
Apr 09 20:51:55     main()
Apr 09 20:51:55   File "test/run_test.py", line 689, in main
Apr 09 20:51:55     raise RuntimeError(message)
Apr 09 20:51:55 RuntimeError: quantization/test_quantized failed!

@vkuzo
Contributor Author

vkuzo commented Apr 9, 2020

@ezyang, sorry about that. It was passing on my last run, but I guess this is still flaky. Thanks for the info; I'll work on a fix.

@facebook-github-bot
Contributor

This pull request has been merged in f813e71.

@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/15/head branch April 13, 2020 14:16
ashishfarmer pushed a commit to ashishfarmer/pytorch that referenced this pull request Apr 13, 2020
Summary:
Pull Request resolved: pytorch#35329

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Imported from OSS

Differential Revision: D20768930

fbshipit-source-id: ddf8727e9840c65ead3b890220af0638c5637028
vkuzo added a commit that referenced this pull request Apr 14, 2020
Summary:
Pull Request resolved: #35329

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Imported from OSS

Differential Revision: D20768930

fbshipit-source-id: ddf8727e9840c65ead3b890220af0638c5637028
ghstack-source-id: db9c649
vkuzo added a commit that referenced this pull request Apr 14, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Apr 14, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

ghstack-source-id: 4149278
Pull Request resolved: #36593
vkuzo added a commit that referenced this pull request Apr 14, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

ghstack-source-id: b9c2140
Pull Request resolved: #36593
vkuzo added a commit that referenced this pull request Apr 15, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: [D21030268](https://our.internmc.facebook.com/intern/diff/D21030268)

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Apr 15, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

ghstack-source-id: 0d681b6
Pull Request resolved: #36593
vkuzo added a commit that referenced this pull request Apr 15, 2020
Summary:

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

ghstack-source-id: 9bb87ea
Pull Request resolved: #36593
facebook-github-bot pushed a commit that referenced this pull request Apr 16, 2020
Summary:
Pull Request resolved: #36593

This is a redo of #35329 with a better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: D21030268

Pulled By: vkuzo

fbshipit-source-id: b3594c3393cfce37a881319e2e0560620d51080f
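
For readers wondering what "mean+var vectorized in int" in the benchmark list could look like, here is a minimal pure-Python sketch of the arithmetic. It is not the kernel from this PR. It assumes per-tensor affine quantization, x = scale * (q - zero_point), and omits the affine weight/bias and the requantization of the output. All function names are hypothetical. The sums of q and q*q are accumulated as integers, and scale and zero_point are applied once at the end.

```python
import math


def quantized_stats(q_values, scale, zero_point):
    """Mean and variance of the dequantized values from integer sums.

    Hypothetical helper; assumes x = scale * (q - zero_point).
    """
    n = len(q_values)
    # Integer accumulation is exact; nothing is rounded before the division.
    sum_q = sum(q_values)
    sum_q_sq = sum(q * q for q in q_values)
    mean_q = sum_q / n
    # Var[q] = E[q^2] - E[q]^2, clamped at 0 to guard against float round-off.
    var_q = max(sum_q_sq / n - mean_q * mean_q, 0.0)
    # Shifting by zero_point moves the mean only; scaling affects both.
    mean = scale * (mean_q - zero_point)
    var = scale * scale * var_q
    return mean, var


def layer_norm_from_quantized(q_values, scale, zero_point, eps=1e-5):
    """Normalize the dequantized values using the integer-derived stats."""
    mean, var = quantized_stats(q_values, scale, zero_point)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(scale * (q - zero_point) - mean) * inv_std for q in q_values]


def layer_norm_reference(x_values, eps=1e-5):
    """Plain floating point layer norm over one row, for comparison."""
    n = len(x_values)
    mean = sum(x_values) / n
    var = sum((x - mean) ** 2 for x in x_values) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * inv_std for x in x_values]
```

Comparing `layer_norm_from_quantized(q, scale, zero_point)` with `layer_norm_reference` applied to the dequantized values is the same kind of check as the "numerics match the floating point implementation" item in the test plan, although the PR's actual test lives in `quantization/test_quantized`.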

Labels

Merged oncall: quantization Quantization support in PyTorch
