
Conversation

@XiaobingSuper
Collaborator

@XiaobingSuper XiaobingSuper commented Dec 7, 2020

Stack from ghstack:

Differential Revision: D25537188

@dr-ci

dr-ci bot commented Dec 7, 2020

💊 CI failures summary and remediations

As of commit 070f721 (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_doc_test (1/2)

Step: "Doc test" (full log | diagnosis details | 🔁 rerun)

Feb 06 14:01:55 sccache: error: couldn't connect to server
Feb 06 14:01:55 ++++ eval 'extract_trap_cmd '
Feb 06 14:01:55 +++++ extract_trap_cmd
Feb 06 14:01:55 +++++ printf '%s\n' ''
Feb 06 14:01:55 ++++ printf '%s\n' cleanup
Feb 06 14:01:55 +++ trap -- '
Feb 06 14:01:55 cleanup' EXIT
Feb 06 14:01:55 +++ [[ pytorch-linux-xenial-py3.6-gcc5.4-build != *pytorch-win-* ]]
Feb 06 14:01:55 +++ which sccache
Feb 06 14:01:55 +++ sccache --stop-server
Feb 06 14:01:55 Stopping sccache server...
Feb 06 14:01:55 sccache: error: couldn't connect to server
Feb 06 14:01:55 sccache: caused by: Connection refused (os error 111)
Feb 06 14:01:55 +++ true
Feb 06 14:01:55 +++ rm /var/lib/jenkins/sccache_error.log
Feb 06 14:01:55 +++ [[ pytorch-linux-xenial-py3.6-gcc5.4-build == *rocm* ]]
Feb 06 14:01:55 +++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Feb 06 14:01:55 +++ SCCACHE_IDLE_TIMEOUT=1200
Feb 06 14:01:55 +++ RUST_LOG=sccache::server=error
Feb 06 14:01:55 +++ sccache --start-server
Feb 06 14:01:55 sccache: Starting the server...
Feb 06 14:01:55 +++ sccache --zero-stats

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 (2/2)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Feb 06 16:47:02 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Feb 06 16:47:02 At:
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Feb 06 16:47:02 
Feb 06 16:47:02 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Feb 06 16:47:02 
Feb 06 16:47:02 At:
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Feb 06 16:47:02 
Feb 06 16:47:02 [E request_callback_no_python.cpp:653] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Feb 06 16:47:02 
Feb 06 16:47:02 At:
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Feb 06 16:47:02   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
Feb 06 16:47:02 
Feb 06 16:47:02 ok (1.530s)
Feb 06 16:47:04   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... ok (1.530s)
Feb 06 16:47:05   test_return_local_rrefs (__main__.TensorPipeRpcTestWithSpawn) ... ok (1.532s)
Feb 06 16:47:11   test_rpc_profiling_async_function (__main__.TensorPipeRpcTestWithSpawn) ... ok (5.437s)
Feb 06 16:47:17   test_rpc_profiling_async_function_single_threaded (__main__.TensorPipeRpcTestWithSpawn) ... ok (5.939s)


@XiaobingSuper
Collaborator Author

@VitalyFedyunin, the error is caused by running on an old CPU that only supports AVX2, but OneDNN needs the CPU to support at least AVX512 for the bfloat16 path. So I want to add a global context flag, similar to torch._C.has_mkldnn, to skip the bfloat16 test cases when the CPU does not support AVX512. Is that OK on your side?

@XiaobingSuper
Collaborator Author

@VitalyFedyunin, there is another option: fall back to the fp32 path inside ideep when OneDNN bf16 is not supported on an older device. This option does not require any PyTorch code changes apart from updating ideep. If this option is acceptable, I will update ideep first. Thanks!

@jgong5
Collaborator

jgong5 commented Dec 10, 2020

There is another option: fall back to the fp32 path inside ideep when OneDNN bf16 is not supported on an older device. This option does not require any PyTorch code changes apart from updating ideep. If this option is acceptable, I will update ideep first. Thanks!

@VitalyFedyunin This option sounds better to us since it does not require an extra CPU arch check on the PyTorch side. IDEEP would support BF16 regardless of CPU arch. We will do a quick upgrade of IDEEP if it looks good to you. :-)

VitalyFedyunin
VitalyFedyunin previously approved these changes Dec 14, 2020
@VitalyFedyunin
Contributor

The current error doesn't say anything about AVX support:

        x_mkldnn = x if x.is_mkldnn else x.to_mkldnn()
        y_mkldnn = torch._C._nn.mkldnn_linear(x_mkldnn, self.weight, self.bias)
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        y = y_mkldnn if x.is_mkldnn else y_mkldnn.to_dense()

Falling back to the fp32 path will give users a false impression. It is much better to error out with a clear message when something is not supported.

For tests you don't need to introduce a new global context. Something simple like @skipIf(test_arch(), "not supported architecture") will suffice.
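
A minimal sketch of that suggestion, with a hypothetical predicate standing in for the test_arch() placeholder above (the helper name and the /proc/cpuinfo parsing are assumptions, not the final implementation):

    import unittest

    def unsupported_arch():
        # Hypothetical predicate (Linux-only): True when the CPU lacks the AVX512
        # features OneDNN needs for the bfloat16 path (avx512bw/avx512vl/avx512dq).
        try:
            with open("/proc/cpuinfo") as cpuinfo:
                flags = cpuinfo.read()
        except OSError:
            return True
        return not all(feat in flags for feat in ("avx512bw", "avx512vl", "avx512dq"))

    class TestMkldnnBF16(unittest.TestCase):
        @unittest.skipIf(unsupported_arch(), "not supported architecture")
        def test_linear_bf16(self):
            pass  # the actual bf16 test body would go here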

@jgong5
Collaborator

jgong5 commented Dec 15, 2020

Falling back to the fp32 path will give users a false impression. It is much better to error out with a clear message when something is not supported.

@VitalyFedyunin Thanks for the suggestions. So, we will also add a CPU arch check inside use_mkldnn (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Convolution.cpp#L227) so that bf16 is not supported without AVX512. Does that make sense to you?

@VitalyFedyunin
Contributor

OK, we fixed the tests, but can we please make sure that we are going to have a clear error if someone tries it on an old CPU?

@XiaobingSuper
Collaborator Author

XiaobingSuper commented Jan 13, 2021

OK, we fixed the tests, but can we please make sure that we are going to have a clear error if someone tries it on an old CPU?

Done.

@XiaobingSuper
Collaborator Author

@VitalyFedyunin, please help review this. Thanks!

    try:
        # For the bf16 path, OneDNN requires the CPU to have Intel AVX512 with
        # avx512bw, avx512vl, and avx512dq.
        cmd = "grep avx512bw /proc/cpuinfo | grep avx512vl | grep avx512dq"
Collaborator

Did we give up on testing bf16 on Windows? If so, the tests should be marked to skip on Windows, instead of relying on obscure errors being thrown.

Collaborator Author

For Windows, the bf16 cases are skipped now; the reason is that I can't find an easy way to check the device like on Linux.
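
For reference, a hedged sketch of how the grep check from the excerpt above could be wired up, returning False on Windows so those cases are skipped there (only the grep pipeline comes from the PR; the subprocess wiring and the rest of the helper are an illustrative reconstruction):

    import subprocess
    import sys

    def has_bf16_support():
        # On Windows (and other non-Linux platforms) there is no equally easy
        # check, so the bf16 cases are simply skipped there.
        if not sys.platform.startswith("linux"):
            return False
        # For the bf16 path, OneDNN requires avx512bw, avx512vl, and avx512dq.
        cmd = "grep avx512bw /proc/cpuinfo | grep avx512vl | grep avx512dq"
        try:
            subprocess.check_output(cmd, shell=True)
            return True
        except subprocess.CalledProcessError:
            return False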

    if isinstance(m, torch.nn.Linear):
-       return MkldnnLinear(m)
+       return MkldnnLinear(m, d)
    elif isinstance(m, torch.nn.Conv1d):
Collaborator

Why is the dtype argument ignored on Conv1d, Conv3d, and BatchNorm? If they don't support it, we should error out; we cannot silently ignore it. Similarly, if a dtype argument that's generally not supported is passed (e.g. torch.double), we should error out.

Collaborator Author

The dtype argument is now added on Conv1d and Conv3d, and the code asserts if the given dtype is not supported.
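
A rough sketch of the kind of dtype validation described here, assuming a hypothetical helper; the helper name and the exact assertion message are not from the PR:

    import torch

    SUPPORTED_MKLDNN_DTYPES = (torch.float, torch.bfloat16)

    def _check_mkldnn_dtype(dtype):
        # Reject dtypes the MKLDNN wrappers do not handle (e.g. torch.double)
        # instead of silently ignoring the argument.
        assert dtype in SUPPORTED_MKLDNN_DTYPES, (
            "MKLDNN modules only support torch.float and torch.bfloat16, "
            "got {}".format(dtype))

    # Usage inside the conversion helper, e.g.:
    #   _check_mkldnn_dtype(d)
    #   return MkldnnConv1d(m, d)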

        self._test_serialization(mkldnn_conv2d, (x.to_mkldnn(),))
        self._test_tracing(mkldnn_conv2d, (x.to_mkldnn(),))

    @unittest.skipIf(not has_bf16_support(), "OneDNN bfloat16 path requires AVX512")
Collaborator

These tests should not be skipped if there's no bf16 support; instead they should catch and assert that the appropriate errors are raised.

Collaborator Author

Added test cases for the case where there's no bf16 support.
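
A sketch of what such a negative test might look like; the exception type and the way the failure is triggered here are assumptions, not the PR's final test. has_bf16_support() is the helper referenced in the excerpt above, and to_mkldnn(module, dtype) is the conversion helper shown in the diffs:

    import unittest
    import torch
    import torch.utils.mkldnn as mkldnn_utils
    # has_bf16_support() comes from the test file (see the excerpt above).

    class TestMkldnnBF16NoAVX512(unittest.TestCase):
        @unittest.skipIf(has_bf16_support(), "only meaningful on CPUs without AVX512 bf16 support")
        def test_bf16_errors_on_old_cpu(self):
            m = torch.nn.Linear(4, 4).eval()
            x = torch.randn(2, 4)
            # On CPUs without the required AVX512 features, the bf16 MKLDNN path
            # should fail loudly rather than silently fall back; the exception
            # type asserted here is an assumption.
            with self.assertRaises(RuntimeError):
                bf16_model = mkldnn_utils.to_mkldnn(m, torch.bfloat16)
                bf16_model(x.to_mkldnn())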

@XiaobingSuper XiaobingSuper requested a review from ngimel January 27, 2021 03:08
Collaborator

@ngimel ngimel left a comment

This PR would really benefit from a better description and documentation; e.g., the fact that bfloat16 in mkldnn is supported only on AVX512 processors really should be put in the description and in a Note somewhere in the code.
Similarly, I would like to understand why the bias is never converted to bfloat16. If it is by design, it should be commented upon and put in a note.
Similarly, for the BatchNorm parameters: if OneDNN wants them in fp32, that's fine, but there should be a comment about this.

-       return MkldnnConv3d(m)
+       return MkldnnConv3d(m, d)
    elif isinstance(m, torch.nn.BatchNorm2d) or isinstance(m, torch.nn.BatchNorm3d):
        return MkldnnBatchNorm(m)
Collaborator

why is dtype ignored for batchnorm? Does it expect its parameters to remain in fp32?

Collaborator Author

@XiaobingSuper XiaobingSuper Jan 28, 2021

Yes, batchnorm's parameters remain in fp32: the MKLDNN batchnorm bf16 path requires the input to be bf16 while the parameters stay fp32. For conv and linear, the weights require fp32 and the bias can be fp32 or bf16; for good accuracy, we chose fp32 for the bias.
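
A tiny eager-mode illustration of the dtype split described above (plain dense tensors, not the MKLDNN path itself): only the activation is converted to bf16, while the batchnorm parameters stay in fp32.

    import torch

    bn = torch.nn.BatchNorm2d(16)
    x = torch.randn(1, 16, 8, 8)

    # Only the activation would be converted to bf16 for the MKLDNN bf16 path;
    # the batchnorm parameters (and the conv/linear bias) remain in fp32.
    x_bf16 = x.to(torch.bfloat16)
    print(x_bf16.dtype)     # torch.bfloat16
    print(bn.weight.dtype)  # torch.float32, left untouched
    print(bn.bias.dtype)    # torch.float32, left untouched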

@XiaobingSuper
Collaborator Author

This PR would really benefit from a better description and documentation; e.g., the fact that bfloat16 in mkldnn is supported only on AVX512 processors really should be put in the description and in a Note somewhere in the code.
Similarly, I would like to understand why the bias is never converted to bfloat16. If it is by design, it should be commented upon and put in a note.
Similarly, for the BatchNorm parameters: if OneDNN wants them in fp32, that's fine, but there should be a comment about this.

Yes, I added some comments in test_mkldnn.py and torch/utils/mkldnn.py to describe them.

@XiaobingSuper XiaobingSuper requested a review from ngimel January 28, 2021 03:26
@XiaobingSuper
Collaborator Author

@ngimel, the failing test case does not seem to be related to this PR.

@facebook-github-bot
Contributor

@VitalyFedyunin merged this pull request in 324c6aa.

@facebook-github-bot facebook-github-bot deleted the gh/XiaobingSuper/3/head branch February 21, 2021 15:16
xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
Summary: Pull Request resolved: pytorch#48922

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D25537188

Pulled By: VitalyFedyunin

fbshipit-source-id: ab6eb1ba8cffb5ba9d00d05db8ef616628f8c932