
Conversation

@gujinghui
Collaborator

@gujinghui gujinghui commented May 16, 2019

  1. reduce the overhead of the mkldnn-bridge itself
  2. remove redundant code and unused APIs
  3. provide new operators, including int8 inner_product, nD permute/transpose, elem_add/mul, etc.
  4. improve inner_product to support io-format weights without an implicit reorder
  5. add SoftMax support
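For context on item 3, these operators map onto familiar tensor primitives. A rough NumPy analog (illustrative only, not the mkldnn-bridge API; the int8 quantization step of int8 inner_product is omitted):

```python
import numpy as np

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

# nD permute/transpose: reorder the axes of a rank-3 tensor
y = np.transpose(x, (2, 0, 1))          # shape (4, 2, 3)

# elem_add / elem_mul: elementwise binary ops
z = x + x
w = x * x

# inner_product: flatten trailing dims, then a matrix product with the weights
weights = np.ones((12, 5), dtype=np.float32)
out = x.reshape(2, -1) @ weights        # shape (2, 5)
```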

@gujinghui
Collaborator Author

@Jianhui-Li
@jgong5
@uyongw

@gujinghui
Collaborator Author

The failure is not related to this change.

May 17 10:59:12 ======================================================================
May 17 10:59:12 FAIL: test_HalfTensor_tril_zero_stride (main.TestCuda)
May 17 10:59:12 ----------------------------------------------------------------------
May 17 10:59:12 Traceback (most recent call last):
May 17 10:59:12 File "/var/lib/jenkins/workspace/test/common_utils.py", line 338, in wrapper
May 17 10:59:12 method(*args, **kwargs)
May 17 10:59:12 File "test_cuda.py", line 648, in tmp
May 17 10:59:12 self.assertEqual(cpu_tensor, gpu_tensor, precision)
May 17 10:59:12 File "/var/lib/jenkins/workspace/test/common_utils.py", line 483, in assertEqual
May 17 10:59:12 assertTensorsEqual(x, y)
May 17 10:59:12 File "/var/lib/jenkins/workspace/test/common_utils.py", line 475, in assertTensorsEqual
May 17 10:59:12 self.assertLessEqual(max_err, prec, message)
May 17 10:59:12 AssertionError: tensor(nan, dtype=torch.float32) not less than or equal to 1e-05 :
May 17 10:59:12

@gujinghui
Collaborator Author

@yinghai
Could you help review this PR?

Thanks a lot.

@yinghai
Contributor

yinghai commented May 21, 2019

We need to update internal ideep before accepting.

@gujinghui gujinghui changed the title upgrade mkldnn-bridge upgrade MKLDNN to v0.19 and mkldnn-bridge May 27, 2019
@gujinghui
Collaborator Author

@yinghai
Could you help review this PR?
MKLDNN is upgraded to v0.19.

Thanks a lot.

@gujinghui gujinghui force-pushed the upgrade_bridge branch 2 times, most recently from 7fec349 to be814f0 Compare May 28, 2019 06:29
@gujinghui
Collaborator Author

Rebased to the latest code base.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@bddppq
Contributor

bddppq commented Jun 3, 2019

We have run into this error when compiling mkl-dnn v0.19 from source:

cpu/ref_softmax.cpp:171:25: error: 'VML_ERRMODE_NOERR' was not declared in this scope
         vmsExp(n, a, r, VML_ERRMODE_NOERR);
                         ^~~~~~~~~~~~~~~~~

@gujinghui
Collaborator Author

@bddppq It looks like the MKL version differs. Could you check which MKL version is installed on your machine?

@bddppq
Contributor

bddppq commented Jun 4, 2019

@gujinghui it's 2017.0.098

@pytorchbot pytorchbot added the module: build Build system issues label Jun 4, 2019
@gujinghui
Collaborator Author

@bddppq @VitalyFedyunin

This should be fixed now. Please try again. Thanks.

@bddppq
Contributor

bddppq commented Jun 4, 2019

@gujinghui Where did you put the fix? The release tag v0.19 in mkl-dnn repo is still pointing to the same commit.

@yinghai
Contributor

yinghai commented Jun 4, 2019

#define INTEL_MKL_VERSION 20170003

Collaborator Author

@bddppq
I disabled this flag for MKLDNN, since it caused the build failure.
The flag no longer has any impact on PyTorch.

Contributor

Does disabling MKL have perf implications on MKLDNN?

Collaborator Author

@gujinghui gujinghui Jun 5, 2019

According to the table below from MKLDNN, there should be no perf change.
In any case, we use the jit implementation.

  /* USE_MKL      USE_CBLAS       effect
   * -------      ---------       ------
   * yes          yes             use Intel(R) MKL CBLAS
   * yes          no              use jit
   * no           yes             system-dependent CBLAS
   * no           no              use jit
   */
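The USE_MKL knob above corresponds to a build-time option when compiling mkl-dnn from source. A configuration sketch (the option name appears later in this thread; the rest of the invocation is an assumption):

```shell
# Configure mkl-dnn without linking MKL, so GEMM falls back to the jit path
# ("no / no" row in the table above).
cmake -DMKLDNN_USE_MKL=OFF ..
make -j"$(nproc)"
```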

@gujinghui
Collaborator Author

@bddppq @yinghai
Softmax is moved into mkldnn-bridge. Please help review.

@gujinghui
Collaborator Author

@pytorchbot rebase this please

Contributor

@facebook-github-bot facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jeffreyksmithjr jeffreyksmithjr added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jun 10, 2019
@bddppq
Contributor

bddppq commented Jun 11, 2019

@gujinghui We have built mkldnn v0.19 from source (with -DMKLDNN_USE_MKL=OFF) and have seen a noticeable perf drop when running resnext101 inference. I think a better "fix" for the issue would be to add MKL version detection in mkldnn and fall back to the old API call in environments with an older version of MKL.

1. reduce the overhead of mkldnn-bridge itself
2. remove redundant code and useless APIs
3. provide new operators, including int8 inner_product,
   nD permute/transpose, elem_add/mul, etc.
4. improve inner_product to support io format weights
   without implicit reorder
5. add softmax support

Signed-off-by: Gu, Jinghui <[email protected]>
@gujinghui gujinghui changed the title upgrade MKLDNN to v0.19 and mkldnn-bridge upgrade mkldnn-bridge Jun 11, 2019
@gujinghui
Collaborator Author

@bddppq
The MKLDNN v0.19 upgrade is reverted, and the USE_MKL flag is set as before.
Only the ideep update remains in this PR now. Please help review.

As for the compatibility issue between v0.19 and older MKL, we're working on it
and will get back to you with any updates.

Thanks a lot.

@bddppq
Contributor

bddppq commented Jun 11, 2019

@gujinghui Thanks!
Though currently in this PR the mkl-dnn submodule (within the ideep submodule) is still updated.

@gujinghui
Collaborator Author

@gujinghui Thanks!
Though currently in this PR the mkl-dnn submodule (within the ideep submodule) is still updated.

Yes, a trivial change to ideep is included in this PR.
But the commit ID of mkl-dnn should be identical to before.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jun 11, 2019
Summary:
1. reduce the overhead of mkldnn-bridge itself
2. remove redundant code and useless APIs
3. provide new operators, including int8 inner_product, nD permute/transpose, elem_add/mul, etc.
4. improve inner_product to support io format weights without implicit reorder
5. add SoftMax support
Pull Request resolved: pytorch/pytorch#20569

Reviewed By: houseroad

Differential Revision: D15558663

Pulled By: bddppq

fbshipit-source-id: 79a63aa139037924e9ffb1069f7e7f1d334efe3a
@facebook-github-bot
Contributor

@bddppq merged this pull request in 731670f.


Labels

caffe2, Merged, module: build (Build system issues), module: mkldnn (Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration), module: third_party, open source, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

8 participants