Add aten mkldnn ops: relu, max_pool2d and avg_pool2d #19205
Conversation
Differential Revision: D14850598 Differential Version: 79209389
- func: max_pool2d_with_indices(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> (Tensor, Tensor)
  python_module: nn
  dispatch:
    CPU: max_pool2d_with_indices
isn't this just going to break derivatives for the same reason as before?
nope because the derivative of max_pool2d_with_indices has been defined in tools/autograd/derivatives.yaml
    ideep::algorithm::pooling_max);
return std::make_tuple(
    output,
    // TODO: ideep currently doesn't expose the max indices,
umm...would it work to just throw an exception if they ask for the indices?
There is no way to know in C++ whether indices are required. We could throw on the Python side (where return_indices is an argument), but that would require exposing is_mkldnn to the Python side (not necessarily a bad idea).
We could get the indices from mkldnn as well from a non-public api (@Jianhui-Li mentioned we can get it from the ideep::tensor get_extra)
Yes, we can use get_extra of the output ideep tensor. See this:
https://github.com/intel/ideep/blob/960ee3a20d975655d3fa8e40e445898a1e3b4078/include/ideep/computations.hpp#L3882
- The problem here is that the indices of mkldnn pooling differ from what PyTorch expects, in both dtype and content:
  a. mkldnn: indices is a local view w.r.t. the kernel window; its dtype can be either int32_t or int8_t.
  b. PyTorch: indices is a global view w.r.t. the input.
  See mkldnn_issue_259 for details. So directly returning the mkldnn indices is meaningless. The mkldnn indices are more like a cache used by backward during training; in that case, they should be handled by workspace in ATen.
- At the moment, probably the simplest way to address this is to remove the mkldnn path for mkldnn_max_pool2d_with_indices_out. That is, use mkldnn only when indices is not in the output; when indices is in the output, use THNN. This won't make any difference for models like ResNet or ResNeXt.
- In case we need to support unpooling (up-sampling), where indices is needed, we can convert the mkldnn indices to global indices by doing the math pointed to in the link above.
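To illustrate the conversion mentioned in the last point, here is a hypothetical pure-Python sketch (not the ideep API; the function name and the row-major local-index assumption are mine) of turning a within-kernel local index into PyTorch's global flat input index:

```python
def local_to_global_index(out_h, out_w, local_idx,
                          kernel_w, stride_h, stride_w,
                          pad_h, pad_w, in_w):
    """Map a within-kernel (local) max index to a flat global input index.

    Assumes the local index is row-major over the kernel window, and that
    the window for output position (out_h, out_w) starts at
    (out_h * stride_h - pad_h, out_w * stride_w - pad_w) in input coords.
    """
    h0 = out_h * stride_h - pad_h   # top-left row of the pooling window
    w0 = out_w * stride_w - pad_w   # top-left column of the pooling window
    dh, dw = divmod(local_idx, kernel_w)
    return (h0 + dh) * in_w + (w0 + dw)

# 4x4 input, 2x2 kernel, stride 2, no padding: output position (0, 1)
# with local index 3 (bottom-right of the window) maps to input (1, 3),
# i.e. flat index 1 * 4 + 3 = 7.
print(local_to_global_index(0, 1, 3, kernel_w=2, stride_h=2, stride_w=2,
                            pad_h=0, pad_w=0, in_w=4))  # -> 7
```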
@bddppq Due to the differences between the PyTorch indices tensor (using global indices) and the MKL-DNN indices tensor (using local indices), we are considering hiding the MKL-DNN indices inside ideep as an "opaque" format. Users can get PyTorch indices with to_dense(). This may take a while, and before that happens, how about we only implement max_pool2d without indices in the MKL-DNN path, as @mingfeima proposed?
aten max_pool2d right now doesn’t have dispatch (and is not even exposed to python), so in order to do that we will need to add derivatives for it to let codegen do the unwrap thing for it.
If you just want unwrap - there's a `requires_tensor: True` annotation you can put in native_functions.yaml.
As for exposing to python, I think if you mark it as `python_module: nn` or something like this, it'd be exposed only as torch._C._nn - I think it's ok to have it there but not call it from functional wrappers.
It's explicitly not exposed to python: https://github.com/pytorch/pytorch/blob/1c5073f/tools/autograd/gen_python_functions.py#L33
I don't dare to open this can of worms (actually I did at the very beginning and then ran into a test failure complaining derivative for max_pool2d is not implemented (or similar)).
Since
- indices are only used in training
- mkldnn/ideep will add support to expose indices in their api
Let's just return some garbage values and only support inference for now.
Hm, I think it should have worked if we were to check requires_grad. But I don't insist.
Can you please expand this comment with more details on why garbage is fine? (pretty much what we have discussed above)
Differential Revision: D14850598 Differential Version: 79296619
Differential Revision: D14850598 Differential Version: 79296710
Differential Revision: D14850598 Differential Version: 79299275
    MkldnnCPU: mkldnn_max_pool2d_with_indices_out

# Return: (Tensor output, Tensor indices)
- func: max_pool2d_with_indices(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> (Tensor, Tensor)
BTW: There is also max_pool2d, which doesn't need to return indices - useful for inference workloads. We can implement it more efficiently with MKL-DNN by setting the prop_kind to forward_inference.
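A sketch of what that indices-free path looks like from the user side (assuming a PyTorch build with MKL-DNN enabled; this mirrors the to_mkldnn()/to_dense() pattern used elsewhere in this thread):

```python
import torch

x = torch.randn(1, 3, 8, 8)                          # NCHW, float32
ref = torch.max_pool2d(x, kernel_size=2, stride=2)   # dense reference path

# The MKL-DNN path returns only the output, no indices, so it can use an
# inference-oriented primitive internally. Guard on backend availability.
if torch.backends.mkldnn.is_available():
    y = torch.max_pool2d(x.to_mkldnn(), kernel_size=2, stride=2).to_dense()
    assert torch.allclose(y, ref)
```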
    ceil_mode
);
}
MKLDNN treats padding differently with ceil mode. E.g. the following case results in a runtime error, for both max pooling and avg pooling:

import torch
x = torch.randn(1, 1, 10, 10, dtype=torch.float32) * 10
max_pool2d = torch.nn.MaxPool2d(kernel_size=(4, 5), stride=(3, 4), padding=(1, 2), ceil_mode=True)
# max_pool2d = torch.nn.AvgPool2d(kernel_size=(4, 5), stride=(3, 4), padding=(1, 2), ceil_mode=True)
output = max_pool2d(x.to_mkldnn()).to_dense()

Please refer to the Caffe code for computing the pooling output size for details. @XiaobingSuper
has this comment been addressed? (sorry I didn't follow the point)
This problem can be solved by changing the mkldnn padding as Caffe does: https://github.com/intel/caffe/blob/362a3b375e8f5bc64f5338d18f1742448cfb8fb7/src/caffe/layers/mkldnn_pooling_layer.cpp#L148
Please note that mkldnn achieves the given output size by adjusting the padding size.
Added an assert to support only non-ceil_mode for now. Let's do the padding trick in a follow-up PR.
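A pure-Python sketch of the size math behind that padding trick (the ceil-mode formula follows PyTorch's documented pooling output size; the clipping rule and the extra right/bottom padding are my reading of the Caffe code linked above, not code from this PR):

```python
import math

def pool_out_size(in_size, kernel, stride, pad, ceil_mode):
    """Pooling output size along one dimension."""
    if ceil_mode:
        out = math.ceil((in_size + 2 * pad - kernel) / stride) + 1
        # The last pooling window must start inside the left-padded input,
        # otherwise drop it (same clipping PyTorch applies in ceil mode).
        if (out - 1) * stride >= in_size + pad:
            out -= 1
        return out
    return (in_size + 2 * pad - kernel) // stride + 1

def extra_right_pad(in_size, kernel, stride, pad):
    """Extra right/bottom padding so that a floor-mode computation
    reproduces the ceil-mode output size (the Caffe-style trick)."""
    out = pool_out_size(in_size, kernel, stride, pad, ceil_mode=True)
    return max(0, (out - 1) * stride + kernel - in_size - pad)

# The failing case from this thread, height dimension: input 10, kernel 4,
# stride 3, padding 1. Ceil mode expects 4 outputs; floor mode gives 3,
# so mkldnn needs 2 extra rows of bottom padding to match.
print(pool_out_size(10, 4, 3, 1, ceil_mode=True))   # -> 4
print(pool_out_size(10, 4, 3, 1, ceil_mode=False))  # -> 3
print(extra_right_pad(10, 4, 3, 1))                 # -> 2
```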
Differential Revision: D14850598 Differential Version: 79689942
Differential Revision: D14850598 Differential Version: 79698402
Differential Revision: D14850598 Differential Version: 79729546
dzhulgakov
left a comment
Looks good as long as the sizes calculation comment has been addressed
Differential Revision: D14850598 Differential Version: 80386616
Differential Revision: D14850598 Differential Version: 80533693
Differential Revision: D14850598 Differential Version: 80534912
Differential Revision: D14850598 Differential Version: 80541664
Differential Revision: D14850598 Differential Version: 80560023
Differential Revision: D14850598 Differential Version: 80580144
Differential Revision: D14850598 Differential Version: 80683622
Tensor new_with_itensor_mkldnn(ideep::tensor&& it, const TensorOptions& options) {
  // NOTE: int32_t dims from ideep::tensor but sizes needs int64_t
  // TODO: support int64_t dims in ideep::tensor to avoid extra conversion
  AT_ASSERT(!it.has_extra());
why did we originally have it here?
    IntArrayRef padding,
    bool ceil_mode,
    bool count_include_pad) {
  output = _mkldnn_pool2d(
hm, I think it's wrong. _out variants are supposed to assign things in-place in the same Tensor instance, no? @gchanan to double-check
and I'd add a test for it too
I will add a test case, but IIRC the _out version doesn't require same instance (the _ version does).
You are right, _out requires the same identity. We will need to change the opaque tensor impl to allow setting sizes to support this. In this PR let's not support in-place avg_pool for now.
Usually _out versions just fail if the size is wrong. But I agree that we can just skip _out version here
Oh so normally the output tensor got passed in is supposed to already have the right sizes? (and other meta info like dtype)
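For reference, a minimal demonstration of the `_out` contract discussed here - the passed-in tensor is written in place and returned as the very same object (using `torch.add` only as a stand-in example):

```python
import torch

a = torch.ones(2, 2)
b = torch.ones(2, 2)
c = torch.empty(2, 2)

result = torch.add(a, b, out=c)

# The out= tensor keeps its identity: the op writes into it in place and
# returns the same object rather than allocating a new tensor.
assert result is c
assert torch.equal(c, torch.full((2, 2), 2.0))
```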
- func: max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor

- func: mkldnn_max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor
  requires_tensor: True
requires_tensor is probably not required here
yep, just for documentation purposes
Differential Revision: D14850598 Differential Version: 80773731
Differential Revision: D14850598 Differential Version: 80780046
Differential Revision: D14850598 Differential Version: 80785119
Differential Revision: D14850598 Differential Version: 80799727
Summary: Pull Request resolved: pytorch/pytorch#19205 Reviewed By: dzhulgakov Differential Revision: D14850598 fbshipit-source-id: 5bbd5909c06df9c980de680ffb81bf772766c0ba
This pull request has been merged in 4864000.
Summary: Modify MKLDNN pooling operation to support ceil mode by adjusting the right/bottom padding accordingly. This is done similarly as in Caffe (see discussion #19205 (comment)). To make this possible, I split the padding to left and right (top / bottom). This naming is confusing but actually follows mkldnn's own naming for pooling::compute(). We increase the r paddings so that it matches the ceiling mode expected output size. Strengthened the test case. Pull Request resolved: #21310 Reviewed By: bddppq Differential Revision: D15611664 Pulled By: akyrola fbshipit-source-id: 46b40015dafef69a8fd5e7b2c261d8dbf448cd20
Stack:
:white_circle: #19633 Add is_mkldnn to at::Tensor 💚
:white_circle: #19204 Add aten mkldnn conv2d operator 💚
:black_circle: #19205 Add aten mkldnn ops: relu, max_pool2d and avg_pool2d 💚
:white_circle: #19206 Add aten mkldnn batch_norm operator 💚
:white_circle: #19207 Add aten mkldnn add operator 💚
:white_circle: #19209 Add aten mkldnn view operator 💚
:white_circle: #19210 Add aten mkldnn linear operator 💚
:white_circle: #19648 Adjust resnext run script 💚
Pull Request resolved: #19205
Differential Revision: D14850598