Conversation

Contributor

@bddppq bddppq commented Apr 12, 2019

Stack:
    :white_circle:  #19633 Add is_mkldnn to at::Tensor  💚
    :white_circle:  #19204 Add aten mkldnn conv2d operator  💚
    :black_circle:  #19205 Add aten mkldnn ops: relu, max_pool2d and avg_pool2d  💚
    :white_circle:  #19206 Add aten mkldnn batch_norm operator  💚
    :white_circle:  #19207 Add aten mkldnn add operator  💚
    :white_circle:  #19209 Add aten mkldnn view operator  💚
    :white_circle:  #19210 Add aten mkldnn linear operator  💚
    :white_circle:  #19648 Adjust resnext run script  💚

Pull Request resolved: #19205

Differential Revision: D14850598

Differential Revision: D14850598
Differential Version: 79209389
- func: max_pool2d_with_indices(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> (Tensor, Tensor)
python_module: nn
dispatch:
CPU: max_pool2d_with_indices
Contributor

isn't this just going to break derivatives for the same reason as before?

Contributor Author

Nope, because the derivative of max_pool2d_with_indices has been defined in tools/autograd/derivatives.yaml.
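To see why the defined derivative works, here is a hypothetical pure-Python sketch (not the ATen implementation) of how a max-pooling backward uses the saved global indices: each output gradient is scattered back to the input position that produced the max.

```python
def max_pool_backward(grad_output, indices, input_numel):
    """Scatter each output gradient to the argmax input position.

    grad_output and indices are flat lists of equal length; indices are
    global (PyTorch-style) flat offsets into the input plane.
    """
    grad_input = [0.0] * input_numel
    for g, idx in zip(grad_output, indices):
        grad_input[idx] += g  # accumulate in case windows overlap on one max
    return grad_input
```

This is also why a meaningful indices tensor matters for training but not for inference.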

ideep::algorithm::pooling_max);
return std::make_tuple(
output,
// TODO: ideep currently doesn't expose the max indices,
Contributor

umm...would it work to just throw an exception if they ask for the indices?

Contributor Author

There is no way to know whether the indices are required or not in C++. We could throw on the Python side (where return_indices is an argument), but that would require exposing is_mkldnn to the Python side (not necessarily a bad idea).
We could also get the indices from mkldnn via a non-public API (@Jianhui-Li mentioned we can get them from ideep::tensor's get_extra).


Collaborator

@mingfeima mingfeima Apr 15, 2019

  1. The problem here is that the indices of mkldnn pooling are different from what PyTorch expects, in both dtype and content.
    a. mkldnn - indices are a local view w.r.t. the kernel; the dtype can be either int32_t or int8_t.
    b. PyTorch - indices are a global view w.r.t. the input.
    See mkldnn_issue_259 for details.
    So directly returning the mkldnn indices is meaningless. The mkldnn indices are more like a cache used by backward during training. In that case, they should be handled by a workspace in ATen.
  2. At the moment, probably the simplest way to address this is to remove the mkldnn path for mkldnn_max_pool2d_with_indices_out. That is: use mkldnn only when the indices are not in the output; when the indices are in the output, use THNN.
    This won't make any difference for models like ResNet or ResNext.
  3. In case we need to support unpooling (up-sampling), where the indices are needed, we can convert the mkldnn indices to global ones by doing the math described in the link above.
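Point 3 above can be sketched in a few lines of pure Python (function name and signature are illustrative, not from ideep or ATen): converting a kernel-local mkldnn index to a global PyTorch index only needs the output coordinates, stride, and padding.

```python
def local_to_global_index(oh, ow, local_idx, kernel_w,
                          stride_h, stride_w, pad_h, pad_w, input_w):
    """Map a kernel-relative (mkldnn-style) pooling index to a flat
    global (PyTorch-style) index into the input plane."""
    kh, kw = divmod(local_idx, kernel_w)  # offset inside the kernel window
    ih = oh * stride_h - pad_h + kh       # window start row + offset
    iw = ow * stride_w - pad_w + kw       # window start col + offset
    return ih * input_w + iw
```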

Collaborator

@bddppq Due to the differences between the PyTorch indices tensor (using global indices) and the MKL-DNN indices tensor (using local indices), we are considering hiding the MKL-DNN indices inside ideep as an "opaque" format. Users can get PyTorch indices with to_dense(). This may take a while, and before that happens, how about we only implement max_pool2d without indices in the MKL-DNN path, as @mingfeima proposed?

Contributor Author

aten max_pool2d right now doesn't have dispatch (and is not even exposed to Python), so in order to do that we will need to add a derivative for it to let codegen do the unwrapping for it.

Collaborator

If you just want the unwrap, there's a 'requires_tensor: True' annotation you can put in native_functions.yaml.

As for exposing to Python, I think if you mark it as 'python_module: nn' or something like this, it'd be exposed only as torch._C._nn - I think it's ok to have it but not call it from the functional wrappers.

Contributor Author

It's explicitly not exposed to python: https://github.com/pytorch/pytorch/blob/1c5073f/tools/autograd/gen_python_functions.py#L33
I don't dare to open this can of worms (actually I did at the very beginning and then ran into a test failure complaining that the derivative for max_pool2d is not implemented, or similar).

Since

  1. indices are only used in training
  2. mkldnn/ideep will add support to expose indices in their api

Let's just return some garbage values and only support inference for now.

Collaborator

Hm, I think it should have worked if we were to check requires_grad. But I don't insist.

Can you please expand this comment with more details on why garbage is fine? (pretty much what we have discussed above)

Contributor Author

Actually @wanchaol is working on exposing max_pool2d to Python in #19449. I will give implementing mkldnn for max_pool2d a try once that is merged.

@bddppq bddppq added the module: mkldnn Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration label Apr 12, 2019
bddppq added 3 commits April 12, 2019 21:12
Differential Revision: D14850598
Differential Version: 79296619
Differential Revision: D14850598
Differential Version: 79296710
Differential Revision: D14850598
Differential Version: 79299275
ideep::algorithm::pooling_max);
return std::make_tuple(
output,
// TODO: ideep currently doesn't expose the max indices,
Collaborator

MkldnnCPU: mkldnn_max_pool2d_with_indices_out

# Return: (Tensor output, Tensor indices)
- func: max_pool2d_with_indices(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> (Tensor, Tensor)
Collaborator

BTW: There is also max_pool2d, which doesn't need to return indices and is useful for inference workloads. We can implement it more efficiently with MKL-DNN by setting the prop_kind to forward_inference.
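For reference, the indices-free forward path amounts to the following pure-Python sketch (illustrative only; the actual MKL-DNN implementation would run a pooling primitive with prop_kind set to forward_inference rather than loop in Python):

```python
def max_pool2d_no_indices(x, kernel, stride):
    """Forward-only max pooling over a 2-D list of lists, no indices."""
    kh, kw = kernel
    sh, sw = stride
    out_h = (len(x) - kh) // sh + 1
    out_w = (len(x[0]) - kw) // sw + 1
    return [[max(x[i * sh + a][j * sw + b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]
```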

ceil_mode
);
}

Collaborator

MKLDNN treats padding differently with ceil mode.
e.g. the following case would result in a runtime error, for both max pooling and avg pooling:

import torch
x = torch.randn(1, 1, 10, 10, dtype=torch.float32) * 10
max_pool2d = torch.nn.MaxPool2d(kernel_size=(4, 5), stride=(3, 4), padding=(1, 2), ceil_mode=True)
# max_pool2d = torch.nn.AvgPool2d(kernel_size=(4, 5), stride=(3, 4), padding=(1, 2), ceil_mode=True)
output = max_pool2d(x.to_mkldnn()).to_dense()

Please refer to the Caffe code for computing the pooling output size for details. @XiaobingSuper
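The Caffe-style size computation mentioned above can be sketched in pure Python (an assumed sketch, not the actual Caffe or PyTorch source): in ceil mode the last pooling window is dropped if it would start entirely inside the right/bottom padding, which is the case MKLDNN handles differently.

```python
import math

def pool_output_size(size, kernel, stride, pad, ceil_mode):
    """Pooling output size along one dimension, Caffe/PyTorch style."""
    if ceil_mode:
        out = math.ceil((size + 2 * pad - kernel) / stride) + 1
        # Drop the last window if it would start in the right padding.
        if (out - 1) * stride >= size + pad:
            out -= 1
    else:
        out = (size + 2 * pad - kernel) // stride + 1
    return out
```

For the failing width dimension above (size 10, kernel 5, stride 4, padding 2), ceil mode first gives 4 and the correction clamps it to 3.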

Collaborator

has this comment been addressed? (sorry I didn't follow the point)


Collaborator

Please note that mkldnn gets to the given output size by changing the padding size.

Contributor Author

Added an assert to only support non-ceil_mode for now. Let's do the padding trick in a follow-up PR.

bddppq added 3 commits April 16, 2019 15:11
Differential Revision: D14850598
Differential Version: 79689942
Differential Revision: D14850598
Differential Version: 79698402
Differential Revision: D14850598
Differential Version: 79729546
Collaborator

@dzhulgakov dzhulgakov left a comment

Looks good as long as the sizes calculation comment has been addressed


bddppq added 2 commits April 22, 2019 13:23
Differential Revision: D14850598
Differential Version: 80386616
Differential Revision: D14850598
Differential Version: 80533693
bddppq added 3 commits April 23, 2019 14:22
Differential Revision: D14850598
Differential Version: 80534912
Differential Revision: D14850598
Differential Version: 80541664
Differential Revision: D14850598
Differential Version: 80560023
bddppq added 2 commits April 23, 2019 21:42
Differential Revision: D14850598
Differential Version: 80580144
Differential Revision: D14850598
Differential Version: 80683622
Tensor new_with_itensor_mkldnn(ideep::tensor&& it, const TensorOptions& options) {
// NOTE: int32_t dims from ideep::tensor but sizes needs int64_t
// TODO: support int64_t dims in ideep::tensor to avoid extra conversion
AT_ASSERT(!it.has_extra());
Collaborator

why did we originally have it here?

IntArrayRef padding,
bool ceil_mode,
bool count_include_pad) {
output = _mkldnn_pool2d(
Collaborator

hm, I think it's wrong. _out variants are supposed to assign things in-place in the same Tensor instance, no? @gchanan to double-check

and I'd add a test for it too

Contributor Author

@bddppq bddppq Apr 25, 2019

I will add a test case, but IIRC the _out version doesn't require the same instance (the _ version does).

Contributor Author

You are right, _out requires the same identity. We would need to change the opaque tensor impl to allow setting sizes to support this. Let's not support in-place avg_pool in this PR for now.

Collaborator

Usually _out versions just fail if the size is wrong. But I agree that we can just skip the _out version here.

Contributor Author

Oh, so normally the output tensor that gets passed in is supposed to already have the right sizes (and other meta info like dtype)?
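The _out contract under discussion can be illustrated with a toy pure-Python example (not the ATen code): the function writes into the caller's buffer, preserves its identity, and fails if the metadata doesn't match.

```python
def add_out(a, b, out):
    """Toy _out variant: fill `out` in place and return the same object."""
    if len(out) != len(a):
        raise ValueError("out has the wrong size")  # typical _out behavior
    for i, (x, y) in enumerate(zip(a, b)):
        out[i] = x + y
    return out

buf = [0, 0, 0]
result = add_out([1, 2, 3], [4, 5, 6], buf)
assert result is buf  # identity is preserved, not just the values
```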

- func: max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor

- func: mkldnn_max_pool2d(Tensor self, int[2] kernel_size, int[2] stride=[], int[2] padding=0, int[2] dilation=1, bool ceil_mode=False) -> Tensor
requires_tensor: True
Collaborator

requires_tensor is probably not required here

Contributor Author

yep, just for documentation purposes

bddppq added 2 commits April 25, 2019 18:20
Differential Revision: D14850598
Differential Version: 80773731
Differential Revision: D14850598
Differential Version: 80780046
bddppq added 2 commits April 26, 2019 01:03
Differential Revision: D14850598
Differential Version: 80785119
Differential Revision: D14850598
Differential Version: 80799727
zdevito pushed a commit to zdevito/ATen that referenced this pull request Apr 26, 2019
Summary: Pull Request resolved: pytorch/pytorch#19205

Reviewed By: dzhulgakov

Differential Revision: D14850598

fbshipit-source-id: 5bbd5909c06df9c980de680ffb81bf772766c0ba
@facebook-github-bot
Contributor

This pull request has been merged in 4864000.

zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
@ezyang ezyang deleted the export-D14850598 branch May 30, 2019 15:55
facebook-github-bot pushed a commit that referenced this pull request Jun 6, 2019
Summary:
Modify MKLDNN pooling operation to support ceil mode by adjusting the right/bottom padding accordingly. This is done similarly to Caffe (see discussion #19205 (comment)).

To make this possible, I split the padding into left and right (top / bottom). This naming is confusing but actually follows mkldnn's own naming for pooling::compute(). We increase the right paddings so that the output matches the ceil-mode expected output size.

Strengthened the test case.
Pull Request resolved: #21310

Reviewed By: bddppq

Differential Revision: D15611664

Pulled By: akyrola

fbshipit-source-id: 46b40015dafef69a8fd5e7b2c261d8dbf448cd20
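The padding adjustment described in this follow-up can be sketched in pure Python (illustrative names, not the actual PR code): grow the right/bottom padding just enough that a floor-mode size computation reproduces the ceil-mode output size.

```python
def adjusted_right_pad(size, kernel, stride, pad_l, pad_r, target_out):
    """Return the right/bottom padding that makes a floor-mode output
    size computation produce `target_out` (the ceil-mode size)."""
    needed = (target_out - 1) * stride + kernel - (size + pad_l + pad_r)
    return pad_r + max(0, needed)
```

E.g. for size 10, kernel 4, stride 3, padding 1 on each side, ceil mode wants output 4; bumping the right padding from 1 to 2 lets the floor formula produce 4 as well.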
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jun 6, 2019