
Conversation

@Chillee (Collaborator) commented Jul 3, 2019

#3867

The implementation is taken from here (with some comments annotating the logic): https://github.com/mlperf/inference/blob/master/others/edge/object_detection/ssd_mobilenet/pytorch/utils.py#L40

In addition, I've stress-tested it for all combinations of input image size, kernel size, stride, and dilation from 1 to 6 (6^8 combinations). For all of them, the output dimensions follow the rules listed here for "same": https://www.tensorflow.org/api_docs/python/tf/nn/convolution

output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
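
(For reference, a minimal sketch of the per-dimension padding computation implied by that rule; the function and variable names here are mine, not the ones used in the PR.)

import math

def same_padding_1d(in_size, kernel_size, stride=1, dilation=1):
    # Output size required by the "same" rule: ceil(in_size / stride).
    out_size = math.ceil(in_size / stride)
    # Span of the dilated kernel.
    effective_kernel = (kernel_size - 1) * dilation + 1
    # Total padding needed so that out_size windows fit in the input.
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    # Split it, putting the extra element on the high side (TF convention).
    return total // 2, total - total // 2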

I still need to add tests and check JIT compatibility. Then I can tackle the other ones.

@Chillee requested review from fmassa and soumith, July 3, 2019 07:20
@pytorchbot added the module: nn (Related to torch.nn) label, Jul 3, 2019
@Chillee requested a review from jamesr66a, July 3, 2019 07:29
@Chillee force-pushed the paddingsame branch 2 times, most recently from 6764c31 to 5d695a4, July 3, 2019 07:33
@fmassa (Member) left a comment

In my view, we should have 1:1 parity between the functional and the module interfaces.

This means that we should probably push the "same" handling down to the functional interface, but I'll let the others chime in as well.

@Chillee (Collaborator, Author) commented Jul 6, 2019

@fmassa I initially did it this way because it seemed that Soumith implied it should be separate from the functional interface: #3867 (comment)

I continued doing it this way because there already seemed to be differences between the functional and module interfaces. For example, the module interface supports a padding_mode argument that can be circular or zeros. I assumed things were done that way because circular padding also requires a separate F.pad before the conv is called.
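
(Roughly what I mean, as a minimal sketch with made-up shapes:)

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
w = torch.randn(16, 3, 3, 3)

# Wrap the input around the spatial dims first, then run the conv unpadded.
x_wrapped = F.pad(x, (1, 1, 1, 1), mode='circular')
y = F.conv2d(x_wrapped, w, padding=0)  # output is 1 x 16 x 8 x 8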

Putting it in the functional interface also requires delving substantially deeper into the darker corners of PyTorch, and it wasn't clear to me where the appropriate location was, nor whether it would break something to have one op actually become two ops.

@apaszke (Contributor) commented Jul 8, 2019

Uhhh are we sure we want this? I remember @Yangqing strongly arguing against adding this option as it's often quite surprising.

@Chillee (Collaborator, Author) commented Jul 8, 2019

@apaszke Do you remember any of the reasons he gave for strongly warning against this?

@soumith mentioned that dynamically calculating padding is bad for a "variety of efficiency/serialization reasons", but these concerns don't seem relevant for research users.

On the other hand, for research users this is an unnecessary pain point. I've had multiple people complain that this isn't in Pytorch. I'm not convinced that the performance/serialization issues are worth the cost here.

@soumith (Contributor) commented Jul 8, 2019

yes, we want this. YQ's concerns were around exported models not being static, and the ONNX export pass needs to error out if the padding info cannot be statically determined (@Chillee you might have to add a test for the ONNX export).

At this point, the user need and benefits outweigh not wanting it, and our JIT / production export now can comfortably handle dynamic models (it wasn't true when we last thought about this)

@fmassa (Member) commented Jul 8, 2019

@Chillee I wasn't aware that circular padding was added only in the nn.Module interface.

I'm less sure we want to diverge the functionality of the module and functional interfaces, because some things can only be done with the functional interface (for example, when we want the weights of a model to be the output of a network).

But I'll let @soumith and @apaszke chime in on their views

@soumith (Contributor) commented Jul 8, 2019

My main thinking around whether to match everything in the functional interface was about how much state we can carry to reuse previous calculations. We can carry state across forward calls in a Module, but not in the functional interface.

I didn't explore the line of thought around how much state (such as pre-calculated padding values) can be carried across iterations and how much benefit it brings.

@Chillee (Collaborator, Author) commented Jul 8, 2019

@fmassa One issue with putting it in the functional interface is that the semantics around auto-calculating "static padding" become fairly muddled. Currently, we can auto-calculate static padding and have it for free if certain rules are followed (stride=1, (filter-1)*dilation is divisible by 2).

It's not clear how we would do this in the functional interface.

I think the right place to add this in the functional interface would be in the F.pad function or something along those lines.

@Chillee changed the title from "[WIP] Added "same" padding for conv*d" to "Added "same" padding for conv*d", Jul 9, 2019
@Chillee (Collaborator, Author) commented Jul 9, 2019

Added tests and fixed JIT compatibility. Removed the WIP tag - I think this is ready for review.

@Chillee requested a review from fmassa, July 9, 2019 05:49
@fmassa (Member) commented Jul 9, 2019

@Chillee

One issue with putting it in the functional interface is that the semantics around auto-calculating "static padding" become fairly muddled. Currently, we can auto-calculate static padding and have it for free if certain rules are followed (stride=1, (filter-1)*dilation is divisible by 2).

I don't quite get this. The overhead of calculating whether the padding is static or not is negligible compared to the convolution time, so doing this in the functional codepath would not be a problem at all.

It's not clear how we would do this in the functional interface.

Here is an example

def _conv2d(...):
    # same as conv2d now

def conv2d(...):
    is_static, padding = _compute_if_padding_is_static(...)
    if is_static:
        input = pad(input, padding, etc)
        return _conv2d(input, ...)
    return _conv2d(input, ...)
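
(For what it's worth, a hypothetical sketch of what a _compute_if_padding_is_static helper could look like for a single spatial dim, following the stride=1 rule mentioned above; this is not code from the PR:)

def _compute_if_padding_is_static(kernel_size, stride, dilation):
    # With stride == 1 the total "same" padding is (kernel_size - 1) * dilation,
    # which does not depend on the input size, so it is "static".
    if stride != 1:
        return False, None
    total = (kernel_size - 1) * dilation
    # It can be expressed as a single symmetric padding value only if it
    # splits evenly between the two sides.
    if total % 2 != 0:
        return False, None
    return True, total // 2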

I think the right place to add this in the functional interface would be in the F.pad function or something along those lines.

I actually think we should instead add support for different paddings on left / right / top / bottom. This would be a generalization of what we currently have.
But this is a bit more involved to do.
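
(In the meantime, different left/right and top/bottom padding can already be emulated with an explicit F.pad before the conv; a minimal sketch with made-up shapes:)

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 10, 10)
w = torch.randn(8, 3, 4, 4)

# F.pad takes (left, right, top, bottom) for a 4D input, so the two sides
# of each spatial dim can be padded differently before the conv.
x_padded = F.pad(x, (1, 2, 1, 2))
y = F.conv2d(x_padded, w, padding=0)  # output is 1 x 8 x 10 x 10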

@Chillee (Collaborator, Author) commented Jul 9, 2019

@fmassa I guess it wasn't clear to me what "static padding" actually meant. If that's still considered static padding, then I'm fine with doing it in the functional interface. Would you be able to provide a pointer to what I should be changing?

Also, as a side note, the documentation for F.conv2d isn't quite correct: https://pytorch.org/docs/stable/nn.html#id22

It says it supports padding_mode but it doesn't.

@pytorchbot added the module: operators and module: internals (Related to internal abstractions in c10 and ATen) labels, Jul 13, 2019
@spandantiwari commented

yes, we want this. YQ's concerns were around exported models not being static, and the ONNX export pass needs to error out if the padding info cannot be statically determined (@Chillee you might have to add a test for the ONNX export).

At this point, the user need and benefits outweigh not wanting it, and our JIT / production export now can comfortably handle dynamic models (it wasn't true when we last thought about this)

I agree with @soumith's comment. The ONNX Conv op supports semantics similar to the "same" padding here, via the auto_pad option SAME_UPPER. This option was previously on a path to deprecation, but due to clear use cases it has since been retained in the design. Given that, I feel this can be handled for ONNX export.
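
(As a rough sketch of the kind of node the exporter could emit, assuming the standard onnx.helper API; the attribute values are just illustrative:)

from onnx import helper

# Conv node that delegates the padding computation to the runtime via
# auto_pad, instead of carrying explicit pads.
conv_node = helper.make_node(
    "Conv",
    inputs=["X", "W"],
    outputs=["Y"],
    kernel_shape=[3, 3],
    strides=[2, 2],
    auto_pad="SAME_UPPER",  # output_size = ceil(input_size / stride)
)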

@eellison (Contributor) commented

@Chillee do you want someone else to take on finishing this? It would be nice to get this in for 1.3.

@Chillee (Collaborator, Author) commented Sep 16, 2019

@eellison I've been busy (and will be busy) until the 23rd, but I should be able to finish it up in the week after that (which should be at least a week before 1.3?).

@eellison (Contributor) commented

I think the cut is the 24th actually... although potentially we could cherry-pick this in

@Chillee (Collaborator, Author) commented Sep 19, 2019

Ah alright I'll take a crack tonight at fixing it then.

@eellison (Contributor) commented

@ZolotukhinM @jamesr66a @raghuramank100 this PR breaks convolution tests. Do you think it's doable to land this PR before the release and have quant parity with it?

@Chillee (Collaborator, Author) commented Sep 23, 2019

To clarify a bit more, the tests that break are the ones that look like the following (the last line specifically):

        FileCheck().check('ClassType<Observer> = prim::GetAttr[name="observer_for_') \
                   .check_next('prim::CallMethod[name="forward"](%observer_for_') \
                   .check('ClassType<Observer> = prim::GetAttr[name="observer_for_') \
                   .check_next('prim::CallMethod[name="forward"](%observer_for_') \
                   .check('Tensor = aten::conv2d') \

The tests pass as long as I comment that out - it's not clear to me if that's the right thing to do.

@eellison (Contributor) commented

@Chillee quantization has a mirrored conv implementation, so to maintain parity they have to do more than fix the tests. Since the branch cut is today/tomorrow, the quant people (cc @jamesr66a) think it would be better to wait until after the cut to land this rather than rush in a quant fix.

I forgot that quantization needed to maintain parity when I commented above about getting it into the release, that's my bad.

@Yangqing (Contributor) commented Oct 9, 2019

Not that it matters, but I have a Python notebook detailing the design choices:

https://gist.github.com/Yangqing/47772de7eb3d5dbbff50ffb0d7a98964

As an architect, I still believe that SAME padding is not something we should introduce back into common usage patterns. That being said, I was the person who included SAME padding in ONNX for compatibility reasons, so I can understand why one might want it.

@ZolotukhinM commented

I somehow missed the notification 16 days ago, sorry.

To clarify a bit more, the tests that break are the ones that look like the following (the last line specifically):

        FileCheck().check('ClassType<Observer> = prim::GetAttr[name="observer_for_') \
                   .check_next('prim::CallMethod[name="forward"](%observer_for_') \
                   .check('ClassType<Observer> = prim::GetAttr[name="observer_for_') \
                   .check_next('prim::CallMethod[name="forward"](%observer_for_') \
                   .check('Tensor = aten::conv2d') \

The tests pass as long as I comment that out - it's not clear to me if that's the right thing to do.

Could you please paste the graph we actually have there? To print it, you could add

print(str(m._c._get_module("conv")._get_method('conv2d_forward').graph))

before the FileCheck invocation.

I'm pretty sure this test can be adjusted to the changes you're making - it basically just checks that after the transformation we have instrumentation calls to "observers" before the conv call that we're instrumenting. It essentially looks for specific strings, and my guess is that the conv now looks slightly different.

@eellison (Contributor) commented

@Chillee, @ZolotukhinM, @jamesr66a what can we do to land this?

@Chillee (Collaborator, Author) commented Nov 18, 2019

Sorry I've been busy recently - will get back to this within a week or two.

@Chillee (Collaborator, Author) commented Feb 10, 2022

Somebody else pushed it across the finish line when I couldn't :) #45667
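
(For anyone landing here later: the API that eventually shipped accepts a string padding argument, roughly like this; note that padding='same' is only supported for stride-1 convolutions:)

import torch
import torch.nn as nn

# 'same' keeps the spatial size; it works with dilation, but not with stride > 1.
conv = nn.Conv2d(3, 8, kernel_size=3, padding='same', dilation=2)
x = torch.randn(1, 3, 32, 32)
print(conv(x).shape)  # torch.Size([1, 8, 32, 32])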

@Chillee closed this, Feb 10, 2022
@github-actions bot deleted the paddingsame branch, February 6, 2024 02:09

Labels

cla signed
module: docs (Related to our documentation, both in docs/ and docblocks)
module: internals (Related to internal abstractions in c10 and ATen)
module: nn (Related to torch.nn)
module: pybind (Related to our Python bindings / interactions with other Python libraries)
oncall: jit (Add this issue/PR to JIT oncall triage queue)
open source
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
