
Conversation

@ailzhang (Contributor) commented May 23, 2019

#20691 discussion.

The main problem here is that if x is marked as Optional[BroadcastingList2[int]] and it is None, the JIT has no way to construct an int[2] out of it, so we have to annotate it properly on the Python side.
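For illustration, here is a minimal sketch of the pattern this relies on; the function and its type annotations are simplified stand-ins, not the actual torch/nn/functional.py code:

```python
from typing import List, Optional

import torch
import torch.nn.functional as F

def pool_like(input: torch.Tensor,
              kernel_size: List[int],
              stride: Optional[List[int]] = None) -> torch.Tensor:
    # If stride were typed Optional[BroadcastingList2[int]], the JIT could not
    # turn None into an int[2]. Instead, the None case is handled explicitly
    # and the fallback is annotated as a concrete List[int].
    if stride is None:
        stride = torch.jit.annotate(List[int], kernel_size)
    return F.max_pool2d(input, kernel_size, stride)
```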

@pytorchbot added the oncall: jit, module: cuda, module: nn, and module: operators labels on May 23, 2019
@ailzhang (Contributor, Author) commented:

cc: @skrah @ezyang
hmmm I guess the commit history is screwed up here but the diff is fine. Let me know if that matters. :P

@ailzhang ailzhang requested a review from ezyang May 23, 2019 16:59
out = avg_pool2d(input.pow(norm_type), kernel_size, padding=0, ceil_mode=ceil_mode)
kw, kh = _pair(kernel_size)
if stride is None:
    stride = torch.jit.annotate(List[int], kernel_size)
Review comment (Contributor):
Modest question: Why did we annotate this as a List[int] instead of a BroadcastingList2[int]?

@ailzhang (Contributor, Author) replied:
I tried :P but it turns out BroadcastingList2[int] is not a valid option in torch.jit.annotate. I talked to @driazati about it; it's mainly because BroadcastingList<x> is a "hacky" workaround and we only support x up to 5.
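To make the limitation concrete, a small hypothetical snippet (not taken from the PR):

```python
from typing import List

import torch

kernel_size = [3, 3]

# Fine: List[int] is a first-class type that torch.jit.annotate understands.
stride = torch.jit.annotate(List[int], kernel_size)

# Not supported: BroadcastingList2[int] (and its siblings up to BroadcastingList5)
# is only a schema-resolution trick, so passing it to torch.jit.annotate fails.
# stride = torch.jit.annotate(BroadcastingList2[int], kernel_size)
```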

@ezyang (Contributor) commented May 23, 2019

Looks reasonable I guess? I don't know enough about what annotate does to be able to properly review this. Let me know if you want me to figure out what exactly it's doing.

@ailzhang ailzhang requested a review from driazati May 23, 2019 19:44
@facebook-github-bot (Contributor) left a comment
@ailzhang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@skrah (Contributor) commented May 24, 2019

I think the integration tests are still failing:

May 24 07:47:30 C++ exception with description "max_pool2d_with_indices: internal error: all IntArrayRef sizes must be 2 (max_pool2d_with_indices_out_cuda_template at /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/DilatedMaxPool2d.cu:155)

These are the ones that pass padding.size()==1 and dilation.size()==1. I'm not sure if the JIT is involved here at all.

@skrah (Contributor) commented May 24, 2019

For reference, this is the sort of call in the tests that leads to padding.size()==1 etc.:

x = torch::max_pool2d(conv1->forward(x), {2, 2}).relu();

"""
if stride is None:
stride = torch.jit.annotate(List[int], [])
stride = torch.jit.annotate(List[int], kernel_size)
Review comment (Contributor):
The BroadcastingListx[T] gets removed and replaced with List[T] when the types are resolved during schema-making, so this can just be stride = kernel_size

@skrah (Contributor) commented Jun 21, 2019

Looking at #22032 I'm starting to think that perhaps the C++ functions in ATen/native should not expect IntArrayRefs of a guaranteed size.

A brief summary of the issue is that

https://github.com/pytorch/examples/blob/1de2ff9338bacaaffa123d03ce53d7522d5dcc2e/cpp/mnist/mnist.cpp#L40

also expects to be able to pass a single integer for kernel_size.

This means that the error checking (and single integer argument expansion) in DilatedMaxPool2d that is currently classified as a workaround could be part of the published C++ API.

In that case the JIT would also not need to special case anything (assuming that is easier for the JIT, which I'm not sure about).
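For concreteness, a Python sketch of the expansion rule being discussed; the real helper would live in ATen C++ and operate on IntArrayRef, and the function name here is made up:

```python
from typing import List

def expand_param(param: List[int], n: int, name: str) -> List[int]:
    # Accept either a single value or exactly n values, mirroring the
    # "single integer argument expansion" described above.
    if len(param) == 1:
        return param * n
    if len(param) == n:
        return list(param)
    raise ValueError(f"{name} must contain 1 or {n} elements, got {len(param)}")

# A call like torch::max_pool2d(x, {2, 2}) leaves padding and dilation at their
# single-element defaults, which would then be expanded as:
assert expand_param([0], 2, "padding") == [0, 0]
assert expand_param([1], 2, "dilation") == [1, 1]
```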

@ezyang (Contributor) commented Jun 24, 2019 via email

@ailzhang (Contributor, Author) commented Jun 24, 2019

Yeah, I agree with @ezyang. Basically this PR ensures we do the expansion for users in the JIT. To provide the same benefit to our C++ frontend users, I think it's preferable to do it in our codegen so that we don't have to handle this in every kernel. I remember I made some progress on that part; I will try to get back to it this week. cc @yf225 to see if he has a better idea here.

@ailzhang (Contributor, Author) commented Jun 25, 2019

Actually, I realized codegen is not enough if we want to provide the same convenience to C++ frontend users.
Take max_pool2d(..., IntArrayRef stride={}, IntArrayRef padding={0}, IntArrayRef dilation={1}, ...) as an example: we can use codegen to expand padding to {0, 0} and dilation to {1, 1}. (Note to self: this could be done in aten/src/ATen/function_wrapper.py.)

  1. However, this cannot handle the stride case, since stride has no default to expand, and the right way to handle an empty value like stride varies case by case (initialize as all zeros, all ones, ...). So codegen still cannot guarantee that we always pass an initialized stride array to the C++ implementation (see the sketch after this list).
  2. Even if we manage to get all the function declarations initialized correctly from a simplified representation in native_functions.yaml, codegen won't help on the user end: when a C++ frontend user types padding=0, dilation=2, the only place we can handle the expansion is in the C++ code.
     Thus I think it's reasonable to say that our ATen implementations should take care of these expansions.
  3. If we think about adding some logic before invoking the kernel, it's definitely not trivial, since how to initialize an empty stride varies case by case: for max_pool it should be the same as kernel_size if empty, but other functions might need all zeros. In that case it might be simpler to keep this implementation detail in the C++ implementation.
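A small Python sketch of the per-argument defaults point above; the function is illustrative only, since the real logic would sit in the ATen C++ kernels:

```python
from typing import List, Optional

def resolve_pool2d_args(kernel_size: List[int],
                        stride: Optional[List[int]],
                        padding: List[int],
                        dilation: List[int]):
    def expand(xs: List[int]) -> List[int]:
        return xs * 2 if len(xs) == 1 else xs

    # padding={0} and dilation={1} have fixed defaults, so codegen could expand
    # them mechanically to {0, 0} and {1, 1}.
    padding, dilation = expand(padding), expand(dilation)
    # stride has no such default: for max_pool an empty stride means
    # "use kernel_size", while other ops may want zeros or ones, so only the
    # kernel itself knows the right fallback.
    if stride is None or len(stride) == 0:
        stride = kernel_size
    else:
        stride = expand(stride)
    return expand(kernel_size), stride, padding, dilation
```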

On the other hand, I think it is reasonable to expect all inputs from the JIT frontend to be properly initialized, since the JIT is supposed to be statically typed (it doesn't pass an int where a List[int] is expected; that is already handled by BroadcastingListx). So I think this PR, which fixes the issue on the JIT side, still makes sense.
@ezyang Let me know if this makes sense or not :D I will rebase and only fix the JIT part in this PR if it sounds good.
[Update:] On second thought, if we already require our ATen implementations to take care of these expansions, is it reasonable for the JIT to also take advantage of that? If we allow it, I'm also happy to discard this PR.

@ezyang (Contributor) commented Jun 26, 2019

As a short-term fix, I don't mind just going through and fixing the ATen implementations to do expansion as necessary. (It would be good to write some utility functions first so we aren't doing hella copy-pasting.)

If we were going to solve this by codegen, I would have imagined some sort of stub function that interposes between Tensor::foo and at::native::foo that does expansions and then passes things along. But I do agree this is a bit involved, and maybe better to just do the simple thing.
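A rough Python sketch of that interposing idea; the wrapper name is invented, and for illustration it dispatches to torch.max_pool2d rather than to a native kernel:

```python
from typing import List, Optional, Sequence

import torch

def _expand(xs: Sequence[int], n: int) -> List[int]:
    # Broadcast a single value to length n; pass through if already length n.
    return list(xs) * n if len(xs) == 1 else list(xs)

def max_pool2d_stub(input: torch.Tensor,
                    kernel_size: Sequence[int],
                    stride: Optional[Sequence[int]] = None,
                    padding: Sequence[int] = (0,),
                    dilation: Sequence[int] = (1,),
                    ceil_mode: bool = False) -> torch.Tensor:
    # The generated stub would expand every size-1 argument up front, so the
    # implementation behind it can assume all sizes are exactly 2.
    kernel_size = _expand(kernel_size, 2)
    stride = kernel_size if not stride else _expand(stride, 2)
    return torch.max_pool2d(input, kernel_size, stride,
                            _expand(padding, 2), _expand(dilation, 2), ceil_mode)
```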

And yes, if we do force C++ to handle expansion, we should remove the logic from JIT and PyTorch. The thing that I am concerned about is that we had better do a good job auditing and testing all the places we need to do this, or there are going to be a lot of segfaults in our future.

@ailzhang (Contributor, Author) commented:

Thanks @ezyang! Yeah, I totally agree that some helper functions would be the best short-term solution. cc @skrah: maybe you can set these up in one PR so that they're used consistently in the TH porting?
I'm going to close this PR for now: since ATen is guaranteed to handle the expansion, the JIT doesn't have to repeat the work.

@skrah (Contributor) commented Jun 26, 2019

@ailzhang I opened #22277 as a catch-all issue; I can start a PR around mid-July if no one beats me to it.

