
Conversation

@ailzhang (Contributor) commented May 23, 2019

#20691 discussion.

The main problem here is that if x is marked as Optional[BroadcastingList2[int]] and it is None, the JIT has no way to construct an int[2] out of it, so we have to annotate it properly on the Python side.
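For illustration, here is a minimal sketch of the pattern this relies on; the function and its type annotations are simplified stand-ins, not the actual torch/nn/functional.py code:

```python
from typing import List, Optional

import torch
import torch.nn.functional as F

def pool_like(input: torch.Tensor,
              kernel_size: List[int],
              stride: Optional[List[int]] = None) -> torch.Tensor:
    # If stride were typed Optional[BroadcastingList2[int]], the JIT could not
    # turn None into an int[2]. Instead, the None case is handled explicitly
    # and the fallback is annotated as a concrete List[int].
    if stride is None:
        stride = torch.jit.annotate(List[int], kernel_size)
    return F.max_pool2d(input, kernel_size, stride)
```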

@pytorchbot added the oncall: jit, module: cuda, module: nn, and module: operators labels on May 23, 2019
@ailzhang (Contributor, Author) commented:

cc: @skrah @ezyang
hmmm I guess the commit history is screwed up here but the diff is fine. Let me know if that matters. :P

@ailzhang ailzhang requested a review from ezyang May 23, 2019 16:59
out = avg_pool2d(input.pow(norm_type), kernel_size, padding=0, ceil_mode=ceil_mode)
kw, kh = _pair(kernel_size)
if stride is None:
    stride = torch.jit.annotate(List[int], kernel_size)
Review comment (Contributor):
Modest question: Why did we annotate this as a List[int] instead of a BroadcastingList2[int]?

@ailzhang (Contributor, Author) replied:
I tried :P but it turns out BroadcastingList2[int] is not a valid option in torch.jit.annotate. I talked to @driazati about it; it's mainly because BroadcastingList<x> is a "hacky" workaround and we only support x up to 5.
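To make the limitation concrete, a small hypothetical snippet (not taken from the PR):

```python
from typing import List

import torch

kernel_size = [3, 3]

# Fine: List[int] is a first-class type that torch.jit.annotate understands.
stride = torch.jit.annotate(List[int], kernel_size)

# Not supported: BroadcastingList2[int] (and its siblings up to BroadcastingList5)
# is only a schema-resolution trick, so passing it to torch.jit.annotate fails.
# stride = torch.jit.annotate(BroadcastingList2[int], kernel_size)
```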

@ezyang (Contributor) commented May 23, 2019

Looks reasonable I guess? I don't know enough about what annotate does to be able to properly review this. Let me know if you want me to figure out what exactly it's doing.

@ailzhang ailzhang requested a review from driazati May 23, 2019 19:44
@facebook-github-bot (Contributor) left a comment
@ailzhang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@skrah (Contributor) commented May 24, 2019

I think the integration tests are still failing:

May 24 07:47:30 C++ exception with description "max_pool2d_with_indices: internal error: all IntArrayRef sizes must be 2 (max_pool2d_with_indices_out_cuda_template at /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/DilatedMaxPool2d.cu:155)

These are the ones that pass padding.size()==1 and dilation.size()==1. I'm not sure if the JIT is involved here at all.

@skrah (Contributor) commented May 24, 2019

For reference, this is the sort of call in the tests that leads to padding.size()==1 etc.:

x = torch::max_pool2d(conv1->forward(x), {2, 2}).relu();

"""
if stride is None:
stride = torch.jit.annotate(List[int], [])
stride = torch.jit.annotate(List[int], kernel_size)
Review comment (Contributor):
The BroadcastingListx[T] gets removed and replaced with List[T] when the types are resolved during schema-making, so this can just be stride = kernel_size

@skrah (Contributor) commented Jun 21, 2019

Looking at #22032 I'm starting to think that perhaps the C++ functions in ATen/native should not expect IntArrayRefs of a guaranteed size.

A brief summary of the issue is that

https://github.com/pytorch/examples/blob/1de2ff9338bacaaffa123d03ce53d7522d5dcc2e/cpp/mnist/mnist.cpp#L40

also expects to be able to pass a single integer for kernel_size.

This means that the error checking (and single integer argument expansion) in DilatedMaxPool2d that is currently classified as a workaround could be part of the published C++ API.

In that case the JIT would also not need to special case anything (assuming that is easier for the JIT, which I'm not sure about).
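For concreteness, a Python sketch of the expansion rule being discussed; the real helper would live in ATen C++ and operate on IntArrayRef, and the function name here is made up:

```python
from typing import List

def expand_param(param: List[int], n: int, name: str) -> List[int]:
    # Accept either a single value or exactly n values, mirroring the
    # "single integer argument expansion" described above.
    if len(param) == 1:
        return param * n
    if len(param) == n:
        return list(param)
    raise ValueError(f"{name} must contain 1 or {n} elements, got {len(param)}")

# A call like torch::max_pool2d(x, {2, 2}) leaves padding and dilation at their
# single-element defaults, which would then be expanded as:
assert expand_param([0], 2, "padding") == [0, 0]
assert expand_param([1], 2, "dilation") == [1, 1]
```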

@ezyang (Contributor) commented Jun 24, 2019 via email

@ailzhang (Contributor, Author) commented Jun 24, 2019

Yeah, I agree with @ezyang. Basically this PR ensures we do the expansion for users in the JIT. To provide the same benefit to our C++ frontend users, I think it's preferable to do it in our codegen so that we don't have to handle this in every kernel. I remember I made some progress on that part; I will try to get back to it this week. cc @yf225 to see if he has a better idea here.

@ailzhang (Contributor, Author) commented Jun 25, 2019

Actually, I realized codegen is not enough if we want to provide the same convenience to C++ frontend users.
Take max_pool2d(..., IntArrayRef stride={}, IntArrayRef padding={0}, IntArrayRef dilation={1}, ...) as an example: we can use codegen to expand padding to {0, 0} and dilation to {1, 1}. (Note to self: this could be done in aten/src/ATen/function_wrapper.py.)

  1. However, this cannot handle the stride case, since stride has no default to expand, and the right way to handle an empty value like stride varies case by case (initialize as all zeros, all ones, ...). So codegen still cannot guarantee that we always pass an initialized stride array to the C++ implementation (see the sketch after this list).
  2. Even if we manage to get all the function declarations initialized correctly from a simplified representation in native_functions.yaml, codegen won't help on the user end: when a C++ frontend user types padding=0, dilation=2, the only place we can handle the expansion is in the C++ code.
     Thus I think it's reasonable to say that our ATen implementations should take care of these expansions.
  3. If we think about adding some logic before invoking the kernel, it's definitely not trivial, since how to initialize an empty stride varies case by case: for max_pool it should be the same as kernel_size if empty, but other functions might need all zeros. In that case it might be simpler to keep this implementation detail in the C++ implementation.
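A small Python sketch of the per-argument defaults point above; the function is illustrative only, since the real logic would sit in the ATen C++ kernels:

```python
from typing import List, Optional

def resolve_pool2d_args(kernel_size: List[int],
                        stride: Optional[List[int]],
                        padding: List[int],
                        dilation: List[int]):
    def expand(xs: List[int]) -> List[int]:
        return xs * 2 if len(xs) == 1 else xs

    # padding={0} and dilation={1} have fixed defaults, so codegen could expand
    # them mechanically to {0, 0} and {1, 1}.
    padding, dilation = expand(padding), expand(dilation)
    # stride has no such default: for max_pool an empty stride means
    # "use kernel_size", while other ops may want zeros or ones, so only the
    # kernel itself knows the right fallback.
    if stride is None or len(stride) == 0:
        stride = kernel_size
    else:
        stride = expand(stride)
    return expand(kernel_size), stride, padding, dilation
```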

On the other hand, I think it is reasonable to expect all inputs from the JIT frontend to be properly initialized, since the JIT is supposed to be statically typed (it doesn't pass an int where a List[int] is expected; that is already handled by BroadcastingListx). So I think this PR, which fixes the issue on the JIT side, still makes sense.
@ezyang Let me know if this makes sense or not :D I will rebase and only fix the JIT part in this PR if it sounds good.
[Update:] On second thought, if we already require our ATen implementations to take care of these expansions, is it reasonable for the JIT to also take advantage of that? If we allow it, I'm also happy to discard this PR.

@ezyang (Contributor) commented Jun 26, 2019

As a short-term fix, I don't mind just going through and fixing the ATen implementations to do expansion as necessary. (It would be good to write some utility functions first so we aren't doing hella copy-pasting.)

If we were going to solve this by codegen, I would have imagined some sort of stub function that interposes between Tensor::foo and at::native::foo that does expansions and then passes things along. But I do agree this is a bit involved, and maybe better to just do the simple thing.
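A rough Python sketch of that interposing idea; the wrapper name is invented, and for illustration it dispatches to torch.max_pool2d rather than to a native kernel:

```python
from typing import List, Optional, Sequence

import torch

def _expand(xs: Sequence[int], n: int) -> List[int]:
    # Broadcast a single value to length n; pass through if already length n.
    return list(xs) * n if len(xs) == 1 else list(xs)

def max_pool2d_stub(input: torch.Tensor,
                    kernel_size: Sequence[int],
                    stride: Optional[Sequence[int]] = None,
                    padding: Sequence[int] = (0,),
                    dilation: Sequence[int] = (1,),
                    ceil_mode: bool = False) -> torch.Tensor:
    # The generated stub would expand every size-1 argument up front, so the
    # implementation behind it can assume all sizes are exactly 2.
    kernel_size = _expand(kernel_size, 2)
    stride = kernel_size if not stride else _expand(stride, 2)
    return torch.max_pool2d(input, kernel_size, stride,
                            _expand(padding, 2), _expand(dilation, 2), ceil_mode)
```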

And yes, if we do force C++ to handle expansion, we should remove the logic from JIT and PyTorch. The thing that I am concerned about is that we had better do a good job auditing and testing all the places we need to do this, or there are going to be a lot of segfaults in our future.

@ailzhang (Contributor, Author) commented:

Thanks @ezyang! Yeah, I totally agree that some helper functions would be the best short-term solution. cc @skrah: maybe you can set these up in one PR so that they're used consistently in the TH porting?
I'm going to close this PR for now: since ATen is guaranteed to handle the expansion, the JIT doesn't have to repeat the work.

@skrah (Contributor) commented Jun 26, 2019

@ailzhang I opened #22277 as a catch-all issue; I can start a PR around mid-July if no one beats me to it.

