Conversation

ssnl (Collaborator) commented Jul 20, 2018

Thanks! This is great! Do you want to fix the im2col, col2im, vol2col and col2vol ones as well? e.g., https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/Im2Col.c#L30-L31

fmassa (Member) commented Jul 20, 2018

I think we also want to apply a similar patch to THCUNN, and also handle it somewhere where CUDNN is dispatched, as the error message is not great there.

In [1]: import torch

In [2]: conv = torch.nn.Conv2d(1, 1, kernel_size=3, dilation=2, stride=2).cuda()

In [3]: tensor = torch.empty(1, 1, 4, 4).cuda()

In [4]: conv(tensor).shape
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-c6df65b7b5a7> in <module>()
----> 1 conv(tensor).shape

~/github/pytorch/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/github/pytorch/torch/nn/modules/conv.py in forward(self, input)
    299     def forward(self, input):
    300         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 301                         self.padding, self.dilation, self.groups)
    302
    303

RuntimeError: CuDNN error: CUDNN_STATUS_BAD_PARAM

In [5]: torch.backends.cudnn.enabled = False

In [6]: conv(tensor).shape
Out[6]: torch.Size([1, 1, 1, 1])
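The silently wrong `Out[6]` above comes down to how the output size is computed. With `kernel_size=3` and `dilation=2` the effective kernel spans 5 input positions, so a 4×4 input admits no valid output window; with C-style truncating integer division the size formula still yields 1 instead of a non-positive value that a shape check could reject. A minimal sketch in plain Python (the helper name and `floor_div` flag are mine, not PyTorch API; the floor-division behavior matches the round-toward-negative-infinity fix this PR introduces):

```python
def conv_out_size(n, kernel, stride=1, pad=0, dilation=1, floor_div=True):
    """Output length along one spatial dimension of a convolution.

    floor_div=True uses floor division (rounds toward -inf, as the fix
    in this PR does); floor_div=False mimics C's truncation toward 0.
    """
    eff_kernel = dilation * (kernel - 1) + 1   # receptive field: 5 in the repro above
    num = n + 2 * pad - eff_kernel             # 4 - 5 = -1: no valid window exists
    q = num // stride if floor_div else int(num / stride)
    return q + 1

# C-style truncation: int(-1 / 2) == 0, so the size comes out as 1 -- bogus.
print(conv_out_size(4, kernel=3, stride=2, dilation=2, floor_div=False))  # 1
# Floor division: -1 // 2 == -1, so the size is 0, which a shape check can reject.
print(conv_out_size(4, kernel=3, stride=2, dilation=2, floor_div=True))   # 0
```

This is why THNN happily returned `torch.Size([1, 1, 1, 1])` while cuDNN bailed out with `CUDNN_STATUS_BAD_PARAM`: the truncating formula never produced the non-positive size that would have triggered an error.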

resistor (Contributor, Author) commented:

@ssnl col2vol and vol2col don't have shape checks at all currently, but I can make the change to im2col and col2im.

resistor (Contributor, Author) commented:

@pytorchbot test this again

soumith (Contributor) commented Jul 20, 2018

@resistor pytorchbot retest this please

resistor (Contributor, Author) commented:

@pytorchbot retest this please

soumith (Contributor) commented Jul 21, 2018

@pytorchbot retest this please

resistor (Contributor, Author) commented:

@pytorchbot retest this please

ssnl (Collaborator) commented Jul 22, 2018

Awesome. Could you add a test in test_nn for this?

facebook-github-bot (Contributor) left a comment:


@ssnl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ssnl (Collaborator) commented Jul 23, 2018

@pytorchbot retest this please

facebook-github-bot (Contributor) left a comment:


@ssnl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

fmassa (Member) commented Jul 23, 2018

Are you making the same changes for THCUNN as well? The same behavior happens with CUDA tensors.

ezyang (Contributor) commented Jul 23, 2018

maybe-uninitialized Werrors:

04:47:30 In file included from generic/TemporalRowConvolution.c:1:0,
04:47:30                  from /var/lib/jenkins/workspace/aten/src/TH/THGenerateFloatTypes.h:11,
04:47:30                  from /var/lib/jenkins/workspace/aten/src/THNN/init.cpp:180:
04:47:30 /var/lib/jenkins/workspace/aten/src/THNN/generic/TemporalRowConvolution.c: In function 'void THNN_DoubleTemporalRowConvolution_updateGradInput(THNNState*, THTensor*, THTensor*, THTensor*, THTensor*, THTensor*, THTensor*, int, int, int, bool)':
04:47:30 /var/lib/jenkins/workspace/aten/src/THNN/generic/TemporalRowConvolution.c:359:31: error: 'tgradOutput' may be used uninitialized in this function [-Werror=maybe-uninitialized]
04:47:30    THTensor_(free)(tgradOutput);
04:47:30                                ^
04:47:30 /var/lib/jenkins/workspace/aten/src/THNN/generic/TemporalRowConvolution.c:358:26: error: 'tinput' may be used uninitialized in this function [-Werror=maybe-uninitialized]
04:47:30    THTensor_(free)(tinput);

Should be simple enough to work around.

resistor (Contributor, Author) commented:

@ezyang Those are existing errors.

resistor (Contributor, Author) commented:

@fmassa I would prefer to address those in a separate change, as I am not currently set up to test CUDA.

fmassa (Member) commented Jul 24, 2018

@resistor sounds good to me, but I think it is also very important to have consistent behavior between CPU and CUDA.

resistor force-pushed the div-rtn branch 2 times, most recently from 3156543 to c54f112, on July 25, 2018
resistor (Contributor, Author) commented:

@fmassa CUDA changes incorporated.

facebook-github-bot (Contributor) left a comment:


@resistor has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

pjh5 (Contributor) commented Jul 26, 2018

ssnl (Collaborator) left a comment:


Looks great!

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 27, 2018
…utions involving striding and dilation.

Summary: Pull Request resolved: pytorch/pytorch#9640

Differential Revision: D8948081

Pulled By: resistor

fbshipit-source-id: 06f2e3ad1bdb448be6f36577cb9bd27c884df595
jramseyer pushed a commit to jramseyer/pytorch that referenced this pull request Jul 30, 2018
…utions involving striding and dilation.

Summary: Pull Request resolved: pytorch#9640

Differential Revision: D8948081

Pulled By: resistor

fbshipit-source-id: 06f2e3ad1bdb448be6f36577cb9bd27c884df595
goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
…utions involving striding and dilation.

Summary: Pull Request resolved: pytorch#9640

Differential Revision: D8948081

Pulled By: resistor

fbshipit-source-id: 06f2e3ad1bdb448be6f36577cb9bd27c884df595
@ezyang ezyang added the merged label Jun 26, 2019
facebook-github-bot pushed a commit that referenced this pull request Jul 17, 2019
Summary:
Fixes #21935 by using the integer floor division that was introduced for convolution shapes in #9640. Without this fix, the pooling operators can produce a 1-element output in cases they shouldn't.

Disclaimer: I couldn't properly test it locally (it's not picking up the modified version for some reason). I'm marking this WIP until I checked what the CI tools say...
Pull Request resolved: #22304

Differential Revision: D16181955

Pulled By: ezyang

fbshipit-source-id: a2405372753572548b40616d1206848b527c8121
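The pooling fix in #22304 reuses the same rounding idea: when the input is smaller than the window, the numerator of the size formula goes negative, and truncating division turns that into a spurious 1-element output. A sketch in plain Python (function names are mine; `div_rtn` mirrors the round-toward-negative-infinity helper this PR's branch is named after, and the formula follows the standard `floor((n + 2*pad - kernel) / stride) + 1` form):

```python
def div_rtn(a, b):
    """Integer division rounding toward negative infinity, like THNN's
    div_rtn helper (Python's // already rounds this way; C's / truncates
    toward zero instead)."""
    return a // b

def pool_out_size(n, kernel, stride, pad=0, use_floor=True):
    # One spatial dimension of a pooling output.
    num = n + 2 * pad - kernel
    q = div_rtn(num, stride) if use_floor else int(num / stride)  # floor vs. C trunc
    return q + 1

# A 3-wide pooling window cannot fit in a 2-wide input:
print(pool_out_size(2, kernel=3, stride=2, use_floor=False))  # 1 (spurious output)
print(pool_out_size(2, kernel=3, stride=2, use_floor=True))   # 0 (shape check can reject)
```

With floor division the impossible case produces a non-positive size, so the shape checks added for convolutions can reject it for pooling as well.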
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 17, 2019
Summary:
Fixes pytorch/pytorch#21935 by using the integer floor division that was introduced for convolution shapes in pytorch/pytorch#9640. Without this fix, the pooling operators can produce a 1-element output in cases they shouldn't.

Disclaimer: I couldn't properly test it locally (it's not picking up the modified version for some reason). I'm marking this WIP until I checked what the CI tools say...
Pull Request resolved: pytorch/pytorch#22304

Differential Revision: D16181955

Pulled By: ezyang

fbshipit-source-id: a2405372753572548b40616d1206848b527c8121

7 participants