
Conversation

@zasdfgbnm (Collaborator) commented Oct 31, 2019

This fixes #28789

Only the first two elements of `smem` are used in this function, but at the beginning it resets all `C10_WARP_SIZE` elements to 0. When `scalar_t` is 64-bit, that write runs past the total shared memory size, which is only `sizeof(int) * C10_WARP_SIZE` bytes, although the out-of-bounds write does not cause any failure in CI.
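
To make the overflow concrete, here is a minimal CUDA sketch of the pattern (the kernel name `example_kernel`, the buffer name `raw_smem`, and the launch configuration are hypothetical illustrations, not the actual PyTorch kernel): the launcher provides `sizeof(int) * C10_WARP_SIZE` bytes of shared memory, so clearing `C10_WARP_SIZE` elements of a 64-bit `scalar_t` writes twice that amount, while resetting only the two elements that are actually read stays within bounds.

```cuda
#include <cuda_runtime.h>

// Stand-in for the value from c10/macros/Macros.h (32 on CUDA devices).
#ifndef C10_WARP_SIZE
#define C10_WARP_SIZE 32
#endif

// Hypothetical kernel sketching the out-of-bounds pattern described in the
// PR; the names here are illustrative, not the actual PyTorch source.
template <typename scalar_t>
__global__ void example_kernel() {
  // The launcher allocates sizeof(int) * C10_WARP_SIZE bytes of dynamic
  // shared memory, but the buffer is reinterpreted as scalar_t.
  extern __shared__ char raw_smem[];
  scalar_t* smem = reinterpret_cast<scalar_t*>(raw_smem);

  if (threadIdx.x == 0) {
    // Buggy: touches C10_WARP_SIZE * sizeof(scalar_t) bytes, which is 2x
    // the allocation when scalar_t is 64-bit.
    // for (int i = 0; i < C10_WARP_SIZE; ++i) smem[i] = 0;

    // Fixed: only the first two elements are ever read later, so only
    // those need to be reset.
    smem[0] = 0;
    smem[1] = 0;
  }
  __syncthreads();
  // ... the rest of the kernel reads and writes only smem[0] and smem[1] ...
}

int main() {
  // Launch with the shared memory size used by the original code path.
  example_kernel<double><<<1, C10_WARP_SIZE, sizeof(int) * C10_WARP_SIZE>>>();
  cudaDeviceSynchronize();
  return 0;
}
```

An out-of-bounds shared-memory write like the commented-out loop is exactly what `cuda-memcheck` flags even when the test itself still passes, which matches the linked issue.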

@zasdfgbnm marked this pull request as ready for review October 31, 2019 20:42
@zasdfgbnm requested review from mruberry and ngimel October 31, 2019 20:42
facebook-github-bot pushed a commit that referenced this pull request Nov 1, 2019
Summary:
This is independent of #28989, but once #28989 lands, this fixes #28792.
Pull Request resolved: #28995

Differential Revision: D18265797

Pulled By: soumith

fbshipit-source-id: 6dd7cffd05aa65e4b366f1c40b8bda0a633e3154
@facebook-github-bot (Contributor) left a comment


@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented

@ngimel merged this pull request in d8d7af0.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Nov 1, 2019
Summary:
This fixes pytorch/pytorch#28789

Only the first two elements of `smem` are used in this function, but at the beginning it resets all `C10_WARP_SIZE` elements to 0. When `scalar_t` is 64-bit, that write runs past the total shared memory size, which is only `sizeof(int) * C10_WARP_SIZE` bytes, although the out-of-bounds write does not cause any failure in CI.
Pull Request resolved: pytorch/pytorch#28989

Differential Revision: D18271598

Pulled By: ngimel

fbshipit-source-id: 38cc863722509892646f719efb05e2730a7d9ae1

Development

Successfully merging this pull request may close these issues.

[CUDA-MEMCHECK] TestTorchDeviceTypeCUDA.test_kthvalue_cuda_float64 fails
