Re-land "Fix advanced indexing on "huge" Tensors" #21019

colesbury · 2019-05-28T15:53:17Z

This is #20919 without the changes to aten/src/THC/THCIntegerDivider.cuh
that broke the ROCm build.

Original summary:

This fixes advanced indexing in cases where there are more than 2^31-1
bytes in the output. The gpu_index_kernel was missing the
can_use_32bit_indexing/with_32bit_indexing check.
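
For readers unfamiliar with that pattern, here is a rough, host-side Python sketch of the idea: check whether every offset fits in a signed 32-bit integer, and if not, split the work into pieces that do. The names `fits_32bit_indexing` and `split_into_32bit_pieces` are hypothetical stand-ins, not the TensorIterator API, and the sketch assumes non-negative byte strides and a single operand.

```python
INT32_MAX = 2**31 - 1


def fits_32bit_indexing(shape, strides_bytes):
    """Return True if the element count and the largest byte offset fit in int32.

    Hypothetical helper; assumes non-negative strides given in bytes.
    """
    numel = 1
    max_offset = 0
    for size, stride in zip(shape, strides_bytes):
        if size == 0:
            return True  # empty region: nothing to index
        numel *= size
        max_offset += (size - 1) * stride
    return numel <= INT32_MAX and max_offset <= INT32_MAX


def split_into_32bit_pieces(shape, strides_bytes, offset=0):
    """Yield (byte_offset, shape) pieces that each pass fits_32bit_indexing.

    Halves the dimension spanning the most bytes (ties broken by size)
    until every piece is small enough for 32-bit offsets.
    """
    shape = tuple(shape)
    if fits_32bit_indexing(shape, strides_bytes):
        yield offset, shape
        return
    dim = max(
        range(len(shape)),
        key=lambda d: ((shape[d] - 1) * strides_bytes[d], shape[d]),
    )
    first = shape[dim] // 2
    left = shape[:dim] + (first,) + shape[dim + 1:]
    right = shape[:dim] + (shape[dim] - first,) + shape[dim + 1:]
    yield from split_into_32bit_pieces(left, strides_bytes, offset)
    yield from split_into_32bit_pieces(
        right, strides_bytes, offset + first * strides_bytes[dim]
    )


# A 3 GiB contiguous byte buffer is too large for 32-bit offsets,
# so it is processed as two pieces instead of one.
pieces = list(split_into_32bit_pieces((3 * 2**30,), (1,)))
```

A kernel launcher following this pattern would run once per piece, with the base pointer advanced by the piece's byte offset.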

This also adds a number of TORCH_INTERNAL_ASSERT checks in Loops.cuh,
OffsetCalculator, and IntDivider to verify that sizes fit in a signed
32-bit integer.
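
To illustrate why these checks matter, here is a small Python sketch (not the code in this PR) contrasting silent wraparound with a checked conversion. `wrap_to_int32` and `checked_int32` are hypothetical names; the real checks are C++ assertions inside the CUDA helpers.

```python
import ctypes

INT32_MAX = 2**31 - 1


def wrap_to_int32(value):
    """Mimic an unchecked narrowing to a signed 32-bit integer (wraps silently)."""
    return ctypes.c_int32(value).value


def checked_int32(value, what="size"):
    """Return value unchanged, asserting first that it fits in a signed 32-bit integer."""
    assert 0 <= value <= INT32_MAX, (
        f"{what} {value} does not fit in a signed 32-bit integer"
    )
    return value


# Without a check, a size just past the limit becomes a negative number.
wrapped = wrap_to_int32(2**31)

# With a check, the same size fails loudly instead of corrupting offsets.
try:
    checked_int32(2**31)
    failed = False
except AssertionError:
    failed = True
```

Failing at the point of conversion makes an oversized tensor show up as an internal error rather than as reads or writes at wrong addresses.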

More comprehensive tests that require a 32 GB GPU are here:
https://gist.github.com/colesbury/e29387f5851521256dff562be07b981e

facebook-github-bot

@colesbury has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

bddppq

Thanks!

facebook-github-bot · 2019-05-28T20:35:42Z

@colesbury merged this pull request in b85c529.

Summary:
This is #20919 without the changes to aten/src/THC/THCIntegerDivider.cuh that broke the ROCm build.

cc bddppq

Original summary:

This fixes advanced indexing in cases where there are more than 2^31-1 bytes in the output. The `gpu_index_kernel` was missing the `can_use_32bit_indexing`/`with_32bit_indexing` check.

This also adds a number of TORCH_INTERNAL_ASSERT checks in Loops.cuh, OffsetCalculator, and IntDivider to verify that sizes fit in a signed 32-bit integer.

More comprehensive tests that require a 32 GB GPU are here:
https://gist.github.com/colesbury/e29387f5851521256dff562be07b981e

Pull Request resolved: pytorch/pytorch#21019

Differential Revision: D15518477

Pulled By: colesbury

fbshipit-source-id: 4db5626fda76eb58250793e8aa7d4f2832db3a34

pytorchbot added the module: cuda and module: operators labels May 28, 2019

facebook-github-bot reviewed May 28, 2019

bddppq approved these changes May 28, 2019

facebook-github-bot closed this in b85c529 May 28, 2019

facebook-github-bot added the merged label May 28, 2019

mruberry added the Merged label Oct 28, 2020
