Conversation

@mrshenli
Contributor

Fixes #20651

Communication collectives in torch.distributed call CUDACachingAllocator::recordStream() on input and output tensors to prevent their memory blocks from being freed too early. CUDACachingAllocator uses the tensor's data pointer to track memory blocks and does not accept null pointers. However, an empty tensor's storage().data() might be null. In that case, since there is no memory block associated with the empty tensor, it is fine to make recordStream() a no-op.
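
For illustration only (not part of this diff), a minimal Python sketch of the situation on a CUDA-enabled build: an empty CUDA tensor owns no memory block, so there is nothing for the caching allocator to track.

```python
import torch

# Illustrative sketch, assuming a CUDA-enabled build. A tensor with zero
# elements gets no allocation from the CUDA caching allocator, so its data
# pointer is expected to be null (reported as 0 from Python). recordStream()
# therefore has no memory block to look up for such a tensor.
t = torch.empty(0, device="cuda")
print(t.numel())     # 0
print(t.data_ptr())  # expected to be 0: no backing memory block
```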

Tests only cover broadcasting empty tensors on the NCCL backend, because GLOO does not support empty inputs (pytorch/gloo#179). That gap can be addressed either in ProcessGroupGloo or in GLOO itself; more tests will be added once it is filled.
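
For reference, a rough standalone sketch of the scenario being tested (hypothetical; the actual tests live in the distributed test suite), assuming a NCCL build with at least two GPUs:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

# Hypothetical standalone sketch: broadcasting an empty CUDA tensor should be
# a no-op instead of tripping over a null data pointer in recordStream().
# Assumes a NCCL-enabled build with at least two GPUs.

def run(rank, world_size):
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    t = torch.empty(0, device="cuda")  # empty tensor, null data pointer
    dist.broadcast(t, src=0)           # should complete without error
    assert t.numel() == 0
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2)
```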

@mrshenli mrshenli requested a review from colesbury May 17, 2019 20:26
@mrshenli mrshenli requested review from apaszke and pietern as code owners May 17, 2019 20:26
@pytorchbot pytorchbot added the module: cuda, oncall: distributed, and module: internals labels on May 17, 2019
@mrshenli mrshenli added the triaged label on May 17, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mrshenli merged this pull request in 8acaa28.

Development

Successfully merging this pull request may close these issues.

torch.distributed.reduce empty tensor bug
