DeepCompile: Specify tensor aliasing in C++ op schema #7597
The PyTorch C++ op schema [1] allows specifying tensor storage aliasing by annotating `(a)` after input/output types. Torch Inductor uses this information to determine where to insert explicit `del` statements for tensors that are no longer needed.

If what an op schema specifies disagrees with the op implementation, Inductor-generated code is likely to release tensors earlier than expected, leading to wrong results.

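
To illustrate what the annotation conveys, here is a minimal sketch of a toy parser that reads alias sets out of a schema string. It is not PyTorch's own parser and not part of this change; the op name `my_ns::passthrough`, the helper names, and the regex are all assumptions made for the example.

```python
import re

# Toy matcher (an assumption, not PyTorch's parser): finds alias-set
# annotations such as `Tensor(a)` or `Tensor(a!)` in a schema string.
ALIAS_RE = re.compile(r"Tensor\((\w+)(!?)\)")


def alias_sets(schema: str):
    """Return (input alias sets, output alias sets) named in the schema."""
    args, _, returns = schema.partition("->")
    inputs = {m.group(1) for m in ALIAS_RE.finditer(args)}
    outputs = {m.group(1) for m in ALIAS_RE.finditer(returns)}
    return inputs, outputs


def output_may_alias_input(schema: str) -> bool:
    """True if some output shares an alias set with some input."""
    inputs, outputs = alias_sets(schema)
    return bool(inputs & outputs)


# Hypothetical op that returns its first argument unchanged.
UNANNOTATED = "my_ns::passthrough(Tensor x) -> Tensor"
ANNOTATED = "my_ns::passthrough(Tensor(a) x) -> Tensor(a)"

if __name__ == "__main__":
    for schema in (UNANNOTATED, ANNOTATED):
        print(schema, "=> output may alias input:", output_may_alias_input(schema))
```

With the unannotated schema, a consumer of the schema has no indication that the output shares storage with `x`, so it may treat `x` as dead once the op returns. The annotated form states the sharing explicitly.
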
`wait_allgather` and `release_param` return their first argument unchanged, so that aliasing should be annotated in the schema.

Also remove the code related to `clone_custom_op_output`, as it is solely a workaround for the aforementioned issue.

[1] https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/README.md

Fixes: #7596