Conversation

@karthickai
Collaborator

@karthickai karthickai commented Aug 14, 2025

This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in this comment.

Changes Included

  • Implemented the device_assert op and overrode has_side_effect to return True so that dead code elimination does not remove it.
  • Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor.
  • Added a lowering for torch.ops.aten._assert_async.msg that converts assert calls into ops_handler calls.
  • Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code.
  • Added test cases to verify both "should throw" and "should not throw" scenarios.
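
To illustrate why the op has to report a side effect, here is a minimal sketch of a toy dead code elimination pass. It is not Inductor code: `Node`, `eliminate_dead_code`, and the graph below are hypothetical names made up for this example. The idea shown is only that a node whose result nobody reads survives if it is treated as a root.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy IR node; all names here are hypothetical, not Inductor classes."""
    name: str
    inputs: list = field(default_factory=list)
    side_effect: bool = False  # e.g. an assertion that may abort the kernel

    def has_side_effects(self) -> bool:
        return self.side_effect


def eliminate_dead_code(nodes, outputs):
    """Keep nodes reachable from the outputs or from side-effecting nodes."""
    by_name = {n.name: n for n in nodes}
    # Side-effecting nodes are roots even though nothing reads their result.
    worklist = list(outputs) + [n.name for n in nodes if n.has_side_effects()]
    live = set()
    while worklist:
        name = worklist.pop()
        if name in live:
            continue
        live.add(name)
        worklist.extend(by_name[name].inputs)
    # Preserve the original node order for the surviving nodes.
    return [n for n in nodes if n.name in live]


graph = [
    Node("x"),
    Node("cond", inputs=["x"]),
    Node("assert", inputs=["cond"], side_effect=True),
    Node("unused", inputs=["x"]),
    Node("y", inputs=["x"]),
]
kept = [n.name for n in eliminate_dead_code(graph, outputs=["y"])]
print(kept)  # ['x', 'cond', 'assert', 'y']
```

Without the `side_effect=True` flag, the toy pass would drop both `assert` and `cond`, which is the failure mode the has_side_effect override is meant to prevent.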

Fixes #147282

cc @SherlockNoMad @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos

@pytorch-bot

pytorch-bot bot commented Aug 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160677

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit cc442eb with merge base c55bdb2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@karthickai karthickai added the module: decompositions Topics related to decomposition (excluding PrimTorch) label Aug 14, 2025
)


class DeviceAssert(ExternKernel):
Contributor

We should fuse the device assertions. That means it should be an assertion, and we avoid DCEing buffers with the assertion.

Collaborator Author

Thank you for the feedback. I removed the ExternKernel approach, switched to the ops handler method, and modified the has_side_effect method to return True to avoid DCE (b42cb7c).

@karthickai karthickai marked this pull request as ready for review August 18, 2025 21:06
@karthickai karthickai requested a review from mlazos August 18, 2025 21:31
@karthickai karthickai force-pushed the viable/strict branch 2 times, most recently from 5bea603 to 71ca719 Compare August 21, 2025 16:44
Contributor

@mlazos mlazos left a comment

Looks good!

Contributor

@eellison eellison left a comment

You need to define this on FusedSchedulerNode as well, to check all of its snodes (although it's unlikely we will fuse multiple nodes and then DCE them).

        )
        return buffers_store_as_atomic_add

    def has_side_effects(self) -> bool:
Contributor

You might as well cache_on_self this, because it doesn't change after instantiation.

Collaborator Author

I have now added a has_side_effects method to FusedSchedulerNode and used cache_on_self.
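
For readers unfamiliar with the caching suggestion, the following is a minimal sketch of per-instance memoization for a zero-argument method. The `cache_on_self` defined here is a hypothetical stand-in written for this example, not the Inductor utility of the same name, and `ToyNode` is likewise made up.

```python
import functools


def cache_on_self(fn):
    """Hypothetical stand-in: memoize a zero-argument method on the instance."""
    attr = f"_{fn.__name__}_cache"

    @functools.wraps(fn)
    def wrapper(self):
        # Compute once, then reuse the value stored on the instance.
        if not hasattr(self, attr):
            setattr(self, attr, fn(self))
        return getattr(self, attr)

    return wrapper


class ToyNode:
    def __init__(self, ops):
        self.ops = ops
        self.calls = 0  # counts how often the method body actually runs

    @cache_on_self
    def has_side_effects(self) -> bool:
        self.calls += 1
        return "device_assert" in self.ops


node = ToyNode(["load", "device_assert", "store"])
results = [node.has_side_effects() for _ in range(3)]
print(results, node.calls)  # [True, True, True] 1
```

This kind of caching is only safe when the answer cannot change after the object is built, which is the condition the reviewer points out.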


    @cache_on_self
    def has_side_effects(self) -> bool:
        if self.snodes is not None:
Contributor

This should just be any(node.has_side_effects() for node in self.snodes).

Collaborator Author

Thank you, fixed it.
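
The shape of the suggested one-liner can be sketched with toy classes. `ToySchedulerNode` and `ToyFusedSchedulerNode` are hypothetical stand-ins that mirror only the `snodes` attribute and the `has_side_effects` method mentioned in this thread; the real scheduler classes carry far more state.

```python
class ToySchedulerNode:
    """Hypothetical stand-in for a single scheduler node."""

    def __init__(self, name, side_effect=False):
        self.name = name
        self._side_effect = side_effect

    def has_side_effects(self) -> bool:
        return self._side_effect


class ToyFusedSchedulerNode:
    """Hypothetical stand-in for a fused group of scheduler nodes."""

    def __init__(self, snodes):
        self.snodes = list(snodes)

    def has_side_effects(self) -> bool:
        # The generator expression stops at the first node that reports True.
        return any(node.has_side_effects() for node in self.snodes)


fused_with_assert = ToyFusedSchedulerNode(
    [ToySchedulerNode("add"), ToySchedulerNode("assert", side_effect=True)]
)
fused_pure = ToyFusedSchedulerNode(
    [ToySchedulerNode("add"), ToySchedulerNode("mul")]
)
print(fused_with_assert.has_side_effects(), fused_pure.has_side_effects())  # True False
```

A fused node with no members reports False here, since `any` over an empty iterable is False.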

@karthickai
Collaborator Author

@pytorchbot retest this please

@pytorch-bot

pytorch-bot bot commented Aug 22, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'retest' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'cherry-pick')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

Try @pytorchbot --help for more info.

@karthickai karthickai changed the base branch from viable/strict to main August 22, 2025 05:18
@karthickai
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 22, 2025
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@karthickai
Collaborator Author

@pytorchbot merge

@mlazos
Contributor

mlazos commented Aug 28, 2025

@karthickai can you resolve https://github.com/pytorch/pytorch/pull/160677/files#r2299347303 before merging again?

@mlazos mlazos self-requested a review August 28, 2025 01:34
@facebook-github-bot
Contributor

@karthickai has imported this pull request. If you are a Meta employee, you can view this in D81197669.

@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: Meta Internal-Only Changes Check

Details for Dev Infra team Raised by workflow job

@karthickai
Collaborator Author

@pytorchbot merge -f "landed in fbcode"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

            except Exception:
                return False

        bisect_result = CompilerBisector.do_bisect(test_fn)
Contributor

Sorry - getting to this late.

I would avoid using CompilerBisector in this manner. If the intent is to test that different backends all threw, you can parameterize the test by backend so that we get a more explicit failure on which backend is not throwing. This would also allow you to test that "should throw" is in the error msg.

Collaborator Author

I agree with your point; I will update the tests.
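
One way the parameterization suggested above could look is sketched below with `unittest.subTest`. The two backend functions are hypothetical placeholders so the example runs without PyTorch; in the real test suite they would be replaced by compiling the same function with each backend under test. The point shown is that each backend gets its own reported failure and that the error message is checked for "should throw".

```python
import unittest


def eager_backend(x):
    """Hypothetical backend that honours the assertion."""
    if not x > 0:
        raise RuntimeError("should throw")
    return x


def compiled_backend(x):
    """Hypothetical second backend with the same contract."""
    if not x > 0:
        raise RuntimeError("device assert failed: should throw")
    return x


BACKENDS = {"eager": eager_backend, "compiled": compiled_backend}


class DeviceAssertTest(unittest.TestCase):
    def test_assert_should_throw(self):
        for name, backend in BACKENDS.items():
            # subTest reports which backend failed instead of one opaque failure.
            with self.subTest(backend=name):
                with self.assertRaisesRegex(RuntimeError, "should throw"):
                    backend(-1)

    def test_assert_should_not_throw(self):
        for name, backend in BACKENDS.items():
            with self.subTest(backend=name):
                self.assertEqual(backend(1), 1)


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeviceAssertTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes both test methods; a backend that silently skips the assertion would show up as a named subtest failure.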

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ch.compile (pytorch#160677)

This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in [this comment](pytorch#147282 (comment)).

Changes Included

- Implemented the device_assert op and overrode has_side_effect to return True so that dead code elimination does not remove it.
- Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor.
- Added a lowering for torch.ops.aten._assert_async.msg that converts assert calls into ops_handler calls.
- Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code.
- Added test cases to verify both "should throw" and "should not throw" scenarios.

Fixes pytorch#147282

Pull Request resolved: pytorch#160677
Approved by: https://github.com/mlazos
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ch.compile (pytorch#160677)

This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in [this comment](pytorch#147282 (comment)).

Changes Included

- Implemented the device_assert op and overrode has_side_effect to return True so that dead code elimination does not remove it.
- Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor.
- Added a lowering for torch.ops.aten._assert_async.msg that converts assert calls into ops_handler calls.
- Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code.
- Added test cases to verify both "should throw" and "should not throw" scenarios.

Fixes pytorch#147282

Pull Request resolved: pytorch#160677
Approved by: https://github.com/mlazos, https://github.com/atalman
karthickai added a commit that referenced this pull request Sep 24, 2025
Updated the DeviceAssert operation to match the behavior of Store, which fixes the issue mentioned in [this PR](#163023), and updated the test cases as Elias [suggested](#160677 (comment)).



cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben mlazos choijon5 

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Sep 24, 2025
Updated the DeviceAssert operation to match the behavior of Store, which fixes the issue mentioned in [this PR](#163023), and updated the test cases as Elias [suggested](#160677 (comment)).

Pull Request resolved: #163696
Approved by: https://github.com/mlazos
jainapurva pushed a commit that referenced this pull request Sep 29, 2025
Updated the DeviceAssert operation to match the behavior of Store, which fixes the issue mentioned in [this PR](#163023), and updated the test cases as Elias [suggested](#160677 (comment)).

Pull Request resolved: #163696
Approved by: https://github.com/mlazos
Labels

  • ci-no-td Do not run TD on this PR
  • ciflow/inductor
  • ciflow/trunk Trigger trunk jobs on your pull request
  • Merged
  • module: decompositions Topics related to decomposition (excluding PrimTorch)
  • module: inductor
  • release notes: inductor
  • Reverted

Development

Successfully merging this pull request may close these issues.

assert fails to trigger inside torch.compile

6 participants