-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[Inductor] Add DeviceAssert op to enable device-side assertion in torch.compile #160677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160677
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit cc442eb with merge base c55bdb2 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
torch/_inductor/ir.py
Outdated
| ) | ||
|
|
||
|
|
||
| class DeviceAssert(ExternKernel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should fuse the device assertions. Which means it should be an assertion, and we avoid DCEing buffers with the assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feedback. Removed the ExternKernel approach and changed it to use the ops handler method and modified has_side_effect method to return True to avoid DCE. (b42cb7c)
5bea603 to
71ca719
Compare
mlazos
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
eellison
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to define this on FusedSchedulerNode as well, to check all of its snodes. (although its unlikely we will fuse multiple nodes then dce them)
| ) | ||
| return buffers_store_as_atomic_add | ||
|
|
||
| def has_side_effects(self) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might as well cache_on_self this, because it doesnt change after instantiation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I added has_side_effects method in FusedSchedulerNode and used cache_on_self.
|
|
||
| @cache_on_self | ||
| def has_side_effects(self) -> bool: | ||
| if self.snodes is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should just be any(node.has_side_effects() for node in self.snodes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, fixed it.
|
@pytorchbot retest this please |
|
❌ 🤖 pytorchbot command failed: Try |
|
@pytorchbot merge |
Merge failedReason: This PR needs a If not, please add the To add a label, you can comment to pytorchbot, for example For more information, see Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot merge |
|
@karthickai can you resolve https://github.com/pytorch/pytorch/pull/160677/files#r2299347303 before merging again? |
8a357fd to
37142ff
Compare
|
@karthickai has imported this pull request. If you are a Meta employee, you can view this in D81197669. |
|
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged) |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: Meta Internal-Only Changes Check Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot merge -f "landed in fbcode" |
Merge startedYour change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
| except Exception: | ||
| return False | ||
|
|
||
| bisect_result = CompilerBisector.do_bisect(test_fn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - getting to this late.
I would avoid using CompilerBisector in this manner. If the intent is to test that different backends all threw, you can parameterize the test by backend so that we get a more explicit failure on which backend is not throwing. This would also allow you to test that "should throw" is in the error msg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your point, I will update the tests.
…ch.compile (pytorch#160677) This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in [this comment](pytorch#147282 (comment)). Changes Included - Implemented device_assert op and overrides has_side_effect to return True to avoid removal by dead code elimination. - Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor. - Added lowering for torch.ops.aten._assert_async.msg to convert assert calls into the ops_handler. - Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code. - Added test cases to verify both "should throw" and "should not throw" scenarios. Fixes pytorch#147282 Pull Request resolved: pytorch#160677 Approved by: https://github.com/mlazos
…n in torch.compile (pytorch#160677)" This reverts commit cddcaa1. Reverted pytorch#160677 on behalf of https://github.com/karthickai due to This is breaking tests on Rocm ([comment](pytorch#160677 (comment)))
…ch.compile (pytorch#160677) This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in [this comment](pytorch#147282 (comment)). Changes Included - Implemented device_assert op and overrides has_side_effect to return True to avoid removal by dead code elimination. - Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor. - Added lowering for torch.ops.aten._assert_async.msg to convert assert calls into the ops_handler. - Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code. - Added test cases to verify both "should throw" and "should not throw" scenarios. Fixes pytorch#147282 Pull Request resolved: pytorch#160677 Approved by: https://github.com/mlazos
…n in torch.compile (pytorch#160677)" This reverts commit 378edb0. Reverted pytorch#160677 on behalf of https://github.com/atalman due to new test is failing internally ([comment](pytorch#160677 (comment)))
…ch.compile (pytorch#160677) This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in [this comment](pytorch#147282 (comment)). Changes Included - Implemented device_assert op and overrides has_side_effect to return True to avoid removal by dead code elimination. - Commented out the assert_async_msg_decomp and functional_assert_async_msg_decomp decompositions to disable the default assert decomposition inside Inductor. - Added lowering for torch.ops.aten._assert_async.msg to convert assert calls into the ops_handler. - Implemented the codegen method for the device_assert op. This supports generating C++ and Triton code. - Added test cases to verify both "should throw" and "should not throw" scenarios. Fixes pytorch#147282 Pull Request resolved: pytorch#160677 Approved by: https://github.com/mlazos, https://github.com/atalman
Updated the DeviceAssert operation to match the behavior of Store, it will fixes the issue mentioned in [this PR](#163023) and updated testcases as Elias [suggested](#160677 (comment)). cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben mlazos choijon5 [ghstack-poisoned]
Updated the DeviceAssert operation to match the behavior of Store, it will fixes the issue mentioned in [this PR](#163023) and updated testcases as Elias [suggested](#160677 (comment)). Pull Request resolved: #163696 Approved by: https://github.com/mlazos
Updated the DeviceAssert operation to match the behavior of Store, it will fixes the issue mentioned in [this PR](#163023) and updated testcases as Elias [suggested](#160677 (comment)). Pull Request resolved: #163696 Approved by: https://github.com/mlazos
This PR introduces a device_assert op to trigger device-side assertions within torch.compile. This implementation is based on the suggestion in this comment.
Changes Included
Fixes #147282
cc @SherlockNoMad @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos