Add C10_EMBEDDED to gate ostream usage in Half/BFloat16 #140566
We want to use Half/BFloat16 in ExecuTorch to support shared kernel code. They will need to be used in ExecuTorch core, so they can't have streams. This diff introduces a macro to gate the stream code off.

Differential Revision: [D65888035](https://our.internmc.facebook.com/intern/diff/D65888035/)
✅ No failures as of commit 43c24d0 with merge base 891ba2e. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
This pull request was exported from Phabricator. Differential Revision: D65888035
existing executorch half/bfloat16 contain
Accessing the inactive member of a union is undefined behavior. Fortunately, we have c10::bit_cast. Differential Revision: [D65888680](https://our.internmc.facebook.com/intern/diff/D65888680/) Pull Request resolved: #140567 Approved by: https://github.com/Skylion007, https://github.com/malfet ghstack dependencies: #140564, #140565, #140566
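To illustrate the point about unions, here is a hedged sketch of a `memcpy`-based bit cast, which is the usual well-defined alternative to reading an inactive union member. The names `bit_cast_sketch` and `float_bits` are hypothetical; this is not the implementation of `c10::bit_cast`, only the general technique, and the usage below assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace sketch {

// Hypothetical helper: reinterpret the bytes of `src` as a value of type To.
// Copying through std::memcpy is well defined for trivially copyable types,
// unlike writing one union member and then reading another.
template <class To, class From>
To bit_cast_sketch(const From& src) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value,
                "To must be trivially copyable");
  static_assert(std::is_trivially_copyable<From>::value,
                "From must be trivially copyable");
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}

// Example use: view the raw bits of a float.
inline std::uint32_t float_bits(float f) {
  return bit_cast_sketch<std::uint32_t>(f);
}

}  // namespace sketch
```

Compilers typically optimize the `memcpy` call away, so this form usually costs nothing compared with the union approach.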
This was added in pytorch/executorch#1789 . I'm working on sharing Half.h with ExecuTorch, and this is a missing feature. Differential Revision: [D65949409](https://our.internmc.facebook.com/intern/diff/D65949409/) Pull Request resolved: #140720 Approved by: https://github.com/malfet ghstack dependencies: #140564, #140565, #140566, #140567
It passed PyTorch CI, but internally we saw failures from this. Differential Revision: [D66137897](https://our.internmc.facebook.com/intern/diff/D66137897/) Pull Request resolved: #140994 Approved by: https://github.com/malfet ghstack dependencies: #140564, #140565, #140566, #140567, #140720
Make what we're doing as obvious as possible to the compiler. Differential Revision: [D66108811](https://our.internmc.facebook.com/intern/diff/D66108811/) Pull Request resolved: #141035 Approved by: https://github.com/Skylion007, https://github.com/ezyang, https://github.com/malfet ghstack dependencies: #140564, #140565, #140566, #140567, #140720, #140994
We want to use Half/BFloat16 in ExecuTorch to support shared kernel code. They will need to be used in ExecuTorch core, so they can't have streams. This diff introduces a macro to gate the stream code off. Differential Revision: [D65888035](https://our.internmc.facebook.com/intern/diff/D65888035/) Pull Request resolved: pytorch#140566 Approved by: https://github.com/ezyang, https://github.com/malfet ghstack dependencies: pytorch#140564, pytorch#140565