[6/N][BE] Remove Sharded Linear Op for ShardedTensor #95948
fduwjj wants to merge 2 commits into gh/fduwjj/78/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/95948
Note: Links to docs will display an error until the docs builds have completed. ✅ No Failures as of commit a9d6b23. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@fduwjj has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@fduwjj I suspect that this PR causes a failing test in the CI periodic multigpu job, for example https://hud.pytorch.org/pytorch/pytorch/commit/893aa5df3f2a475c91ea8eadb1353812e52fb227. Should that test also be removed as part of this change removing linear ops for sharded tensor? Please help take a look.
@huydhn I will send a forward fix for that. Not sure why this was not caught in the CI flow.
Oh, it's in the periodic large test, no wonder...
Thank you for looking into this. That test is periodic, so it only runs every 4 hours or so. You could add the `ciflow/periodic` label to your PR to have it run. It's OK to forward-fix periodic tests, though.
We removed ShardedLinear in #95948, but it broke the TP_FSDP integration test because that test uses ShardedTensor. Migrating the test to DTensor fixes it. DTensor shards the bias too, so the test needs a small change.
…r tests depending on Sharded Linear (#96254): We removed ShardedLinear in #95948, but it broke the TP_FSDP integration test because that test uses ShardedTensor. Migrating the test to DTensor fixes it. DTensor shards the bias too, so the test needs a small change. Pull Request resolved: #96254. Approved by: https://github.com/huydhn
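The commit message notes that DTensor shards the bias along with the weight, which is why the migrated test needed adjusting. Here is a minimal plain-tensor sketch of what column-wise sharding of an `nn.Linear` implies, with no process group or DTensor involved; `world_size = 2` and the shapes are illustrative assumptions, not taken from the actual test:

```python
import torch

# Hypothetical illustration: column-wise sharding of an nn.Linear across
# world_size=2 ranks, simulated with plain tensors on one process.
torch.manual_seed(0)
world_size = 2
linear = torch.nn.Linear(4, 6)

# Shard the weight (out_features x in_features) along dim 0, so each "rank"
# holds a (3, 4) slice. The bias (out_features,) is sharded the same way,
# which is the behavioral difference the test had to account for.
weight_shards = torch.chunk(linear.weight, world_size, dim=0)
bias_shards = torch.chunk(linear.bias, world_size, dim=0)

x = torch.randn(2, 4)
# Each rank computes its local slice of the output; concatenating the
# per-rank slices reproduces the full linear output.
local_outs = [x @ w.t() + b for w, b in zip(weight_shards, bias_shards)]
full = torch.cat(local_outs, dim=1)
assert torch.allclose(full, linear(x), atol=1e-6)
```

The point of the sketch: under this sharding scheme there is no replicated full-size bias anywhere, so a test that previously compared against ShardedTensor parameter layouts has to expect sharded bias slices on each rank instead.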
Stack from ghstack (oldest at bottom):
Differential Revision: D43877082