Conversation

@tugsbayasgalan (Contributor) commented Oct 23, 2024

Stack from ghstack (oldest at bottom):

In this diff, I make the test_torchbind.py tests handle the training IR. Today in the training IR we don't see the effect token and HOP, because those are added at FunctionalTensorMode. In the future we should probably move this logic up to the training IR so that writing passes etc. on the training IR is safer, but for migration purposes I think it is OK for now. I also fixed three bugs:

  1. ep.module() doesn't register all aliased constants in the module (see the first sketch below).
  2. When we retrace, we need to fakify the original TorchBind object (see the second sketch below).
  3. We don't run any DCE on the training IR, so we need to add some more torch ops to the verifier.
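
Two quick sketches to make (1) and (2) concrete. First, a minimal hypothetical module (not from the test suite) showing the aliasing pattern behind the constants bug:

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        const = torch.randn(3)
        # Both attributes alias the same constant tensor; the fix makes
        # ep.module() register every alias, not just the first one seen.
        self.a = const
        self.b = const

    def forward(self, x):
        return x + self.a + self.b

ep = torch.export.export(M(), (torch.randn(3),))
m = ep.module()  # m should now expose both aliased constants, a and b
```

And a hedged sketch of the retrace scenario in (2), assuming the _TorchScriptTesting._Foo class and its add_tensor method as used in test_torchbind.py (the torchbind test library must be loaded for this to run):

```python
import torch

class N(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.obj = torch.classes._TorchScriptTesting._Foo(10, 20)

    def forward(self, x):
        return x + self.obj.add_tensor(x)

ep = torch.export.export(N(), (torch.ones(3),), strict=False)
# Retracing: ep.module() carries the *real* script object, which has to be
# fakified again before tracing; this is what bug (2) was about.
ep2 = torch.export.export(ep.module(), (torch.ones(3),), strict=False)
```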

Differential Revision: D64853530

@pytorch-bot bot commented Oct 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138658

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 8b114b1 with merge base 924e726:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

tugsbayasgalan added a commit that referenced this pull request Oct 23, 2024
ghstack-source-id: d047eda
Pull Request resolved: #138658
return (getitem, sin)""", # noqa: B950
)

@unittest.expectedFailure # T205481814
Contributor

Not sure what's going on here. Other parts look good.

tugsbayasgalan added a commit that referenced this pull request Oct 23, 2024
ghstack-source-id: 0a2c7df
Pull Request resolved: #138658
tugsbayasgalan added a commit that referenced this pull request Oct 23, 2024
ghstack-source-id: 5a294cc
Pull Request resolved: #138658
@tugsbayasgalan (Contributor Author)

@tugsbayasgalan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 23, 2024
tugsbayasgalan added a commit that referenced this pull request Oct 29, 2024
ghstack-source-id: 1e298f5
Pull Request resolved: #138658

from torch._export.verifier import TrainingIRVerifier

print("GRAPH", gm.graph)
Contributor

nit

# as input, runtime assertion should fail. This is because we would create
# guard on y.shape[0] > x.shape[0] but somehow in old export, we dce this
# assertion.
if is_training_ir_test(self._testMethodName) and is_non_strict_test(
Contributor Author

cc: @pianpwk
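
A hedged sketch (hypothetical model, not the actual test) of how a relational guard like the one in the excerpt becomes a runtime assertion that the training IR now preserves instead of DCE-ing:

```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x, y):
        # An explicit relational guard between two dynamic dims; export
        # lowers it to a runtime assertion node in the graph.
        torch._check(y.shape[0] > x.shape[0])
        return y[: x.shape[0]] + x

ep = export(
    M(),
    (torch.randn(3), torch.randn(5)),
    dynamic_shapes={"x": {0: Dim("dx")}, "y": {0: Dim("dy")}},
)
# Per the excerpt above, old export DCE'd this assertion, while the
# training IR pipeline keeps it, so calling ep.module() with a y no
# larger than x should now fail at runtime.
```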

@facebook-github-bot (Contributor)

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

Ryo-not-rio pushed a commit to Ryo-not-rio/pytorch that referenced this pull request Dec 2, 2024

Differential Revision: [D64853530](https://our.internmc.facebook.com/intern/diff/D64853530)
Pull Request resolved: pytorch#138658
Approved by: https://github.com/ydwu4, https://github.com/zhxchen17
@github-actions github-actions bot deleted the gh/tugsbayasgalan/267/head branch December 5, 2024 02:15