Skip to content

Conversation

@yf225
Copy link
Contributor

@yf225 yf225 commented Sep 5, 2024

If an auto_functionalized HOP is included in backward graph due to activation checkpointing, we will run into a scenario where Compiled Autograd Dynamo tracing will need to trace through the auto_functionalized HOP. This PR adds support for it.

Test commands:

  • pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_auto_functionalized

Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @rec

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 5, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135169

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit c4bff9e with merge base 30b9894 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
@yf225 yf225 marked this pull request as draft September 5, 2024 01:57
@yf225 yf225 added the topic: not user facing topic category label Sep 5, 2024
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
@yf225 yf225 changed the title [WIP][Dynamo] allow tracing through auto_functionalized HOP [Dynamo] allow tracing through auto_functionalized HOP Sep 5, 2024
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang rec

[ghstack-poisoned]
yf225 added a commit that referenced this pull request Sep 5, 2024
@yf225 yf225 marked this pull request as ready for review September 5, 2024 05:11
@yf225 yf225 requested a review from zou3519 September 5, 2024 05:11
@yf225 yf225 changed the title [Dynamo] allow tracing through auto_functionalized HOP [Traceable FSDP2][Dynamo] allow tracing through auto_functionalized HOP Sep 5, 2024
@yf225 yf225 added the keep-going Don't stop on first failure, keep running tests until the end label Sep 5, 2024
@yf225
Copy link
Contributor Author

yf225 commented Sep 5, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 5, 2024
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@yf225
Copy link
Contributor Author

yf225 commented Sep 5, 2024

@pytorchbot merge -f "unrelated failure"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
…OP (pytorch#135169)

If an `auto_functionalized` HOP is included in backward graph due to activation checkpointing, we will run into a scenario where Compiled Autograd Dynamo tracing will need to trace through the `auto_functionalized` HOP. This PR adds support for it.

Test commands:
- `pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_auto_functionalized`

Pull Request resolved: pytorch#135169
Approved by: https://github.com/zou3519
@github-actions github-actions bot deleted the gh/yf225/118/head branch October 6, 2024 02:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request keep-going Don't stop on first failure, keep running tests until the end Merged module: dynamo module: inductor topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants