
Conversation

@ankurneog

@ankurneog ankurneog commented Nov 18, 2024

This is in line with the changes introduced in #130714; additional files are included to support non-CUDA devices.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames

@pytorch-bot

pytorch-bot bot commented Nov 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140929

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 1 Unrelated Failure

As of commit a23bf65 with merge base 0c05832 (image):

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@ankurneog
Author

@anijain2305 : Can you please help with the review and approval? This is along the lines of #130714.
Thanks

@bdhirsh bdhirsh requested a review from anijain2305 November 26, 2024 00:24
@bdhirsh bdhirsh added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Nov 26, 2024
@ankurneog
Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased dynamo_backend_hpu onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout dynamo_backend_hpu && git pull --rebase)

@netlify

netlify bot commented Dec 6, 2024

Deploy Preview for chimerical-cranachan-793287 ready!

Name Link
🔨 Latest commit 364e8c19eb73d276ed6ba91ea9d7282a3eb4f923
🔍 Latest deploy log https://app.netlify.com/sites/chimerical-cranachan-793287/deploys/6752882b5a95220008248e49
😎 Deploy Preview https://deploy-preview-140929--chimerical-cranachan-793287.netlify.app

@ankurneog
Author

@anijain2305 : Gentle reminder, can you please help with the approval? Thanks!

@ankurneog
Author

@anijain2305 : Can you please help with the approval? Thanks

@ankurneog
Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/140929/head returned non-zero exit code 1

Rebasing (1/4)
Auto-merging test/dynamo/test_activation_checkpointing.py
Auto-merging test/dynamo/test_backends.py
Auto-merging test/dynamo/test_export.py
Auto-merging test/dynamo/test_modules.py
CONFLICT (content): Merge conflict in test/dynamo/test_modules.py
error: could not apply 00ba5897923... Add facility to run dynamo UTs for non-cuda devices
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Could not apply 00ba5897923... Add facility to run dynamo UTs for non-cuda devices

Raised by https://github.com/pytorch/pytorch/actions/runs/12390920075

@ankurneog
Author

ankurneog commented Jan 17, 2025

Hello @kwen2501 , @guangyey , @anijain2305 , @albanD , @EikanWang : can any one of you please help with this approval? It has been pending for a long time. Thank you.

Collaborator

@kwen2501 kwen2501 left a comment

LGTM. @anijain2305 @angelayi @ydwu4 @tugsbayasgalan please review to confirm, thanks!

@kwen2501 kwen2501 requested review from angelayi and ydwu4 January 17, 2025 07:05
Collaborator

@EikanWang EikanWang left a comment

In general, it is fine to add device information to the test cases for flexibility. However, the only_for of instantiate_device_type_tests does not align with the original semantics: it does not serve other devices. For example, it cannot support the privateuse1 backend.
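To illustrate the concern, here is a minimal sketch of how an only_for-style allow-list behaves; instantiate_for_devices is a toy stand-in for PyTorch's instantiate_device_type_tests, not the real API, and shows why a backend registered as privateuse1 can never opt in:

```python
# Simplified sketch of only_for filtering: any device type not on the
# allow-list is dropped, even if that backend is present and capable.
def instantiate_for_devices(test_names, available_devices, only_for):
    """Generate one per-device test name for each allow-listed device."""
    selected = [dev for dev in available_devices if dev in only_for]
    return [f"{name}_{dev}" for name in test_names for dev in selected]

# "privateuse1" is silently excluded because it is not on the allow-list.
generated = instantiate_for_devices(
    ["test_backend"],
    ["cpu", "cuda", "hpu", "privateuse1"],
    only_for=("cuda", "hpu"),
)
print(generated)  # ['test_backend_cuda', 'test_backend_hpu']
```

With an except_for-style deny-list instead, new backends would participate by default rather than needing to be added explicitly.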

Collaborator

ditto

Collaborator

ditto

Collaborator

ditto

Comment on lines +397 to +399
Collaborator

ditto

@EikanWang
Collaborator

In terms of the landed #130714, it introduced a similar issue. We need to track it and refine the test cases by removing only_for or extending it.

@ankurneog
Author

In terms of the landed #130714, it introduced a similar issue. We need to track it and refine the test cases by removing only_for or extending it.

These were running only for CUDA until now; if other accelerators want the support, they can verify and add themselves to the list.

@ankurneog
Author

@EikanWang : Note that these tests were written only for CUDA. This PR removes that restriction and generalizes them for other accelerators; if other accelerators support the test cases, they can add themselves to the list. I hope the intent is clear.

@EikanWang
Collaborator

EikanWang commented Jan 17, 2025

Per my understanding, it is HPU-biased. Let's focus on HPU enabling rather than limiting the scope to cuda and hpu.

@EikanWang : Note that these tests were written only for CUDA. This PR removes that restriction and generalizes them for other accelerators; if other accelerators support the test cases, they can add themselves to the list. I hope the intent is clear.

Actually, no. For example, https://github.com/pytorch/pytorch/blob/main/test/dynamo/test_activation_checkpointing.py#L1116 works for CPU as well. Why does this PR limit the case to CUDA?

@ankurneog
Author

@EikanWang : The CPU test-case-related comment is a valid one and will be addressed, but the general device-related comment is unacceptable. This works toward the goal of device abstraction in the UTs. The scheme is already followed in all op-related modules, e.g. test_ops.py, and was most recently added for FSDP and DTensor as well. I would request @albanD to moderate here. Thanks.

@EikanWang
Collaborator

EikanWang commented Jan 17, 2025

@EikanWang : The CPU test-case-related comment is a valid one and will be addressed, but the general device-related comment is unacceptable. This works toward the goal of device abstraction in the UTs. The scheme is already followed in all op-related modules, e.g. test_ops.py, and was most recently added for FSDP and DTensor as well. I would request @albanD to moderate here. Thanks.

@ankurneog , I agree with using instantiate_device_type_tests to support different devices. My point is that only_for may not align with the original semantics.

  • only_for("cuda", "hpu") indicates that all the test cases work on cuda and hpu only. However, some test cases are intended to test CPU, like test_compile_selective_checkpoint_parametrization and test_compile_selective_checkpoint_invalid_context.
  • only_for("cuda", "hpu") intends to enable cuda and hpu. However, most of the test cases are decorated with requires_cuda, so hpu will not work as expected: most test cases will be skipped.
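The interaction in the second bullet can be sketched with toy stand-ins (requires_cuda and instantiate below are illustrative simplifications, not the real PyTorch helpers): a generic class instantiated for hpu has every requires_cuda-guarded test skipped.

```python
import unittest

# Toy requires_cuda: skips unless the instantiated copy's device is cuda.
def requires_cuda(fn):
    def wrapper(self):
        if self.device != "cuda":
            raise unittest.SkipTest("requires CUDA")
        return fn(self)
    return wrapper

class GenericTests(unittest.TestCase):
    device = None  # filled in by per-device instantiation

    @requires_cuda
    def test_checkpoint(self):
        self.assertTrue(True)

def instantiate(cls, device):
    # Create a per-device copy of the generic test class.
    return type(f"{cls.__name__}{device.upper()}", (cls,), {"device": device})

def run(cls):
    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(cls).run(result)
    return result

hpu_result = run(instantiate(GenericTests, "hpu"))    # everything skipped
cuda_result = run(instantiate(GenericTests, "cuda"))  # runs normally
```

Under this sketch, enabling hpu via only_for achieves nothing while the per-test guard remains CUDA-specific.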

@AnantGulati
Contributor

@EikanWang

  • only_for("cuda", "hpu") indicates that all the test cases work on cuda and hpu only. However, some test cases are intended to test CPU, like test_compile_selective_checkpoint_parametrization and test_compile_selective_checkpoint_invalid_context.

With respect to editing the semantics to enable CPU tests, we can simply maintain two lists, which are passed to different classes as and when required.

instantiate_device_type_tests will automatically run tests for all devices present in the list passed to it (only_for), and hence gives us the flexibility to handle cases that require multi-device tests as well as single-device ones.

One way to approach this is as done in #138216; it would be great to discuss better approaches as well.

  • only_for("cuda", "hpu") intends to enable cuda and hpu. However, most of the test cases are decorated with requires_cuda, so hpu will not work as expected: most test cases will be skipped.

I agree that we should remove requires_cuda, as it makes the tests CUDA-specific.
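The "two lists" idea above can be sketched as follows (the list names and the second test name are illustrative, not from the PR): CPU-capable generic classes are instantiated with one device list, accelerator-only classes with another.

```python
# Hypothetical device lists; in practice each would be passed as only_for
# to instantiate_device_type_tests for the matching generic test class.
DEVICES_WITH_CPU = ("cpu", "cuda", "hpu")
ACCELERATOR_ONLY = ("cuda", "hpu")

def instantiate_for(only_for, test_names):
    """Toy per-device instantiation: one test name per allow-listed device."""
    return [f"{name}_{dev}" for name in test_names for dev in only_for]

cpu_capable = instantiate_for(
    DEVICES_WITH_CPU, ["test_compile_selective_checkpoint_parametrization"]
)
accel_only = instantiate_for(ACCELERATOR_ONLY, ["test_distributed_checkpoint"])
```

This keeps the CPU-intended cases runnable on CPU without widening the device list for cases that genuinely need an accelerator.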

@ankurneog
Author

@EikanWang : The CPU test-case-related comment is a valid one and will be addressed, but the general device-related comment is unacceptable. This works toward the goal of device abstraction in the UTs. The scheme is already followed in all op-related modules, e.g. test_ops.py, and was most recently added for FSDP and DTensor as well. I would request @albanD to moderate here. Thanks.

@ankurneog , I agreed with that to use instantiate_device_type_tests to support different devices. My point is only_for may not align with the original semantics.

  • only_for("cuda", "hpu") skips test cases like test_compile_selective_checkpoint_parametrization and test_compile_selective_checkpoint_invalid_context.
  • only_for("cuda", "hpu") intends to enable cuda and hpu. However, most of the test cases are decorated with requires_cuda, so hpu will not work as expected: most test cases will be skipped.

@EikanWang , @guangyey : Please see my comment on the reason for including @requires_cuda: the decorator ultimately checks for Triton-capable NVIDIA hardware (this needs to be decoupled eventually), hence the CI fails.
We can clean this up later after modifying the whole requires_cuda logic. In the meantime we can leave it as is, since the original functionality is unchanged.
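Decoupling that check from CUDA could look roughly like this; has_triton_for and the capability table are hypothetical, sketched only to show the shape of a device-agnostic guard rather than any real PyTorch API.

```python
import unittest

# Hypothetical capability table; a real version would query each backend.
_TRITON_CAPABLE = {"cuda": True, "hpu": True, "cpu": False}

def has_triton_for(device):
    """Hypothetical helper: does this device type have a Triton backend?"""
    return _TRITON_CAPABLE.get(device, False)

def requires_triton(device):
    """Skip the decorated test unless the device type supports Triton."""
    return unittest.skipUnless(
        has_triton_for(device), f"no Triton backend for {device}"
    )

@requires_triton("cpu")
def cpu_case(self):
    pass

@requires_triton("hpu")
def hpu_case(self):
    pass
```

With a guard keyed on the device type rather than on NVIDIA hardware, the same decorated tests could run wherever a Triton backend exists.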

@EikanWang
Collaborator

@ankurneog , I just triggered the CI. Let's wait for the CI signal.

@ankurneog
Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 20, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@ankurneog ankurneog deleted the dynamo_backend_hpu branch January 21, 2025 06:55

Labels

ciflow/trunk, Merged, module: dynamo, open source, topic: not user facing, triaged

8 participants