Add facility to run dynamo UTs for non-cuda devices #140929
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140929
Note: Links to docs will display an error until the docs builds have been completed.
⏳ 1 Pending, 1 Unrelated Failure as of commit a23bf65 with merge base 0c05832. BROKEN TRUNK: the following job failed but was also present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
(force-pushed from b2b4103 to c5998de)
(force-pushed from c5998de to fc10fc7)
|
@anijain2305: Can you please help with the review and approval? This is in line with #130714. |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
(force-pushed from fc10fc7 to 00ba589)
(force-pushed from 364e8c1 to 9086f92)
|
@anijain2305: Gentle reminder, can you please help with the approval? Thanks! |
|
@anijain2305: Can you please help with the approval? Thanks. |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Rebase failed due to Command |
(force-pushed from c3fcbcd to 78e9f6a)
(force-pushed from 2abfef3 to 6fa059a)
|
Hello @kwen2501, @guangyey, @anijain2305, @albanD, @EikanWang: can any one of you please help with this approval? It's been pending for long. Thank you. |
kwen2501
left a comment
LGTM. @anijain2305 @angelayi @ydwu4 @tugsbayasgalan please review to confirm, thanks!
EikanWang
left a comment
In general, it is fine to add device information to the test cases for flexibility. However, the `only_for` argument of `instantiate_device_type_tests` does not align with the original semantics: it severs other devices. For example, it cannot support the privateuse1 backend.
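To illustrate the concern being raised: `only_for` restricts instantiation to a fixed allow-list, so any backend not on the list is dropped. The helper below is a hypothetical, simplified stand-in for the filtering behavior (the real logic lives in `torch.testing._internal.common_device_type`), not the actual implementation.

```python
# Hypothetical sketch of only_for filtering; names here are stand-ins.
def filter_devices(available, only_for=None):
    """Keep only the backends listed in only_for; everything else is dropped."""
    if only_for is None:
        return list(available)
    return [d for d in available if d in only_for]

# A dynamically registered backend such as privateuse1 is silently excluded:
available = ["cpu", "cuda", "hpu", "privateuse1"]
print(filter_devices(available, only_for=("cuda", "hpu")))  # ['cuda', 'hpu']
```

This is why a hard-coded `only_for` list cannot pick up an out-of-tree privateuse1 backend, even if that backend could run the tests.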
ditto
ditto
ditto
test/dynamo/test_backends.py (Outdated)
ditto
|
In terms of the landed #130714, it introduced a similar issue. We need to track it and refine the test case by removing |
These were running for "only cuda" till now; if other accelerators want the support, they can verify and add themselves to the list. |
|
@EikanWang: Note that these tests were written only for cuda; the PR here removes that and generalizes them for other accelerators. If other accelerators support the TC, they can add themselves to the list. Hope the intent is clear. |
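The generalization pattern under discussion can be sketched as follows: instead of a hard CUDA-only guard (e.g. `requires_cuda`), the test body takes a `device` argument and one concrete test is stamped out per backend in an opt-in list. The class and list names below are illustrative, not taken from the PR.

```python
# Hedged sketch of per-device test generation; names are illustrative.
import unittest

DEVICE_LIST = ["cpu", "cuda"]  # other accelerators append themselves here

class GenericDeviceTests(unittest.TestCase):
    pass

def _make_test(device):
    def test(self):
        # a real test would allocate tensors on `device` and run the workload
        self.assertIn(device, DEVICE_LIST)
    return test

# stamp out one concrete test method per opted-in backend
for dev in DEVICE_LIST:
    setattr(GenericDeviceTests, f"test_sample_{dev}", _make_test(dev))
```

An accelerator that wants coverage only needs to add its backend string to the list; no per-device test class has to be written by hand.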
|
Per my understanding, it is HPU-biased. Let's focus on the HPU enabling rather than limiting the scope.
Actually, no. For example, https://github.com/pytorch/pytorch/blob/main/test/dynamo/test_activation_checkpointing.py#L1116 works for CPU as well. Why does this PR limit the case to CUDA? |
|
@EikanWang: The CPU TC related comment is a "valid" one and will be addressed, but the general device related comment is unacceptable. This is towards a goal of device abstraction in the UTs. The scheme is already followed in all ops related modules, e.g. test_ops.py, and was most recently added for FSDP and DTensor as well. I would request @albanD to moderate here. Thanks. |
@ankurneog , I agreed with that to use
|
With respect to editing the semantics to enable cpu tests, we can simply maintain two lists which are passed to different classes as and when required. `instantiate_device_type_tests` will automatically run tests for all devices present in the list passed to it (`only_for`), and hence gives us the flexibility to cover both multi-device tests and single-device ones. One way this can be approached is as done in #138216; it would be great to discuss better ways to approach this as well.
I agree that we should remove `requires_cuda`, as this makes the test cuda specific. |
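The two-list idea above can be sketched like this: single-device and multi-device suites each carry their own backend list, and each list is passed separately as `only_for`. The helper below is a stand-in that only mirrors the fan-out of `instantiate_device_type_tests` (which generates one concrete class per allowed device); the list contents and names are assumptions for illustration.

```python
# Stand-in sketch of the two-list approach; backend lists are illustrative.
SINGLE_DEVICE_BACKENDS = ("cpu", "cuda", "hpu")
MULTI_DEVICE_BACKENDS = ("cuda", "hpu")  # backends with multi-device support

def instantiate_stub(base_name, only_for):
    # the real helper creates TestFooCPU, TestFooCUDA, ...; here we just
    # return the generated class names to show the per-device fan-out
    return [f"{base_name}{d.upper()}" for d in only_for]

print(instantiate_stub("SingleDeviceTests", SINGLE_DEVICE_BACKENDS))
print(instantiate_stub("MultiDeviceTests", MULTI_DEVICE_BACKENDS))
```

Keeping the lists separate lets a backend opt into single-device coverage without claiming multi-device support, which addresses the semantics concern without hard-coding CUDA.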
@EikanWang , @guangyey : Please see my comment on the reason for inclusion of |
(force-pushed from 6fa059a to a23bf65)
|
@ankurneog, I just ran the CI. Let's wait for the CI signal. |
|
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team. |
This is in line with the changes introduced in #130714; additional files are included to support non-cuda devices.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames