[test][easy] Add debug utils for cpu select algorithm test #135038
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135038
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 2fa2703 with merge base 04118d8. This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D62005445
@jgong5 can you take a look?
jgong5 left a comment
LGTM. It would be better to wrap it with an `is_fbcode()` check, though.
Since it is only used for debugging an fbcode issue, can we wrap it with an `is_fbcode()` check?
bda4716 to 40b6195
…35038)
Summary: Pull Request resolved: pytorch#135038
Add debug utils to debug a flaky test in fbcode ci.
Test Plan: ci
Differential Revision: D62005445
…35038)
Summary: Pull Request resolved: pytorch#135038
Add debug utils to debug a flaky test in fbcode ci.
Test Plan: ci
```
buck2 test -j 18 'fbcode//mode/opt' fbcode//caffe2/test/inductor:cpu_select_algorithm_cpu -- --exact 'caffe2/test/inductor:cpu_select_algorithm_cpu - test_linear_with_pointwise_batch_size_384_in_features_196_out_features_384_bias_False_epilogue_sigmoid_cpu_bfloat16 (caffe2.test.inductor.test_cpu_select_algorithm.TestSelectAlgorithmCPU)' --run-disabled --stress-runs 10 --record-results --print-passing-details
```
Differential Revision: D62005445
40b6195 to 2fa2703
@pytorchbot merge -i
Merge started
Your change will be merged while ignoring the following 0 checks:
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
It looks like in some cases we have Quickest way forward might be to add that check in the if conditions. |
Correction: |
…35038)
Summary: Add debug utils to debug a flaky test in fbcode ci.
Some context: pytorch#126545
Test Plan: ci
Differential Revision: D62005445
Pull Request resolved: pytorch#135038
Approved by: https://github.com/jgong5, https://github.com/XuehaiPan
…136290)
Sometimes the test runs on an older CPU, e.g. Intel(R) Xeon(R) CPU E5-2680 v4. Its `lscpu` output does not list `avx512_bf16` among the flags, which probably means bf16 is not supported on that hardware, so the unit test can fail. This change adds the check in the code.
Context: #135038
Differential Revision: D62984129
Pull Request resolved: #136290
Approved by: https://github.com/XuehaiPan, https://github.com/chenyang78
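For readers who want to reproduce the `lscpu` observation on their own machine, here is a minimal sketch of a flag check. It is not the code merged in #136290; the helper names (`parse_cpu_flags`, `has_avx512_bf16`, `host_has_avx512_bf16`) are hypothetical, and it assumes a Linux host where `/proc/cpuinfo` exposes a `flags` line, which is the same data `lscpu` reports.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        # Each entry looks like "key<tabs>: value"; split on the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_avx512_bf16(cpuinfo_text: str) -> bool:
    """Return True if the avx512_bf16 flag appears in the given text."""
    return "avx512_bf16" in parse_cpu_flags(cpuinfo_text)


def host_has_avx512_bf16(path: str = "/proc/cpuinfo") -> bool:
    """Check the current host; assume no support if the file is unreadable."""
    try:
        return has_avx512_bf16(Path(path).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("avx512_bf16 available:", host_has_avx512_bf16())
```

A test could use a check of this kind in a skip condition so that bf16 cases are not run on hosts without the flag. Whether the flag's absence fully explains the flaky failure is the commit author's hypothesis, not something this sketch verifies.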
Summary: Add debug utils to debug a flaky test in fbcode ci.
Some context: #126545
Test Plan: ci
Differential Revision: D62005445
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang