-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[test stats] use published test stats for sharding #81116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
🔗 Helpful links
❌ 1 New FailuresAs of commit b5c388a (more details on the Dr. CI page): Expand to see more
🕵️ 1 new failure recognized by patternsThe following CI failures do not appear to be due to upstream breakages
|
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. [ghstack-poisoned]
janeyx99
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good, but could you verify that the same tests were run as before in the stats before landing?
Brief glance at the logs also looks good.
|
It should have been better documented, but the reason why test sharding is preserved at the build time, is to avoid a situation when test suite will only partially be run (or some tests would be run twice) as result of network flakiness After you've removed this logic, its possible that some shards would build the schedule based on latest nightly stats, while others will use default ones (if S3 is unavailable for some reason) To avoid that, either submit a follow-up fix, or restore the logic |
Summary: Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. Pull Request resolved: #81116 Approved by: https://github.com/janeyx99 Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/9f58d5d7ce7d2779f08b8fa8cbbb52ba1bf153af Reviewed By: DanilBaibak Differential Revision: D37782040 Pulled By: suo fbshipit-source-id: b50e2efe8f26fbc05361d815625921571d2ae5d3
|
Wait but the test jsons preserved at build time are downloaded to the test
jobs…from s3 lol
…On Tue, Jul 12, 2022 at 6:41 AM Nikita Shulga ***@***.***> wrote:
It should have been better documented, but the reason why test sharding is
preserved at the build time, is to avoid a situation when test suite will
only partially be run (or some tests would be run twice) as result of
network flakiness
After you've removed this logic, its possible that some shards would build
the schedule based on latest nightly stats, while others will use default
ones (if S3 is unavailable for some reason)
To avoid that, either submit a follow-up fix, or restore the logic
—
Reply to this email directly, view it on GitHub
<#81116 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAMK4EGU57WWGMUFOAJVZBTVTVYYDANCNFSM53BS5IYQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
After #81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build jobs version. This can cause an issues where different shards pull different versions of the file, leading to incorrect sharding (ex two shards running the same tests file on accident). This generally happens if the test jobs happen while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day. In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build jobs uploaded artifacts. If there is no test times file in the build job's artifacts, we fall back to the default sharding plan. Notes: * script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan * I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build ### Test Plan Verified that the number of tests ran in the pull and trunk workflows are similar to workflows run on master. Checked logs to see if artifacts were being used for sharding. Spot checked a few test configs to check that their lists of selected tests didn't overlap. Pull Request resolved: #81915 Approved by: https://github.com/huydhn
After #81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build jobs version. This can cause an issues where different shards pull different versions of the file, leading to incorrect sharding (ex two shards running the same tests file on accident). This generally happens if the test jobs happen while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day. In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build jobs uploaded artifacts. If there is no test times file in the build job's artifacts, we fall back to the default sharding plan. Notes: * script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan * I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build ### Test Plan Verified that the number of tests ran in the pull and trunk workflows are similar to workflows run on master. Checked logs to see if artifacts were being used for sharding. Spot checked a few test configs to check that their lists of selected tests didn't overlap. Pull Request resolved: #81915 Approved by: https://github.com/huydhn
…81915) Summary: After #81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build jobs version. This can cause an issues where different shards pull different versions of the file, leading to incorrect sharding (ex two shards running the same tests file on accident). This generally happens if the test jobs happen while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day. In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build jobs uploaded artifacts. If there is no test times file in the build job's artifacts, we fall back to the default sharding plan. Notes: * script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan * I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build ### Test Plan Verified that the number of tests ran in the pull and trunk workflows are similar to workflows run on master. Checked logs to see if artifacts were being used for sharding. Spot checked a few test configs to check that their lists of selected tests didn't overlap. Pull Request resolved: #81915 Approved by: https://github.com/huydhn Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/86f038dd56dab9ecec5893b60efc74e46ca19e36 Reviewed By: osalpekar Differential Revision: D38252585 Pulled By: clee2000 fbshipit-source-id: 912b5fa0977647a79785e24613355ff0879bcacf

Stack from ghstack (oldest at bottom):
Use the nightly-published test stats to perform sharding, instead of
calculating it in every build job.