-
Notifications
You must be signed in to change notification settings - Fork 26.3k
add process_group in convert_sync_batchnorm #19240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In line 508. convert_sync_batchnorm is called recursively to convert the bn to syncbn, thus the process_group also should be pass in the function.
ssnl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
|
Thank you. Is there an easy way to test this case? |
In my opinion, Thus, it might to need to recursively pass the A testcase might be use this fucntion to convert the bn to syncbn in resnet50, and then checkout whether all the 'syncbn' have been assigned a proper value of 'progress_group' . |
|
Thanks. Do you think you would have time to add this test? |
Thanks for replying. Do you mean that adding a testcase which like If that so, I might try it on. |
|
@zhangliliang It would be adding a test method in |
@ssnl Get it. I would try. |
|
I write an example to test the case. Do you consider whether it is a right testcase? @ssnl @ezyang Thanks. |
|
The test looks good to me. If you add a little explanation, like what you wrote in your comment, to the code, that would be perfect. 👍 |
|
@ezyang Do we always have torchvision install in CI? If not, the test should probably use a custom network. |
|
Yeah, we install torchvision, and there are existing tests which use it. Actually, this is probably changing soon cc @fmassa, but for now it shouldn't be a problem. |
…using convert_sync_batchnorm.
|
oh no we didn't include this 1.1! |
|
@pytorchbot rebase this please |
|
@pytorchbot merge this please |
|
Sorry for replying later. These days I occupied by some stuff. |
|
@zhangliliang No worries. It's not your fault. Thanks for your contribution! |
Thanks.~ |
|
@ssnl |
|
I haven't looked at the PR, but we have tests which are conditionally enabled depending on if torchvision is available; go take a look at them and copy the pattern. |
|
It looks like there were more changes since the merge request, I will wait for tests. |
@ezyang |
|
All checks have passed, now. |
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
In line 508.
convert_sync_batchnorm is called recursively to convert the bn to syncbn, thus the process_group also should be passed in the function.