Skip to content

Conversation

@satgera
Copy link
Contributor

@satgera satgera commented Sep 19, 2019

Stack from ghstack:

[pytorch] [distributed] Corrected variable name and added test

Differential Revision: D17488846

@pytorchbot pytorchbot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Sep 19, 2019
satgera added a commit that referenced this pull request Sep 19, 2019
[pytorch] [distributed] Corrected variable name and added test

Differential Revision: [D17488846](https://our.internmc.facebook.com/intern/diff/D17488846/)

ghstack-source-id: 90454793
Pull Request resolved: #26503
# TODO: add try-except and destroy _agent in all processes if any fails.
_agent = ProcessGroupAgent(self_name, group, num_send_recv_threads)
init_rref_context(_agent)
elif is_backend_registered(rpc_backend):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this code covered by tests earlier? Why didn't we see some sort of failure since rpc_backend wasn't initialized anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't covered by any tests, just added one for invalid backend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, didn't spot this in last review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't feel bad Shen, it was my diff, I am too used to letting c/c++ compiler catching undeclared variables, getting used to python.

# TODO: add try-except and destroy _agent in all processes if any fails.
_agent = ProcessGroupAgent(self_name, group, num_send_recv_threads)
init_rref_context(_agent)
elif is_backend_registered(rpc_backend):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, didn't spot this in last review.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in b401e9d.

mingbowan pushed a commit to mingbowan/pytorch that referenced this pull request Sep 23, 2019
Summary:
Pull Request resolved: pytorch#26503

[pytorch] [distributed] Corrected variable name and added test
ghstack-source-id: 90454793

Test Plan: Made sure pg based UT works.

Differential Revision: D17488846

fbshipit-source-id: 6e6cba110a6f61ee1af3d37c5a41c69701de1a8b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged oncall: distributed Add this issue/PR to distributed oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants