Conversation

@rohan-varma (Contributor) commented Nov 6, 2019:

Closes #28983. Adds documentation for the `torch.distributed.rpc` and `torch.distributed.autograd` modules. Also fixes and tidies up some of the docstrings in rpc/autograd, and makes some functions private so they don't show up in the documentation.

Note: Much of the text describing/explaining the RPC/RRef layers is taken from the following RFCs: #23110, #26759

@rohan-varma added labels `module: rpc` (Related to RPC, distributed autograd, RRef, and distributed optimizer) and `module: docs` (Related to our documentation, both in docs/ and docblocks), Nov 6, 2019
.. automodule:: torch.distributed.rpc
.. currentmodule:: torch.distributed.rpc

.. autofunction:: rpc_sync
Contributor:
Is there a way to automatically pull in all public methods of a module instead of specifying each one like this?

@rohan-varma (Author) replied Nov 11, 2019:

@pritamdamania87 it looks like other doc-related commits list out each public method; see for example 8915e27#diff-7aa5f4b7f9dedcb7a6ce129b9d3c5eb8. The Sphinx documentation (http://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) mentions we can recursively include members of a class/module, which I've updated this PR to use.
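For reference, the autodoc pattern being referred to looks roughly like this (a sketch based on the linked Sphinx documentation):

```rst
.. automodule:: torch.distributed.rpc
    :members:
```

With a bare `:members:` option, autodoc pulls in all public members of the module; underscore-prefixed names are excluded unless `:private-members:` is also given.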

@@ -0,0 +1,11 @@
.. role:: hidden
Contributor:

Is this hidden on purpose? I think it should be fine not to hide this for now, since it'll only be in master and won't show up in any stable docs.

@rohan-varma (Author):

Should we keep them hidden if the docs aren't polished completely?

Contributor:

(Also in response to similar question from @mrshenli)
As @pritamdamania mentioned, it's fine to have unfinished docs out in master; however, tagging them as hidden can be a helpful way to keep track of what still needs work prior to the final release.

.. role:: hidden
:class: hidden-section

Distributed RPC Framework - torch.distributed.autograd and torch.distributed.rpc
Contributor:

Only "Distributed RPC Framework" should be sufficient; otherwise the title is too long.

@mrshenli (Contributor) left a comment:

Thanks for adding this!

Hey @jlin27, should we keep this hidden for now as we haven't finished polishing the docstrings yet?

RPC and RRef Framework
====================================================

Bbasics
Contributor:

Basics

Returns the result of running ``func`` on ``args`` and ``kwargs``.
Example::
Contributor:

Curious: why do we need the new line here?

@rohan-varma (Author):

Adding the blank line makes the code render nicely, as formatted Python, on the documentation page.
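Concretely, the docstring pattern under discussion looks like this (an illustrative sketch, not the exact docstring from the diff):

```rst
Returns the result of running ``func`` on ``args`` and ``kwargs``.

Example::

    >>> import torch
    >>> import torch.distributed.rpc as rpc
    >>> rpc.rpc_sync("worker1", torch.add, args=(torch.ones(2), 3))
```

The `::` marker plus the blank line tell Sphinx to render the indented block as a literal code block, which the theme then highlights as Python.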

@mrshenli requested a review from jlin27, November 11, 2019
@rohan-varma changed the title from "[WIP][docs] distributed framework docs" to "[docs] distributed framework docs", Nov 11, 2019
@rohan-varma changed the title from "[docs] distributed framework docs" to "[docs] Add docs for RPC, dist autograd, and RRef modules", Nov 11, 2019
Before using RPC and distributed autograd primitives, initialization must take place. First, a backend over which RPCs can be sent must be initialized. The default (and currently only available) implementation is the `ProcessGroup` backend, which must be initialized with `torch.distributed.init_process_group` before using other functions. See the `documentation for torch.distributed <https://pytorch.org/docs/stable/distributed.html>`_ for additional details. Next, the local RPC agent can be initialized, after which the process will be able to send and receive RPCs from all other connected processes.
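The initialization order described above can be sketched as follows. This is a minimal single-process example written against the current `torch.distributed.rpc` API (the exact initialization entry points have evolved since this PR); the worker name and port are arbitrary choices for illustration:

```python
import os

import torch
import torch.distributed.rpc as rpc

# Rendezvous information for the RPC backend; values here are arbitrary
# choices for a local, single-process demonstration.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"

# Initialize the local RPC agent. After this call the process can send
# RPCs to, and receive RPCs from, all other connected processes.
rpc.init_rpc("worker0", rank=0, world_size=1)

# Synchronously run torch.add on a named worker (here, ourselves) and
# block until the result comes back.
result = rpc.rpc_sync("worker0", torch.add, args=(torch.tensor(1), torch.tensor(2)))
print(result)  # tensor(3)

# Tear down the RPC agent.
rpc.shutdown()
```

In a real multi-process setup, each process calls `init_rpc` with its own name and rank and a shared `world_size`.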

.. automodule:: torch.distributed.rpc
:members:
@rohan-varma (Author):
@pritamdamania87 this will add all members of this module, but we'd want to be careful in using this since non-polished/private functions could show up.
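If unpolished or private functions do start leaking in, autodoc also accepts an explicit allow-list (a sketch; only `rpc_sync` appears in this diff, and any other names would need to exist in the module):

```rst
.. automodule:: torch.distributed.rpc
    :members: rpc_sync
```

Alternatively, `:exclude-members:` can be used to document everything public except a named few.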

@mrshenli (Contributor) left a comment:
The content LGTM! Please also get a stamp from @jlin27 on whether we should hide this for now and whether the structure is OK. Thanks!

Distributed RPC Framework
=====================================================

The distributed RPC framework provides mechanisms for multi-machine model training through a set of primitives to allow for remote communication, and a higher-level API to automatically differentiate models split across several machines.
Contributor:

Why do we have everything on a single long line? Can we split the lines in this file? Most doc files seem to wrap lines at 80 characters.

@rohan-varma changed the title from "[docs] Add docs for RPC, dist autograd, and RRef modules" to "Add docs for RPC, dist autograd, and RRef modules", Nov 12, 2019
@rohan-varma (Author):

@pritamdamania87 All the comments are addressed, could you take another look? Thanks!

@pritamdamania87 (Contributor) left a comment:

Looks good to me, although please check why the RRef class doesn't show up and see if we can fix it before landing.

@rohan-varma (Author):

@jlin27 Could you please take a look at the structure and whether it should be hidden before I land this? Thanks!

@facebook-github-bot (Contributor) left a comment:

@rohan-varma has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


.. role:: hidden
:class: hidden-section

Distributed RPC Framework
Contributor:

Let's mark this as experimental for now. Check out this page as an example.

@rohan-varma (Author):

Will do this in a follow-up PR.

@rohan-varma (Author) commented Nov 14, 2019:

All of the failures are pre-existing and not related to this PR. Proceeding with landing.

@facebook-github-bot (Contributor):

@rohan-varma merged this pull request in 06ef4a7.

csarofeen pushed a commit to mruberry/pytorch that referenced this pull request Nov 18, 2019
Summary:
Closes pytorch#28983. Documentation for `torch.distributed.rpc` and `torch.distributed.autograd` modules. Also fixes/tidies up some of the docstrings in rpc/autograd, and moves some functions to be private so they don't show up in the documentation.

Note: Much of the text describing/explaining the RPC/RRef layers is taken from the following RFCs: pytorch#23110, pytorch#26759
Pull Request resolved: pytorch#29276

Differential Revision: D18478754

Pulled By: rohan-varma

fbshipit-source-id: e9a7089baf5275304e5408d319eb9bf98e53fff8
@facebook-github-bot deleted the rpc_autograd_docs branch, July 13, 2020 17:57

Merging this pull request closed the issue: Add document page for torch.distributed.rpc/autograd