@rohan-varma
Contributor

Per @pietern's comment in #30022, we can make this example launcher a bit simpler by using torch.multiprocessing.

for p in processes:
    p.join()
nprocs = 2
mp.spawn(_run_process, args=(i, (i + 1) % 2, file_name), nprocs)
Contributor

@mrshenli mrshenli Nov 24, 2019

It seems we should drop the first arg i in the args tuple? Otherwise, the actual args passed to fn would be i, i, (i + 1) % 2?

The function is called as fn(i, *args), where i is the process index and args is the passed through tuple of arguments.

Contributor

Also, pass nprocs as a keyword argument, since it comes after another keyword argument (args=...).
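As a small check of why this matters: Python rejects a positional argument that follows a keyword argument at compile time. The sketch below compiles both call forms as strings without running them, so `mp`, `_run_process`, and `file_name` do not need to exist. The helper name `is_valid_syntax` is hypothetical.

```python
def is_valid_syntax(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Positional `nprocs` after the keyword `args=...` is a SyntaxError.
bad_call = "mp.spawn(_run_process, args=(file_name,), nprocs)"
# Passing `nprocs` by keyword is accepted by the parser.
good_call = "mp.spawn(_run_process, args=(file_name,), nprocs=nprocs)"

print(is_valid_syntax(bad_call), is_valid_syntax(good_call))
```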

.. code::
import multiprocessing as mp
import torch.multiprocessing as mp
Contributor

Move this line to after import torch?

Contributor

@pietern pietern left a comment

Thanks for updating, @rohan-varma.

@rohan-varma
Contributor Author

rohan-varma commented Dec 30, 2019

This fell off my queue, but I just updated the PR to address the review comments. @mrshenli @pietern, could you take another look?

Contributor

@pietern pietern left a comment

LGTM. Thanks, @rohan-varma.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@rohan-varma has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@kostmo
Member

kostmo commented Jan 7, 2020

💊 CircleCI build failures summary and remediations

As of commit b29bfad:

  • 1/1 failures introduced in this PR

Detailed failure analysis

One may explore the probable reasons each build failed interactively on the Dr. CI website.

🕵️ 1 new failure recognized by patterns

The following build failures do not appear to be due to upstream breakage:

See CircleCI build pytorch_linux_xenial_cuda9_cudnn7_py3_slow_test (1/1)

Step: "Test"

Jan 07 01:05:28 [ FAILED ] JitTest.ADFormulas
Jan 07 01:05:28 [ RUN      ] JitTest.ModuleConversion_CUDA 
Jan 07 01:05:28 [       OK ] JitTest.ModuleConversion_CUDA (0 ms) 
Jan 07 01:05:28 [ RUN      ] JitTest.Interp_CUDA 
Jan 07 01:05:28 [       OK ] JitTest.Interp_CUDA (1 ms) 
Jan 07 01:05:28 [----------] 76 tests from JitTest (3723 ms total) 
Jan 07 01:05:28  
Jan 07 01:05:28 [----------] Global test environment tear-down 
Jan 07 01:05:28 [==========] 76 tests from 1 test case ran. (3723 ms total) 
Jan 07 01:05:28 [  PASSED  ] 75 tests. 
Jan 07 01:05:28 [  FAILED  ] 1 test, listed below: 
Jan 07 01:05:28 [  FAILED  ] JitTest.ADFormulas 
Jan 07 01:05:28  
Jan 07 01:05:28  1 FAILED TEST 
Jan 07 01:05:29 + cleanup 
Jan 07 01:05:29 + retcode=1 
Jan 07 01:05:29 + set +x 
Jan 07 01:05:29 =================== sccache compilation log =================== 
Jan 07 01:05:29 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/tmp/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/tmp/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Jan 07 01:05:29  
Jan 07 01:05:29 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jan 07 01:05:29 Compile requests               128 

This comment was automatically generated by Dr. CI.

@rohan-varma
Contributor Author

Failure is unrelated:

Jan 07 01:05:28 [  PASSED  ] 75 tests.
Jan 07 01:05:28 [  FAILED  ] 1 test, listed below:
Jan 07 01:05:28 [  FAILED  ] JitTest.ADFormulas

@facebook-github-bot
Contributor

@rohan-varma merged this pull request in a561a84.

wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary:
Per pietern's comment in pytorch#30022, we can make this example launcher a bit simpler by using `torch.multiprocessing`.
Pull Request resolved: pytorch#30381

Differential Revision: D19292080

Pulled By: rohan-varma

fbshipit-source-id: 018ace945601166ef3af05d8c3e69d900bd77c3b
@facebook-github-bot facebook-github-bot deleted the update_dist_autograd_note branch July 13, 2020 17:58