@zhaojuanmao zhaojuanmao commented Nov 14, 2019

Stack from ghstack:

There are known issues with "fork tests + OMP" in PyTorch. The rpc and dist autograd tests use OMP thread pools, which made the rpc fork and dist autograd fork tests flaky. So remove these fork tests from the PyTorch repo. The rpc spawn and dist autograd spawn tests are still running.

Differential Revision: D18507384
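
For context, a minimal sketch of the distinction this description relies on: a `spawn` child starts a fresh interpreter, so it does not inherit the parent's thread-pool state, whereas a `fork` child copies the parent's memory but only the thread that called fork. The `_worker` and `launch` names are hypothetical stand-ins for the real test harness, which is not shown in this thread.

```python
import multiprocessing as mp


def _worker(rank, world_size, queue):
    # Hypothetical worker body: a real test would set up RPC here.
    queue.put((rank, world_size))


def launch(world_size, start_method="spawn"):
    # get_context() returns an object with the same API as the
    # multiprocessing module, bound to a single start method.
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    procs = [
        ctx.Process(target=_worker, args=(rank, world_size, queue))
        for rank in range(world_size)
    ]
    for proc in procs:
        proc.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    results = sorted(queue.get() for _ in procs)
    for proc in procs:
        proc.join()
    return results
```

Call `launch(2)` from under an `if __name__ == "__main__":` guard, since spawned children re-import the main module.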

zhaojuanmao added a commit that referenced this pull request Nov 14, 2019
ghstack-source-id: 93919578
Pull Request resolved: #29827
TESTS.extend([
    'rpc_fork',
    'rpc_spawn',
    'dist_autograd_fork',
Contributor

Can you do the same for test_dist_optimizer_fork.py? It depends on RpcAgent as well.

Contributor Author

will do
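
The change under discussion amounts to dropping every `*_fork` entry from the list of tests to run while keeping the `*_spawn` ones. A minimal sketch of that filtering, assuming a flat list of test names; `drop_fork_tests` is a hypothetical helper for illustration, and the PR itself may simply edit the list by hand:

```python
def drop_fork_tests(tests):
    # Keep every test whose name does not end in "_fork".
    return [name for name in tests if not name.endswith("_fork")]


# Test names taken from this thread.
TESTS = [
    "rpc_fork",
    "rpc_spawn",
    "dist_autograd_fork",
    "dist_autograd_spawn",
    "dist_optimizer_fork",
    "dist_optimizer_spawn",
]

REMAINING = drop_fork_tests(TESTS)
```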

if PY33:
    TESTS.extend([
        'rpc_fork',
        'rpc_spawn',
Contributor

Since we only have spawn mode on the OSS side, do we still need an explicit "spawn" in the file name?

Contributor Author

Can we keep it for now? Once we completely get rid of fork, we can merge rpcSpawnTest with RpcTest.

zhaojuanmao added a commit that referenced this pull request Nov 15, 2019
Pull Request resolved: #29827

ghstack-source-id: 94040608
@mrshenli mrshenli left a comment

The lint and test failures are unrelated.

Curious: why is dist_optimizer_spawn not in the Windows and ROCm blacklists?

@zhaojuanmao
Contributor Author

I just saw that @xush6528 added dist_optim_spawn to the blacklist in #29747.
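
As a rough illustration of what a per-platform blacklist does, the sketch below filters a test list by platform. The dictionary layout, the `select_tests` helper, the platform keys, and the entries are all assumptions made for this example, not the actual structure or contents of `run_test.py`:

```python
# Hypothetical per-platform blacklists; the entries are illustrative only.
BLACKLISTS = {
    "windows": {"rpc_spawn", "dist_autograd_spawn", "dist_optimizer_spawn"},
    "rocm": {"rpc_spawn", "dist_autograd_spawn", "dist_optimizer_spawn"},
}


def select_tests(tests, platform):
    # Platforms without a blacklist run every test.
    blocked = BLACKLISTS.get(platform, set())
    return [name for name in tests if name not in blocked]
```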

zhaojuanmao added a commit that referenced this pull request Nov 18, 2019
Pull Request resolved: #29827

ghstack-source-id: 94139812
@zhaojuanmao
Contributor Author

Checked that "rpc_fork", "dist_autograd_fork" and "dist_optimizer_fork" did not run, while "rpc_spawn", "dist_autograd_spawn" and "dist_optimizer_spawn" did, in https://circleci.com/api/v1.1/project/github/pytorch/pytorch/3645637/output/105/0?file=true&allocation-id=5dd30d9f0d6a0b62d1d857f9-0-build%2F10D9680D

@zhaojuanmao
Contributor Author

The ROCm test failure is not relevant:

FAILED (skipped=12, unexpected successes=1)
21:57:37 Traceback (most recent call last):
21:57:37   File "test/run_test.py", line 448, in <module>
21:57:37     main()
21:57:37   File "test/run_test.py", line 441, in main
21:57:37     raise RuntimeError(message)
21:57:37 RuntimeError: test_cuda failed!
21:57:37 + cleanup
21:57:37 + retcode=1
21:57:37 + set +x

@facebook-github-bot
Contributor

This pull request has been merged in 861ef05.

mrshenli added a commit that referenced this pull request Nov 19, 2019
As after #29827 we only test RPC using spawn, the multi-thread/fork
error should disappear.

[ghstack-poisoned]
mrshenli added a commit that referenced this pull request Nov 19, 2019
As after #29827 we only test RPC using spawn, the multi-thread/fork
error should disappear.

ghstack-source-id: a451c10
Pull Request resolved: #30100
facebook-github-bot pushed a commit that referenced this pull request Nov 19, 2019
Summary:
Pull Request resolved: #30098

As after #29827 we only test RPC using spawn, the multi-thread/fork
error should disappear.

Test Plan: Imported from OSS

Differential Revision: D18597001

Pulled By: mrshenli

fbshipit-source-id: 68256289085fac1a9ca76d5b4882e97e2f81d1f4
facebook-github-bot pushed a commit that referenced this pull request Nov 19, 2019
Summary:
Pull Request resolved: #30099

As after #29827 we only test RPC using spawn, the multi-thread/fork
error should disappear.

Test Plan: Imported from OSS

Differential Revision: D18597003

Pulled By: mrshenli

fbshipit-source-id: ebfb1f6f3f961d98351e06ce4b951793a9b95398
facebook-github-bot pushed a commit that referenced this pull request Nov 19, 2019
Summary:
Pull Request resolved: #30100

As after #29827 we only test RPC using spawn, the multi-thread/fork
error should disappear.

Test Plan: Imported from OSS

Differential Revision: D18597002

Pulled By: mrshenli

fbshipit-source-id: 64aa6a59248e5d1b7e1ad1aebffb6a25248388d2
mrshenli added a commit that referenced this pull request Nov 21, 2019
With #29827, the flakiness should disappear for test_call_method_on_rref

[ghstack-poisoned]
mrshenli added a commit that referenced this pull request Nov 21, 2019
With #29827, the flakiness should disappear for test_call_method_on_rref

ghstack-source-id: 4faece1
Pull Request resolved: #30261
facebook-github-bot pushed a commit that referenced this pull request Nov 22, 2019
Summary:
Pull Request resolved: #30261

With #29827, the flakiness should disappear for test_call_method_on_rref

Test Plan: Imported from OSS

Differential Revision: D18645036

Pulled By: mrshenli

fbshipit-source-id: 44d759062fc78b1a797266096dbb4ddd104f07eb
@facebook-github-bot facebook-github-bot deleted the gh/zhaojuanmao/14/head branch November 22, 2019 15:17