@jithunnair-amd
Collaborator

No description provided.

@pytorchbot added the "module: tests" label (Issues related to tests, not the torch.testing module) on Oct 8, 2019
@pytorchbot added the "oncall: distributed" label (Add this issue/PR to distributed oncall triage queue) on Oct 8, 2019
@jithunnair-amd changed the title from "[WIP-DONT-MERGE] Enable mgpu unit tests for rocm" to "Enable mgpu unit tests for rocm" on Oct 11, 2019
@jithunnair-amd
Collaborator Author

jithunnair-amd commented Oct 11, 2019

@bddppq @pietern Can someone from your team please review this PR? It's meant to enable the test_c10d and test_nccl unit test suites for ROCm, while skipping the tests in those suites that don't currently work. I haven't enabled the test_distributed unit test suite for ROCm yet, because it will need to skip the gloo and mpi backends on ROCm (they're not fully tested yet), and test_distributed.py will also need changes to skip those tests properly without error. I wasn't sure whether I should do that, since there seems to be some refactoring under way to eventually reuse the similar code that exists in common_distributed.py.
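
For illustration only, here is a rough sketch of the kind of backend-level skip described above. It is not code from this PR. It assumes the backend is selected through a `BACKEND` environment variable and that ROCm runs are flagged through `PYTORCH_TEST_WITH_ROCM`; the helper `should_skip_backend_on_rocm` and the set `ROCM_UNTESTED_BACKENDS` are hypothetical names introduced for this sketch.

```python
import os
import unittest

# Assumption: the backends the comment says are not fully tested on ROCm yet.
ROCM_UNTESTED_BACKENDS = frozenset({"gloo", "mpi"})


def should_skip_backend_on_rocm(backend, test_with_rocm):
    """Return True when tests for `backend` should be skipped on a ROCm run."""
    return bool(test_with_rocm) and backend.lower() in ROCM_UNTESTED_BACKENDS


# Hypothetical configuration, read once when the test module is imported.
BACKEND = os.environ.get("BACKEND", "nccl")
TEST_WITH_ROCM = os.environ.get("PYTORCH_TEST_WITH_ROCM", "0") == "1"


@unittest.skipIf(
    should_skip_backend_on_rocm(BACKEND, TEST_WITH_ROCM),
    "backend not fully tested on ROCm yet",
)
class DistributedBackendTest(unittest.TestCase):
    def test_placeholder(self):
        # Stand-in for a real distributed test body.
        self.assertTrue(True)
```

Skipping at class level through `unittest.skipIf` lets the runner report the whole suite as skipped instead of erroring, which is the behaviour the comment asks for.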

Contributor

@bddppq bddppq left a comment

LG

)


def skip_for_rocm(func):
Contributor

nit: I feel it's better to name it "skip_if_rocm" to be more consistent with the skipIfRocm decorator

Collaborator Author

I agree. Will update.
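
For readers following this thread, a minimal sketch of what a decorator named `skip_if_rocm` could look like. This is an assumption-laden illustration, not the implementation merged in this PR: it assumes ROCm runs are signalled by the `PYTORCH_TEST_WITH_ROCM` environment variable, and the helper `running_on_rocm` is a hypothetical name.

```python
import functools
import os
import unittest


def running_on_rocm():
    """Assumption: ROCm test runs set PYTORCH_TEST_WITH_ROCM=1."""
    return os.environ.get("PYTORCH_TEST_WITH_ROCM", "0") == "1"


def skip_if_rocm(func):
    """Skip the wrapped test when running on ROCm (illustrative sketch)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Checked at call time so the decision reflects the current environment.
        if running_on_rocm():
            raise unittest.SkipTest("test doesn't currently work on ROCm")
        return func(*args, **kwargs)

    return wrapper


class ExampleTest(unittest.TestCase):
    @skip_if_rocm
    def test_add(self):
        self.assertEqual(1 + 1, 2)
```

Raising `unittest.SkipTest` makes the runner record the test as skipped instead of failed; `functools.wraps` keeps the original test name in the report.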

@bddppq added the "module: rocm" label (AMD GPU support for Pytorch) on Oct 11, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@bddppq merged this pull request in 6eef469.

thiagocrepaldi pushed a commit to thiagocrepaldi/pytorch that referenced this pull request Feb 4, 2020
Summary: Pull Request resolved: pytorch#27518

Differential Revision: D17880153

Pulled By: bddppq

fbshipit-source-id: 5b6210104ec66747558a08f97dda1e7796f681df
@jithunnair-amd deleted the enable_mgpu_unit_tests_for_rocm branch on February 7, 2020 15:24

Labels

Merged
module: rocm (AMD GPU support for Pytorch)
module: tests (Issues related to tests, not the torch.testing module)
oncall: distributed (Add this issue/PR to distributed oncall triage queue)

5 participants