malfet commented Jun 26, 2020

No description provided.

@facebook-github-bot left a comment

@malfet is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jeffdaily

@malfet we need a few more docker run arguments, specifically --shm-size=8g --ipc=host, to support multi-GPU (mGPU) RCCL/NCCL.

Also, do you know when docker images are cleaned up? Should we consider adding --rm here, or should we have cron jobs on our CI hosts that periodically clean up old images?
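
For readers following along, here is a minimal sketch of how the flags requested above could be assembled into a docker run invocation. It is not the code from this PR: the helper name `build_docker_run`, the image name `example/rocm-ci:latest`, and the command are all hypothetical placeholders. Only `--shm-size=8g` and `--ipc=host` come from the comment; `--rm` is included as an option because it was raised as a question. Note that `--rm` removes the container when it exits, not the image it was started from.

```python
import shlex


def build_docker_run(image, command, shm_size="8g", ipc="host", remove=False):
    """Assemble a `docker run` argv list (hypothetical helper, not the PR's code)."""
    argv = ["docker", "run"]
    if remove:
        # --rm deletes the container on exit; images are left untouched.
        argv.append("--rm")
    # Flags requested in the review comment for multi-GPU RCCL/NCCL.
    argv += [f"--shm-size={shm_size}", f"--ipc={ipc}"]
    argv.append(image)
    argv += list(command)
    return argv


# Placeholder image and command, for illustration only.
argv = build_docker_run("example/rocm-ci:latest", ["bash", "-c", "echo ok"], remove=True)
print(shlex.join(argv))
```

Building the arguments as a list and quoting them only for display avoids shell-quoting mistakes if the list is later passed to `subprocess.run`.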

malfet commented Jun 26, 2020

@jeffdaily thank you for the feedback. Let me add those. And per today's discussion, let's add it to the cleanup stage.

@malfet force-pushed the malfet/rocm-add-special-docker-run-rules branch from c3d6245 to 252e0ad on June 26, 2020 19:44

@facebook-github-bot left a comment

@malfet is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@malfet deleted the malfet/rocm-add-special-docker-run-rules branch on June 26, 2020 22:04
@facebook-github-bot

@malfet merged this pull request in edac323.

5 participants