Add CUDA 11.3 + PyTorch 1.11.0 and 1.12.1 #4205

Closed
thiagocrepaldi wants to merge 21 commits into facebookresearch:main from
thiagocrepaldi:thiagofc/update-build-cuda

@thiagocrepaldi
Contributor

@thiagocrepaldi thiagocrepaldi commented May 2, 2022

Currently CI supports only CUDA 11.1, which is sufficient for PyTorch versions up to 1.10.
For PyTorch 1.11+ and nightly builds, the minimum CUDA version is 11.3.

This PR adds 4 new pipelines:

  • Linux CUDA 11.3 which installs PyTorch, TorchVision, Detectron2 nightly builds.
  • Linux CUDA 11.3 which installs PyTorch 1.11 and 1.12 stable release with matching TorchVision and Detectron2 builds.
  • Linux CPU which installs PyTorch, TorchVision, Detectron2 nightly builds.
  • Windows CPU which installs PyTorch, TorchVision, Detectron2 nightly builds.

The new pipelines are important for running the end-to-end ONNX export tests, in which PyTorch inference results are numerically compared with ONNX Runtime's. A second benefit is helping identify ONNX export issues earlier in the development cycle.
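As a rough illustration of what "numerically compared" could mean here, the following is a minimal sketch of a tolerance-based check. It is not the test code from this PR: the helper name `outputs_close`, the tolerances, and the sample values are all assumptions made for illustration, and plain Python lists stand in for the real PyTorch and ONNX Runtime outputs.

```python
import math
from typing import Sequence


def outputs_close(
    reference: Sequence[float],
    candidate: Sequence[float],
    rtol: float = 1e-3,
    atol: float = 1e-5,
) -> bool:
    """Hypothetical helper: element-wise tolerance check on flattened outputs."""
    # Outputs of different length can never match.
    if len(reference) != len(candidate):
        return False
    # Each pair must agree within the relative or the absolute tolerance.
    return all(
        math.isclose(r, c, rel_tol=rtol, abs_tol=atol)
        for r, c in zip(reference, candidate)
    )


# Made-up values standing in for PyTorch and ONNX Runtime inference results.
pytorch_out = [0.1250, 0.5000, 0.3750]
onnxruntime_out = [0.1250001, 0.4999999, 0.3750002]
print(outputs_close(pytorch_out, onnxruntime_out))
```

Exact equality is usually too strict for this kind of comparison, since two runtimes may differ slightly in floating-point rounding, which is why the sketch uses tolerances.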

PS: Nightly builds are only installed in the new pipelines (i.e. windows_cpu_build_pytorch_master, linux_cuda113_tests_pytorch_master_python39, and linux_cpu_tests_pytorch_master). All existing pipelines are intact.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 2, 2022
@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/update-build-cuda branch 29 times, most recently from 9a76428 to 538226c Compare May 3, 2022 22:08
@thiagocrepaldi
Contributor Author

@thiagocrepaldi Thanks for updating it, overall it looks promising. Some of the CI jobs didn't pass because the indexes mismatch. Have you found the reason? It seems that this changes how pytorch and torchvision are installed.

Yeah, I saw that, and I honestly ran out of tricks to force the right CUDA version. It seems that per PEP 440 (if I am not mistaken), the part after the "+" is called the "local version label" and it is not used for version selection. PyTorch's URL for cu113 also has wheels for a bunch of other CUDA versions, and that is why this mismatch is happening.
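To make the local-version behaviour concrete, here is a simplified sketch of the PEP 440 matching rule: when the requested version has no local label, the local label of each candidate is ignored, so wheels built for several CUDA versions all qualify. The helper names and the candidate list are hypothetical, and the sketch skips the normalization that a real installer performs.

```python
from typing import List, Optional, Tuple


def split_local(version: str) -> Tuple[str, Optional[str]]:
    """Split 'X.Y.Z+label' into the public version and the local label."""
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


def matches(candidate: str, requested: str) -> bool:
    """Simplified '==' matching (illustrative only, no normalization)."""
    req_public, req_local = split_local(requested)
    cand_public, cand_local = split_local(candidate)
    if req_local is None:
        # Public request: the candidate's local label is ignored.
        return cand_public == req_public
    # Local request: the local label must match exactly as well.
    return cand_public == req_public and cand_local == req_local


def select(candidates: List[str], requested: str) -> List[str]:
    return [c for c in candidates if matches(c, requested)]


# Hypothetical contents of a single wheel index that mixes CUDA builds.
candidates = ["1.11.0+cu102", "1.11.0+cu113", "1.11.0+cpu", "1.10.2+cu113"]

print(select(candidates, "1.11.0"))        # all three 1.11.0 builds qualify
print(select(candidates, "1.11.0+cu113"))  # only the cu113 build qualifies
```

Under these assumptions, requesting the full local version (for example `1.11.0+cu113`) is one way to narrow the choice to a single build, whereas a plain `1.11.0` request leaves the choice among the CUDA variants to the installer.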

Comment thread .circleci/config.yml Outdated
Comment thread .circleci/config.yml
Comment on lines +272 to +278
- <<: *setupcuda113
- <<: *removecuda114
Contributor

Shouldn't it remove 11.4 first and then set up 11.3?

Contributor Author

Removing 11.4 first and then installing 11.3 would uninstall several shared components only to reinstall them later, increasing installation time. Keeping the current order does not affect the end result and saves some time.

@thiagocrepaldi
Contributor Author

@wat3rBro everything is green now

@thiagocrepaldi
Contributor Author

@wat3rBro gentle ping

Thiago Crepaldi added 20 commits April 5, 2023 15:08
All non pytorch master pipelines are unchanged
3 new pipelines are introduced.
* 1) Windows CPU and 2) Linux CPU with pytorch/torchvision nightly builds
* 3) Linux with CUDA 11.3, python 3.9 and nightly pytorch/torchvision

CUDA 11.3 is needed because PyTorch does not distribute nightly wheels
for CUDA 11.1

Python 3.9 was needed for the same reason; no torch wheels for 3.6
@thiagocrepaldi
Contributor Author

@wat3rBro @ppwwyyxx gentle ping

@thiagocrepaldi
Contributor Author

Is this still relevant or should we close it?

5 participants