python3Packages.pytorch: 1.2.0 -> 1.4.1, python3Packages.ignite: 0.2.1 -> 0.3.0 #75827
bhipple merged 4 commits into NixOS:master
stites
left a comment
Removed the nix-protobuf code (it will just live in pytorch-world for now). Rebuilding everything again -- I just had to deal with some faulty RAM slots so updates will have a slightly faster turnaround again.
|
fwiw, 1.4 is now out. |
There may soon also be a 1.4.1 release that is compatible with gcc 9: |
|
@GrahamcOfBorg build python2Packages.pytorch |
|
please address failures :( |
|
looks like a lot of failures are related to |
|
On x86 Ubuntu with Cuda support, the package builds but is broken: Possibly setting Seems fine without Cuda but didn't run the test suite. |
|
@GrahamcOfBorg build python37Packages.pytorch @stites I pushed an update to your PR with the following changes:
I've managed to build this successfully with both the FOSS stack and with |
jonringer
left a comment
not sure about regressions, but at the very least, we should disable for python38
builder for '/nix/store/a70hd44hmh22wxsr0w7sl3wnrfgfxpjb-python3.7-ignite-0.2.1.drv' failed with exit code 1; last 10 log lines:
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
/nix/store/m4i2i1ax2ch5y1ql8ng7f014mrmk63ja-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:342: DeprecationWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
"please use `get_last_lr()`.", DeprecationWarning)
-- Docs: https://docs.pytest.org/en/latest/warnings.html
===== 4 failed, 275 passed, 4 skipped, 72 deselected, 20 warnings in 6.45s =====
builder for '/nix/store/zyxn5nxx22jwdj5xhvxx173ngkms6ryi-python3.8-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
----------------------------------------------------------------------
Ran 2276 tests in 93.630s
FAILED (failures=1, skipped=66, expected failures=1)
Traceback (most recent call last):
File "test/run_test.py", line 455, in <module>
main()
File "test/run_test.py", line 448, in main
raise RuntimeError(message)
RuntimeError: test_jit failed!
cannot build derivation '/nix/store/iraap11hf3k4yax6bzpmnmrif3pzli5l-python3.8-ignite-0.2.1.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/8qdsp4ga63vmwbq8w0b1vf33avb848rx-python3.8-tensorly-0.4.5.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/qkmv9p848yh1l62cn9yy3wb6id0hh1ac-python3.8-torchvision-0.2.1.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/11waqaqv5xq4276f1npv1dmpj31hgprx-python3.8-pywick-0.5.6.drv': 2 dependencies couldn't be built
builder for '/nix/store/c8jvyrqblrk4a95rlnv5ji142d7sv33m-python3.7-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
wrapping `/nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/bin/convert-caffe2-to-onnx'...
Executing pythonRemoveTestsDir
Finished executing pythonRemoveTestsDir
running install tests
Traceback (most recent call last):
File "test/run_test.py", line 14, in <module>
import torch
File "/nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: /nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t
builder for '/nix/store/f2lxq1zcw1zbc4q3snnskdaadm5vyw2x-python3.8-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
wrapping `/nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/bin/convert-caffe2-to-onnx'...
Executing pythonRemoveTestsDir
Finished executing pythonRemoveTestsDir
running install tests
Traceback (most recent call last):
File "test/run_test.py", line 14, in <module>
import torch
File "/nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/lib/python3.8/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: /nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/lib/python3.8/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t
cannot build derivation '/nix/store/rqcqydv29f29gn957kfb5vysx4ppgnsv-env.drv': 8 dependencies couldn't be built
[0 built (4 failed), 0.0 MiB DL]
error: build of '/nix/store/rqcqydv29f29gn957kfb5vysx4ppgnsv-env.drv' failed
https://github.com/NixOS/nixpkgs/pull/75827
2 package marked as broken and skipped:
python37Packages.pyro-ppl python38Packages.pyro-ppl
8 package failed to build:
python37Packages.ignite python37Packages.pytorchWithCuda python38Packages.ignite python38Packages.pytorch python38Packages.pytorchWithCuda python38Packages.pywick python38Packages.tensorly python38Packages.torchvision
4 package built:
python37Packages.pytorch python37Packages.pywick python37Packages.tensorly python37Packages.torchvision
|
Hmm, I think I'll go back to just not running the test suite -- particularly since users of MKL may be running without a binary cache and need to wait for the package to build themselves. |
|
or just reduce the checkphase to something simple. I would be fine with some As long as the maintainer verified that it worked for more in-depth cases locally |
|
👍 thanks for bumping this pr @bhipple ! |
Co-authored-by: Benjamin Hipple <[email protected]>
- Pass `blas.provider` into `buildInputs`, so that CMake can find the actual `mkl` for inspection of its cmake files and headers.
- Add `USE_MKL` correctly when the blas provider is `mkl`.
- Use the MKLDNN and MKLDNN_CBLAS flags by default, since `mkldnn` is FOSS and always available.
- Remove a patch for MKL 2019, since we've moved to 2020.
- Add a pythonImportsCheck for "torch" as a basic sanity-check.
- Remove some unused variables at the top of the file.
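For readers unfamiliar with the idea behind an import check, here is a minimal Python sketch of what such a sanity check amounts to: import each listed module and report the ones that fail. This is an illustration only, not the nixpkgs implementation; the helper name `check_imports` is hypothetical, and `json` stands in for `torch` so the sketch runs without pytorch installed.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: error text} for the failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # A missing module and a shared library with an undefined symbol
            # both surface as ImportError at import time.
            failures[name] = str(exc)
    return failures


if __name__ == "__main__":
    # "json" is a stand-in here; the commit above checks "torch".
    failed = check_imports(["json"])
    if failed:
        raise SystemExit(f"import check failed: {failed}")
    print("all imports succeeded")
```

A check like this is far cheaper than the full test suite, and it would still catch the `ImportError` on `from torch._C import *` shown in the build log earlier in the thread.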
|
Result:
2 packages marked as broken and skipped:
- python37Packages.pyro-ppl
- python38Packages.pyro-ppl
4 packages failed to build:
- python37Packages.ignite
- python37Packages.pytorchWithCuda
- python38Packages.ignite
- python38Packages.pytorchWithCuda
8 packages built:
- python37Packages.pytorch (python37Packages.pytorchWithoutCuda)
- python37Packages.pywick
- python37Packages.tensorly
- python37Packages.torchvision
- python38Packages.pytorch (python38Packages.pytorchWithoutCuda)
- python38Packages.pywick
- python38Packages.tensorly
- python38Packages.torchvision
|
@GrahamcOfBorg eval |
|
One last change: in #85839 I changed the name of Updated pytorch to reference |
|
Result:
2 packages failed to build:
- python37Packages.pytorchWithCuda
- python38Packages.pytorchWithCuda
10 packages built:
- python37Packages.ignite
- python37Packages.pytorch (python37Packages.pytorchWithoutCuda)
- python37Packages.pywick
- python37Packages.tensorly
- python37Packages.torchvision
- python38Packages.ignite
- python38Packages.pytorch (python38Packages.pytorchWithoutCuda)
- python38Packages.pywick
- python38Packages.tensorly
- python38Packages.torchvision
|
great work! thank you all! |
|
what is the best way to get for now I will try to follow #75827 (comment) |
|
that looks like a linking error to cuda. I'm not familiar enough with the cuda toolchain to know where an assumption is being broken |
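For context on reading this kind of error: the `undefined symbol` in the pytorchWithCuda build log earlier in this thread is a mangled C++ name. In practice `c++filt` would be the tool to decode it; the Python sketch below only handles the simple shape seen in that log (a nested name followed by named parameter types) and is not a general demangler. The function names `read_source_name` and `demangle_simple` are hypothetical.

```python
def read_source_name(symbol, i):
    """Read one length-prefixed identifier at symbol[i]; return (name, next index)."""
    j = i
    while j < len(symbol) and symbol[j].isdigit():
        j += 1
    if j == i:
        raise ValueError(f"expected a length prefix at offset {i}")
    length = int(symbol[i:j])
    name = symbol[j:j + length]
    if len(name) != length:
        raise ValueError("symbol is truncated")
    return name, j + length


def demangle_simple(symbol):
    """Decode names of the form _ZN<len><name>...E<len><type>... only."""
    if not symbol.startswith("_ZN"):
        raise ValueError("only simple nested names are supported")
    i = 3
    scopes = []
    # Nested name components run until the terminating 'E'.
    while i < len(symbol) and symbol[i] != "E":
        name, i = read_source_name(symbol, i)
        scopes.append(name)
    if i >= len(symbol):
        raise ValueError("nested name is not terminated by 'E'")
    i += 1  # skip the 'E'
    params = []
    # Remaining components are treated as named (class or enum) parameter types.
    while i < len(symbol):
        name, i = read_source_name(symbol, i)
        params.append(name)
    return "::".join(scopes) + "(" + ", ".join(params) + ")"


if __name__ == "__main__":
    print(demangle_simple(
        "_ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t"
    ))
    # prints: torch::cuda::nccl::detail::throw_nccl_error(ncclResult_t)
```

The decoded name shows that `libtorch_python.so` references an NCCL helper that was not resolved when the library was loaded, which is consistent with reading this as a CUDA/NCCL linking problem.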
|
If you manage to get it working, please do send a PR! |
|
Interesting note: when we upgraded pytorch from 1.0.0 -> 1.2.0, we must've leaked a proprietary dependency into the default expression, because Hydra stopped building it. That's now been fixed with this PR, so we once again have binary cache builds of the default pytorch: |
I tried a few things but couldn't make it work with the current version. |
Motivation for this change
Update pytorch to 1.3.1. A detailed commit history can be seen here.
Check phase verified on python36.pytorch, python36.pytorchWithMkl, python36.pytorchWithCuda10, python36.pytorchWithCuda10Mkl.
Cachix pending (my machine with the keys is down).
Relevant changelog:
- buildDocs flag added
- buildNamedTensor is now true by default
- useNixProtobuf but disables this functionality.
To build
I believe the following should work:
Things done
- sandbox in nix.conf on non-NixOS linux)
- nix-shell -p nix-review --run "nix-review wip"
- ./result/bin/)
- nix path-info -S before and after)
Notify maintainers
cc @teh @thoughtpolice @stites @tscholak @bhipple