sci-libs/pytorch: add PyTorch 1.10.1 #1123

Closed
Miezhiko wants to merge 10 commits into gentoo:master from Masha:pytorch

@Miezhiko
Contributor

Miezhiko commented Nov 11, 2021

Miezhiko changed the title from "PyTorch 1.10" to "sci-libs/pytorch: add PyTorch 1.10" on Nov 11, 2021
@heroxbd
Contributor

heroxbd commented Nov 12, 2021

Thank you, @Miezhiko!

@littlewu2508 would you like to give a review?

@littlewu2508
Contributor

@Miezhiko Can we compile against the system pybind11 headers and libraries rather than the bundled third_party copy, the same way as for pytorch-1.8 and 1.9?

@Miezhiko
Contributor Author

@littlewu2508 I don't know. In this ebuild I'm just removing the third-party pylib in src_install to avoid conflicts.

@Miezhiko
Contributor Author

@littlewu2508 updated

@littlewu2508
Contributor

In distutils-r1_python_compile, when running python3.9 setup.py build, the following error occurs:

Building wheel torch-1.10.0a0+gitUnknown
-- Building version 1.10.0a0+gitUnknown
running build
running build_py
creating /tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0_build-python3_9/lib/torch
...
running build_ext
error: [Errno 2] No such file or directory: 'build/CMakeCache.txt'

This error also occurs in the 1.9.0 version. Have you encountered a similar issue?

@Miezhiko
Contributor Author

mhm, seems like building with python flag is broken

@littlewu2508
Contributor

> mhm, seems like building with python flag is broken

Leave this to me. I've got a solution.

I'll also update some patches for the ROCm.

@Miezhiko
Contributor Author

Okay, cool with me!

rm -rfv "${ED}/usr/${LIB}/cmake"

if use python; then
scanelf -r --fix "${BUILD_DIR}/caffe2/python"
Member

|| die

python_foreach_impl python_optimize
fi

find "${ED}/usr/${LIB}" -name "*.a" -exec rm -fv {} \;
Member

|| die


find "${ED}/usr/${LIB}" -name "*.a" -exec rm -fv {} \;

use test && rm -rfv "${ED}/usr/test" "${ED}"/usr/bin/test_{api,jit}
Member

Suggested change
-use test && rm -rfv "${ED}/usr/test" "${ED}"/usr/bin/test_{api,jit}
+if use test; then
+	rm -r "${ED}/usr/test" "${ED}"/usr/bin/test_{api,jit} || die
+fi

cp -rv "${WORKDIR}/${P}/third_party/pybind11/include/pybind11" "${ED}/usr/include/"

rm -fv "${ED}/usr/${LIB}/libtbb.so"
rm -rfv "${ED}/usr/${LIB}/cmake"
Member

Suggested change
-rm -rfv "${ED}/usr/${LIB}/cmake"
+rm -r "${ED}/usr/${LIB}/cmake" || die


cp -rv "${WORKDIR}/${P}/third_party/pybind11/include/pybind11" "${ED}/usr/include/"

rm -fv "${ED}/usr/${LIB}/libtbb.so"
Member

Suggested change
-rm -fv "${ED}/usr/${LIB}/libtbb.so"
+rm -v "${ED}/usr/${LIB}/libtbb.so" || die

Member

And in the other lines above too.

When external commands are called, we need to call die if they fail, so that the ebuild stops right there. Also drop the -f argument: it makes rm exit successfully even if the file does not exist, and therefore defeats || die.
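
To illustrate the point about -f, here is a minimal shell sketch. The die function below is a stand-in that only records that it was called (an assumption for illustration; Portage's real die aborts the ebuild phase), and the file path is a temporary, hypothetical one.

```shell
#!/usr/bin/env bash
# Stand-in for Portage's die: it only records the call (illustrative assumption).
died=0
die() { died=1; echo "die: $*" >&2; }

workdir=$(mktemp -d)
missing="${workdir}/libtbb.so"   # hypothetical path, never created

# With -f, rm exits 0 even though the file is absent, so die is never reached.
died=0
rm -f "${missing}" || die "could not remove ${missing}"
died_with_f=${died}

# Without -f, rm exits non-zero for the missing file and die is called.
died=0
rm "${missing}" 2>/dev/null || die "could not remove ${missing}"
died_without_f=${died}

rmdir "${workdir}"
echo "die called with -f: ${died_with_f}, without -f: ${died_without_f}"
```

In a real ebuild the second form would stop the install phase as soon as an expected file is missing, which is the behaviour the review asks for.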

KEYWORDS="~x86 ~amd64"

IUSE="asan blas cuda +fbgemm ffmpeg gflags glog +gloo leveldb lmdb mkldnn mpi namedtensor +nnpack numa +observers opencl opencv +openmp +python +qnnpack redis rocm static test tools zeromq"
RESTRICT="!test? ( test )"
Member

Suggested change
-RESTRICT="!test? ( test )"

SLOT="0"
KEYWORDS="~x86 ~amd64"

IUSE="asan blas cuda +fbgemm ffmpeg gflags glog +gloo leveldb lmdb mkldnn mpi namedtensor +nnpack numa +observers opencl opencv +openmp +python +qnnpack redis rocm static test tools zeromq"
Member

Suggested change
-IUSE="asan blas cuda +fbgemm ffmpeg gflags glog +gloo leveldb lmdb mkldnn mpi namedtensor +nnpack numa +observers opencl opencv +openmp +python +qnnpack redis rocm static test tools zeromq"
+IUSE="asan blas cuda +fbgemm ffmpeg gflags glog +gloo leveldb lmdb mkldnn mpi namedtensor +nnpack numa +observers opencl opencv +openmp +python +qnnpack redis rocm static tools zeromq"


LICENSE="BSD"
SLOT="0"
KEYWORDS="~x86 ~amd64"
Member

Please sort the keywords; some tools work better if they are in the expected order. Also, repoman will complain if they are not in alphabetical order.

Suggested change
-KEYWORDS="~x86 ~amd64"
+KEYWORDS="~amd64 ~x86"
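
A quick way to check the order locally, as a sketch only: sort_keywords is a hypothetical helper (not part of repoman or Portage) that sorts keywords by architecture name while ignoring a leading ~. It does not handle prefix keywords or other special cases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sort keywords by arch name, ignoring a leading "~".
sort_keywords() {
	# Emit "<arch> <keyword>" pairs, sort on the pair, then keep the keyword.
	printf '%s\n' "$@" \
		| sed 's/^\(~\{0,1\}\)\(.*\)$/\2 \1\2/' \
		| LC_ALL=C sort \
		| cut -d' ' -f2 \
		| paste -sd' ' -
}

# Quote the arguments so the shell does not attempt tilde expansion.
sorted=$(sort_keywords '~x86' '~amd64')
echo "KEYWORDS=\"${sorted}\""
```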

# Copyright 1999-2021 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

EAPI=7
Member

Can we bump this to EAPI=8 while we are touching this package anyway?


EAPI=7

PYTHON_COMPAT=( python3_{7..10} )
Member

Suggested change
-PYTHON_COMPAT=( python3_{7..10} )
+PYTHON_COMPAT=( python3_{8..10} )

@Miezhiko
Contributor Author

@AndrewAmmerlaan thank you for the review, but first I'd wait for the Python bindings fix from @littlewu2508.

@heroxbd
Contributor

heroxbd commented Dec 1, 2021

> > mhm, seems like building with python flag is broken
>
> Leave this to me. I've got a solution.
>
> I'll also update some patches for the ROCm.

Hi, do you have a PR somewhere for this?

@littlewu2508
Contributor

> > > mhm, seems like building with python flag is broken
> >
> > Leave this to me. I've got a solution.
> > I'll also update some patches for the ROCm.
>
> Hi, do you have a PR somewhere for this?

I have some local changes that fix the Python bindings and ROCm-related issues, but they are not ready for a PR yet. All the ROCm dependencies were recently merged into the Portage tree, so I suggest we land pytorch in ::gentoo.

> @AndrewAmmerlaan thank you for the review, but first I'd wait for the Python bindings fix from @littlewu2508.

@Miezhiko You can refine your ebuild according to the review comments anyway, and then it can be merged into master. I'll open another PR to sci to fix the Python bindings and rocm-4.3 support. After that, let's consider landing it in ::gentoo.

@littlewu2508
Contributor

> > > mhm, seems like building with python flag is broken
> >
> > Leave this to me. I've got a solution.
> > I'll also update some patches for the ROCm.
>
> Hi, do you have a PR somewhere for this?

Meanwhile, I can open a PR for pytorch-1.9.1 to replace pytorch-1.9.0, whose Python and ROCm support is also broken.

@heroxbd
Contributor

heroxbd commented Dec 1, 2021

@littlewu2508 If you are ready with the Python binding fix, just open a new PR based on this one. Just do not forget to acknowledge @Miezhiko.

Nowa-Ammerlaan self-assigned this Dec 13, 2021
@Nowa-Ammerlaan
Member

I'm getting a compile failure (it happens both with the default USE flags and with all of them disabled):

FAILED: third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o 
/usr/bin/x86_64-pc-linux-gnu-g++ -DHAVE_A_OUT_H -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../third_party/benchmark/include -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../caffe2/contrib/opencl -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0_build/caffe2/contrib/aten -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0_build/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0_build/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src -I/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/third_party/linux/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0_build/third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../third_party/googletest/googlemock/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../third_party/googletest/googletest/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/gemmlowp -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/neon2sse -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/XNNPACK/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party -isystem /usr/include/eigen3 -isystem /usr/include/python3.9 -isystem 
/usr/lib/python3.9/site-packages/numpy/core/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/cmake/../third_party/pybind11/include  -march=native -O2 -pipe -frecord-gcc-switches -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -fPIC -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -std=gnu++14 -MD -MT third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o -MF third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o.d -o third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o -c /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc: In function ‘void google_breakpad::{anonymous}::InstallAlternateStackLocked()’:
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: error: no matching function for call to ‘max(int, long int)’
141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
|                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/char_traits.h:39,
from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/string:40,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.h:38,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:66:
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algobase.h:254:5: note: candidate: ‘template<class _Tp> constexpr const _Tp& std::max(const _Tp&, const _Tp&)’
254 |     max(const _Tp& __a, const _Tp& __b)
|     ^~~
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algobase.h:254:5: note:   template argument deduction/substitution failed:
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   deduced conflicting types for parameter ‘const _Tp’ (‘int’ and ‘long int’)
141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
|                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/char_traits.h:39,
from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/string:40,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.h:38,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:66:
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algobase.h:300:5: note: candidate: ‘template<class _Tp, class _Compare> constexpr const _Tp& std::max(const _Tp&, const _Tp&, _Compare)’
300 |     max(const _Tp& __a, const _Tp& __b, _Compare __comp)
|     ^~~
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algobase.h:300:5: note:   template argument deduction/substitution failed:
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   deduced conflicting types for parameter ‘const _Tp’ (‘int’ and ‘long int’)
141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
|                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/algorithm:62,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:85:
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algo.h:3461:5: note: candidate: ‘template<class _Tp> constexpr _Tp std::max(std::initializer_list<_Tp>)’
3461 |     max(initializer_list<_Tp> __l)
|     ^~~
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algo.h:3461:5: note:   template argument deduction/substitution failed:
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   mismatched types ‘std::initializer_list<_Tp>’ and ‘int’
141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
|                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/algorithm:62,
from /var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:85:
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algo.h:3467:5: note: candidate: ‘template<class _Tp, class _Compare> constexpr _Tp std::max(std::initializer_list<_Tp>, _Compare)’
3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
|     ^~~
/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.1/include/g++-v11/bits/stl_algo.h:3467:5: note:   template argument deduction/substitution failed:
/var/tmp/portage/sci-libs/pytorch-1.10.0/work/pytorch-1.10.0/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   mismatched types ‘std::initializer_list<_Tp>’ and ‘int’
141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
|                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
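
The log shows std::max being called with an int literal and a SIGSTKSZ that expands to a long int, so template argument deduction fails. One possible workaround, sketched here on a temporary copy of the offending line, is to cast both arguments to the same type with sed. This is an assumption for illustration and not necessarily the fix that was later applied in the ebuild or upstream.

```shell
#!/usr/bin/env bash
# Temporary file holding the offending line copied from the compiler output.
src=$(mktemp)
printf '%s\n' '  static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);' > "${src}"

# Cast both arguments to long so std::max deduces a single type.
patched=$(sed -e 's/std::max(16384, SIGSTKSZ)/std::max(static_cast<long>(16384), static_cast<long>(SIGSTKSZ))/' "${src}")

rm "${src}"
printf '%s\n' "${patched}"
```

In an ebuild, a substitution like this would typically live in src_prepare and target third_party/breakpad/src/client/linux/handler/exception_handler.cc, followed by || die.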

Miezhiko changed the title from "sci-libs/pytorch: add PyTorch 1.10" to "sci-libs/pytorch: add PyTorch 1.10.1" on Dec 22, 2021
improve ebuild quality + switch to EAPI 8

Signed-off-by: Miezhiko <[email protected]>
@Miezhiko
Contributor Author

@AndrewAmmerlaan I updated the ebuild, but it still has the same problems with Python, and it may still need some ROCm patches, as @littlewu2508 said.

@Miezhiko
Contributor Author

(fixed breakpad problem too)

@littlewu2508
Contributor

> (fixed breakpad problem too)

Thanks for fixing breakpad. I'm compiling with ROCm support, and if everything goes well I'll update the changes.

@Miezhiko
Contributor Author

@littlewu2508 please pull the small fix for src_install that became necessary after some ebuild updates.

@Miezhiko
Contributor Author

I'm also trying to get breakpad updated upstream: pytorch/pytorch#70298

Other changes:

1. PyTorch seems to work fine with different versions of protobuf, so
   there is no need to specify 0/30; however, when the protobuf version
   changes, torch should be rebuilt.

2. BUILD_NAMEDTENSOR is not an option now

3. Verbose mode for rm in src_install

Package-Manager: Portage-3.0.22, Repoman-3.0.3
Signed-off-by: Yiyang Wu <[email protected]>
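
Item 1 above matches what Portage's := slot operator provides: it records the slot and subslot a package was built against and triggers a rebuild when they change. A minimal ebuild fragment as a sketch, assuming dev-libs/protobuf is the atom in question (the real ebuild's dependency list is longer and is not reproduced here):

```shell
# Illustrative fragment only; other dependencies are omitted.
# ":=" instead of a pinned ":0/30" accepts any protobuf subslot,
# while still asking Portage to rebuild this package when the subslot changes.
RDEPEND="
	dev-libs/protobuf:=
"
DEPEND="${RDEPEND}"
```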
@littlewu2508
Contributor

Masha#1: I created a PR against your branch to push the Python bindings fix and ROCm support.

sci-libs/pytorch: fix python and update rocm support
@Miezhiko
Contributor Author

@littlewu2508 thank you, merged.

@Nowa-Ammerlaan
Member

With USE="python" I get this error:

* python3_8: running distutils-r1_run_phase distutils-r1_python_compile
python3.8 setup.py build -j 12
fatal: not a git repository (or any parent up to mount point /var/tmp)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Building wheel torch-1.10.0a0+gitUnknown
Traceback (most recent call last):
File "setup.py", line 310, in <module>
cmake = CMake()
File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 102, in __init__
self._cmake_command = CMake._get_cmake_command()
File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 126, in _get_cmake_command
elif cmake is not None and CMake._get_version(cmake) >= distutils.version.LooseVersion("3.10.0"):
File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 137, in _get_version
return distutils.version.LooseVersion(line.strip().split(' ')[2])
AttributeError: module 'distutils' has no attribute 'version'

And with USE="opencl" I get this error:

FAILED: caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o 
/usr/bin/x86_64-pc-linux-gnu-g++ -DADD_BREAKPAD_SIGNAL_HANDLER -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1 -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/benchmark/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../caffe2/contrib/opencl -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/contrib/aten -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/api -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/api/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/../third_party 
-I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/../third_party/breakpad/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/../aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/../aten/src/ATen -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/miniz-2.0.8 -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/kineto/libkineto/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/kineto/libkineto/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/distributed -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/../third_party/catch/single_include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/.. -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src/ATen -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/core/nomnigraph/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/FXdiv/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/c10/.. 
-I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/pthreadpool/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/cpuinfo/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/QNNPACK/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/native/quantized/cpu/qnnpack/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/native/quantized/cpu/qnnpack/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/cpuinfo/deps/clog/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/NNPACK/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm/third_party/asmjit/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/FP16/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/tensorpipe -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/tensorpipe -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/tensorpipe/third_party/libnop/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fmt/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/googletest/googlemock/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/googletest/googletest/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/gemmlowp -isystem 
/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/neon2sse -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/XNNPACK/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party -isystem /usr/include/eigen3 -isystem /usr/include/python3.10 -isystem /usr/lib/python3.10/site-packages/numpy/core/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/pybind11/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/include  -march=native -O2 -pipe -frecord-gcc-switches -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -fPIC -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -DASMJIT_STATIC -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o -MF 
caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o.d -o caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o -c /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.cc
In file included from /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.cc:1:
/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.h:14:10: fatal error: CL/cl.hpp: No such file or directory
14 | #include <CL/cl.hpp>
|          ^~~~~~~~~~~
compilation terminated.

@Nowa-Ammerlaan
Member

Thanks @Miezhiko and @littlewu2508 👍

USE="opencl" and USE="python" are not quite working yet, but I merged it anyway, since it is already a huge improvement over the in-tree ebuilds, which don't build at all. We can fix the python/opencl issues in a separate PR.

Miezhiko deleted the pytorch branch December 22, 2021 14:39
@Miezhiko
Contributor Author

@AndrewAmmerlaan thank you. I haven't finished my own testing after the PR merge yet (I'm still building commit f2e2a09ddb6e48317637a13986e1b347dcc838da), but it's good that it's merged; maybe more people will come and help.

@littlewu2508
Contributor

> With USE="python" I get this error:
>
> * python3_8: running distutils-r1_run_phase distutils-r1_python_compile
> python3.8 setup.py build -j 12
> fatal: not a git repository (or any parent up to mount point /var/tmp)
> Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
> Building wheel torch-1.10.0a0+gitUnknown
> Traceback (most recent call last):
> File "setup.py", line 310, in <module>
> cmake = CMake()
> File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 102, in __init__
> self._cmake_command = CMake._get_cmake_command()
> File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 126, in _get_cmake_command
> elif cmake is not None and CMake._get_version(cmake) >= distutils.version.LooseVersion("3.10.0"):
> File "/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/tools/setup_helpers/cmake.py", line 137, in _get_version
> return distutils.version.LooseVersion(line.strip().split(' ')[2])
> AttributeError: module 'distutils' has no attribute 'version'

I can't reproduce this. Could you provide more information? It seems that pytorch is trying to search a third-party directory.

> And with USE="opencl" I get this error:

FAILED: caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o 
/usr/bin/x86_64-pc-linux-gnu-g++ -DADD_BREAKPAD_SIGNAL_HANDLER -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_C10D_GLOO -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1 -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/benchmark/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../caffe2/contrib/opencl -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/contrib/aten -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/onnx -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/foxi -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/api -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/api/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/../third_party 
-I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/../third_party/breakpad/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/../aten/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/../aten/src/ATen -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/miniz-2.0.8 -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/kineto/libkineto/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/kineto/libkineto/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/torch/csrc/distributed -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/TH -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/../third_party/catch/single_include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/.. -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/caffe2/aten/src/ATen -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/core/nomnigraph/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/FXdiv/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/c10/.. 
-I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/pthreadpool/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/cpuinfo/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/QNNPACK/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/native/quantized/cpu/qnnpack/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/aten/src/ATen/native/quantized/cpu/qnnpack/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/cpuinfo/deps/clog/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/NNPACK/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fbgemm/third_party/asmjit/src -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/FP16/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/tensorpipe -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/tensorpipe -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/tensorpipe/third_party/libnop/include -I/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/fmt/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/gloo -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/googletest/googlemock/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/googletest/googletest/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/gemmlowp -isystem 
/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/neon2sse -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party/XNNPACK/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/third_party -isystem /usr/include/eigen3 -isystem /usr/include/python3.10 -isystem /usr/lib/python3.10/site-packages/numpy/core/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/cmake/../third_party/pybind11/include -isystem /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1_build/include  -march=native -O2 -pipe -frecord-gcc-switches -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -fPIC -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -DASMJIT_STATIC -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o -MF 
caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o.d -o caffe2/CMakeFiles/torch_cpu.dir/contrib/opencl/context.cc.o -c /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.cc
In file included from /var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.cc:1:
/var/tmp/portage/sci-libs/pytorch-1.10.1/work/pytorch-1.10.1/caffe2/contrib/opencl/context.h:14:10: fatal error: CL/cl.hpp: No such file or directory
14 | #include <CL/cl.hpp>
|          ^~~~~~~~~~~
compilation terminated.

I think dev-util/opencl-headers is missing. Should add some dependencies

@Nowa-Ammerlaan
Member

I think dev-util/opencl-headers is missing. Should add some dependencies

I have that installed, so that is not it.

I can't reproduce this. Maybe you can provide more information? It seems like pytorch tries to search a third party dir.

It is probably related to either the python version or the setuptools version, I'm compiling with python3.10 and setuptools-59.8.0.

@Nowa-Ammerlaan
Member

I think dev-util/opencl-headers is missing. Should add some dependencies

I have that installed, so that is not it.

According to the portage file list there are several packages providing CL/cl.hpp, I suspect dev-libs/clhpp is the missing dependency we are looking for. Could you verify that you have this package installed, and that pytorch breaks if you remove it?
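A quick way to confirm whether the header is visible at all is to look for it in the usual include directories. This is only a minimal sketch: find_header is a hypothetical helper, and the default directory list is an assumption that may not match a prefix install or a custom toolchain.

```python
from pathlib import Path


def find_header(header, include_dirs=("/usr/include", "/usr/local/include")):
    """Return the include directories that contain the given header file."""
    # The default directories are an assumption; pass your own for prefix installs.
    return [d for d in include_dirs if (Path(d) / header).is_file()]


# An empty list would be consistent with the "CL/cl.hpp: No such file" error.
print(find_header("CL/cl.hpp"))
```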

I can't reproduce this. Maybe you can provide more information? It seems like pytorch tries to search a third party dir.

It is probably related to either the python version or the setuptools version, I'm compiling with python3.10 and setuptools-59.8.0.

It seems that the setup.py file is simply missing an import distutils.version:

andrew@andrew-gentoo-pc ~ % python3.8
Python 3.8.12 (default, Dec 14 2021, 21:16:01)
[GCC 11.2.1 20211127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import distutils
>>> distutils.version.LooseVersion(line.strip().split(' ')[2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'distutils' has no attribute 'version'
>>> import distutils.version
>>> distutils.version.LooseVersion(line.strip().split(' ')[2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'line' is not defined
>>>

That specific line fails if I just import distutils but works if I also import distutils.version. We might need a patch here, though I am still confused why this works for you but not for me.
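The behaviour in the session above is ordinary Python package semantics: importing a package does not import its submodules unless the package's own __init__.py (or some other code that ran earlier) does so. A minimal sketch with a throwaway package follows; demo_pkg and submodule_visibility are hypothetical names used only for illustration, so the example does not depend on which distutils is installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def submodule_visibility():
    """Return (visible_before, visible_after) for the attribute demo_pkg.version."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")  # empty: imports no submodules
        (pkg / "version.py").write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            demo_pkg = importlib.import_module("demo_pkg")
            before = hasattr(demo_pkg, "version")  # submodule not loaded yet
            importlib.import_module("demo_pkg.version")
            after = hasattr(demo_pkg, "version")  # now bound on the package
        finally:
            # Clean up so the throwaway package does not linger in the process.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.version", None)
            sys.modules.pop("demo_pkg", None)
    return before, after


print(submodule_visibility())  # (False, True)
```

If something else in the process has already imported the submodule, the attribute is present from the start, which is one plausible reason the same line can work on one machine and fail on another.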

@littlewu2508
Contributor

littlewu2508 commented Dec 24, 2021

According to the portage file list there are several packages providing CL/cl.hpp, I suspect dev-libs/clhpp is the missing dependency we are looking for. Could you verify that you have this package installed, and that pytorch breaks if you remove it?

I don't have this package installed, and I have actually never tried building torch with OpenCL support. I think you're right that the missing dependency is dev-libs/clhpp.

It is probably related to either the python version or the setuptools version, I'm compiling with python3.10 and setuptools-59.8.0.

It seems that the setup.py file is simply missing an import distutils.version:

andrew@andrew-gentoo-pc ~ % python3.8
Python 3.8.12 (default, Dec 14 2021, 21:16:01)
[GCC 11.2.1 20211127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import distutils
>>> distutils.version.LooseVersion(line.strip().split(' ')[2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'distutils' has no attribute 'version'
>>> import distutils.version
>>> distutils.version.LooseVersion(line.strip().split(' ')[2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'line' is not defined
>>>

That specific line fails if I just import distutils but works if I also import distutils.version. We might need a patch here, though I am still confused why this works for you but not for me.

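I'm using python3.8 and setuptools-57.4.0-r2. Also, I noticed that pytorch imports distutils through setuptools everywhere, and with my setuptools version distutils.version is available after that import: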

ag 'import distutils' work/pytorch-1.10.1
work/pytorch-1.10.1/test/test_spectral_ops.py
17:from setuptools import distutils

work/pytorch-1.10.1/tools/build_pytorch_libs.py
9:from setuptools import distutils  # type: ignore[import]

work/pytorch-1.10.1/tools/generate_torch_version.py
5:from setuptools import distutils  # type: ignore[import]

work/pytorch-1.10.1/tools/setup_helpers/cmake.py
11:from setuptools import distutils  # type: ignore[import]

work/pytorch-1.10.1/torch/testing/_internal/common_cuda.py
9:from setuptools import distutils

work/pytorch-1.10.1/torch/testing/_internal/common_methods_invocations.py
40:from setuptools import distutils

work/pytorch-1.10.1/torch/utils/cpp_extension.py
1689:        from setuptools import distutils

work/pytorch-1.10.1/torch/utils/tensorboard/__init__.py
2:from setuptools import distutils
Python 3.8.12 (default, Dec  5 2021, 21:37:18)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from setuptools import distutils
>>> distutils.version.LooseVersion("3.10.0")
LooseVersion ('3.10.0')
>>> distutils.version
<module 'distutils.version' from '/opt/gentoo/usr/lib/python3.8/distutils/version.py'>

Let me try upgrading distutils and see if pytorch breaks.

@littlewu2508
Contributor

Let me try upgrading distutils and see if pytorch breaks.

Python 3.8.12 (default, Dec  5 2021, 21:37:18)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from setuptools import distutils
>>> distutils.version
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'distutils' has no attribute 'version'

So it's true: with only from setuptools import distutils, accessing distutils.version fails with setuptools 59.8.0. Upstream has also discussed this recently and already has a fix. Let's backport it.
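For code that only needs to compare plain dotted version numbers, one way to sidestep the problem is to not rely on distutils.version at all. The following is a minimal sketch under that assumption; version_tuple is a hypothetical helper and is not the upstream fix being backported here.

```python
import re


def version_tuple(version):
    """Convert a dotted version string such as '3.10.0' into a tuple of ints."""
    # Assumption: only the numeric components matter; suffixes like 'rc1' are
    # reduced to their digits, which is not a full version-ordering scheme.
    return tuple(int(part) for part in re.findall(r"\d+", version))


# Tuples compare numerically, so 3.10.0 sorts after 3.9.1 ...
print(version_tuple("3.10.0") > version_tuple("3.9.1"))  # True
# ... whereas a plain string comparison gets this wrong.
print("3.10.0" > "3.9.1")  # False
```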

littlewu2508 added a commit to littlewu2508/sci that referenced this pull request Dec 25, 2021
Backport an upstream PR: pytorch/pytorch#69904

Bug: gentoo#1123 (comment)
Package-Manager: Portage-3.0.22, Repoman-3.0.3
Signed-off-by: Yiyang Wu <[email protected]>
gentoo-bot pushed a commit that referenced this pull request Dec 26, 2021
Reference: pytorch/pytorch@0776756
Reference: #1123 (comment)
Package-Manager: Portage-3.0.22, Repoman-3.0.3
Closes: #1130
Signed-off-by: Yiyang Wu <[email protected]>
Signed-off-by: Benda Xu <[email protected]>