TensorFlow/Keras: add versions 2.8–2.9 #31615
|
Successfully builds on Ubuntu 20.04 with Bazel 5.2.0 and GCC 8.4.0. Still doesn't build on aarch64, but neither does the develop version, so this is probably safe to merge. Will keep working on aarch64 support. |
nicholas-sly
left a comment
I don't see any issues with this PR.
|
There is an issue with GCC and CUDA versions that will need to be accounted for. I was building with GCC-11.3 and CUDA-11.4.4 but seeing failures like: I did some searching and this is being reported in a few places, for instance: NVlabs/instant-ngp#119 In the end it looks like a In addition, the following constraint in py-tensorflow will need to be adjusted to allow CUDA-11.6 to get picked up when GCC-11.2+ is being used. I will try a few more builds to try to pinpoint what that needs to be. Unfortunately, building this takes a while. |
|
@ax3l @Rombur @davidbeckingsale should this first CUDA conflict go in the |
|
Yes, if it's not already there? |
|
This PR will also need #28548 merged for older versions of bazel to build with gcc-11+. I just rebased that PR and tested it in tandem with this one. |
glennpj
left a comment
The cuda dependencies will need to look like the following:
depends_on('cuda', when='+cuda')
depends_on('cuda@:10.2', when='+cuda @:2.3')
depends_on('cuda@:11.4', when='+cuda @2.4:2.7')
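To make the effect of these three lines concrete, here is a minimal sketch in plain Python (not Spack's API) of how the upper bounds map a TensorFlow version to the newest CUDA release the concretizer could pick. The names `CUDA_CAPS`, `max_cuda_for`, and `cuda_allowed` are hypothetical, and the sketch assumes versions compare on major.minor only, which approximates how a range like `@:2.7` also covers 2.7.x.

```python
# Illustrative only: plain Python, not Spack's version-range machinery.

def parse(version):
    """Turn '11.4.4' into (11, 4); only major.minor matters for these ranges."""
    return tuple(int(part) for part in version.split("."))[:2]

# (lowest TF, highest TF, highest allowed CUDA), mirroring the depends_on lines.
CUDA_CAPS = [
    (None, (2, 3), (10, 2)),    # cuda@:10.2 when @:2.3
    ((2, 4), (2, 7), (11, 4)),  # cuda@:11.4 when @2.4:2.7
]

def max_cuda_for(tf_version):
    """Return the CUDA upper bound for a TF version, or None if unbounded."""
    tf = parse(tf_version)
    for low, high, cap in CUDA_CAPS:
        if (low is None or tf >= low) and tf <= high:
            return cap
    return None  # TF 2.8+ only has the unconstrained depends_on('cuda')

def cuda_allowed(tf_version, cuda_version):
    """True if this CUDA version satisfies the bound for this TF version."""
    cap = max_cuda_for(tf_version)
    return cap is None or parse(cuda_version) <= cap
```

Under these assumptions, `cuda_allowed("2.9.1", "11.6.2")` holds because no upper bound applies to 2.8 and later, while `cuda_allowed("2.7.0", "11.6.2")` does not.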
982dada to 37b6d03
37b6d03 to e35a7ef
aweits
left a comment
Looks good.
[aweits@skl-a-00 spack]$ spack find -lv py-tensorflow
==> 1 installed package
-- linux-rhel7-skylake_avx512 / [email protected] -----------------------
oic7arz [email protected]~android~aws~computecpp+cuda~dynamic_kernels~gcp~gdr~hdfs~ignite~ios~jemalloc~kafka~mkl~monolithic~mpi+nccl~ngraph~numa~opencl~rocm~tensorrt~verbs~xla cuda_arch=60,61,70,75,80 patches=2017b3e
Our TF package has become woefully out-of-date. This PR is an attempt to add the latest versions of TF.
So far I've only updated the dependency list; I didn't have the energy to dig much deeper than that. Unfortunately, the installation fails almost immediately with:
Unfortunately, bazelbuild/bazel#10134 provides neither details nor migration instructions. Would love help with this if anyone is more knowledgeable than I am.