
TensorFlow/Keras: add versions 2.8–2.9 #31615

Merged
adamjstewart merged 1 commit into spack:develop from adamjstewart:packages/py-tensorflow
Aug 8, 2022

@adamjstewart
Member

Our TF package has become woefully out-of-date. This PR is an attempt to add the latest versions of TF.

So far I have only updated the dependency list; I didn't have the energy to dig much deeper than that. Unfortunately, the installation fails almost immediately with:

ERROR: /private/var/folders/j1/68dlgpr91vlgs26vty2c8xk80000gn/T/ajstewart/spack-stage/spack-stage-py-tensorflow-2.9.1-fqze2toml7dh3kzf6ew7cnayqgidwvhy/spack-src/tensorflow/tools/build_info/BUILD:9:10: While resolving toolchains for target //tensorflow/tools/build_info:gen_build_info: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.

Unfortunately, bazelbuild/bazel#10134 provides neither details nor migration instructions. Would love help with this if anyone is more knowledgeable than I am.

@adamjstewart
Member Author

Successfully builds on Ubuntu 20.04 with Bazel 5.2.0 and GCC 8.4.0. Still doesn't build on aarch64, but neither does the develop version, so this is probably safe to merge. Will keep working on aarch64 support.

@adamjstewart adamjstewart marked this pull request as ready for review July 20, 2022 16:57
@adamjstewart adamjstewart changed the title from "TensorFlow: add versions 2.8–2.9" to "TensorFlow/Keras: add versions 2.8–2.9" Jul 23, 2022
@melven melven mentioned this pull request Jul 27, 2022
nicholas-sly previously approved these changes Jul 29, 2022
Contributor

@nicholas-sly nicholas-sly left a comment

I don't see any issues with this PR.

@glennpj
Contributor

glennpj commented Jul 30, 2022

There is an issue with GCC and CUDA versions that will need to be accounted for. I was building with GCC 11.3 and CUDA 11.4.4 but seeing failures like:

error: parameter packs not expanded with ‘...’

I did some searching and this is being reported in a few places, for instance: NVlabs/instant-ngp#119

In the end it looks like a conflicts statement will need to be added to the cuda build system:

conflicts('%gcc@11.2:', when='+cuda ^cuda@:11.5')

In addition, the following constraint in py-tensorflow will need to be adjusted to allow CUDA-11.6 to get picked up when GCC-11.2+ is being used.

depends_on('cuda@:11.4', when='+cuda @2.4:')

I will try a few more builds to try to pinpoint what that needs to be. Unfortunately, building this takes a while.
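
As a rough illustration of the compiler/CUDA conflict described above, here is a minimal standalone sketch. It does not use Spack's API; `parse_version` and `gcc_cuda_conflict` are hypothetical helper names, and the sketch assumes the conflict is meant to cover GCC 11.2 and newer combined with CUDA up to and including the 11.5 series. It expects full dotted versions such as `11.3.0`.

```python
def parse_version(text):
    """Turn a dotted version string such as '11.4.4' into (11, 4, 4)."""
    return tuple(int(part) for part in text.split("."))


def gcc_cuda_conflict(gcc_version, cuda_version):
    """Return True for the combination reported to fail in this thread.

    Assumption: the conflict covers GCC 11.2 or newer together with CUDA
    in the 11.5 series or older. Only major.minor is compared, so a patch
    release such as CUDA 11.5.2 still counts as within ':11.5'.
    """
    gcc = parse_version(gcc_version)[:2]
    cuda = parse_version(cuda_version)[:2]
    return gcc >= (11, 2) and cuda <= (11, 5)


# The failing combination from the comment above: GCC 11.3 with CUDA 11.4.4.
print(gcc_cuda_conflict("11.3.0", "11.4.4"))
# A newer CUDA with the same compiler falls outside the assumed conflict.
print(gcc_cuda_conflict("11.3.0", "11.6.0"))
```

In a real package the concretizer would evaluate the `conflicts` directive itself; the helper only spells out which version pairs the directive is intended to reject.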

@adamjstewart
Member Author

@ax3l @Rombur @davidbeckingsale should this first CUDA conflict go in the CudaPackage base class?

@ax3l
Member

ax3l commented Jul 30, 2022

Yes, if it's not already there?

@glennpj
Contributor

glennpj commented Jul 31, 2022

This PR will also need #28548 merged for older versions of bazel to build with gcc-11+. I just rebased that PR and tested it in tandem with this one.

Contributor

@glennpj glennpj left a comment

The cuda dependencies will need to look like the following:

    depends_on('cuda', when='+cuda')
    depends_on('cuda@:10.2', when='+cuda @:2.3')
    depends_on('cuda@:11.4', when='+cuda @2.4:2.7')
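
To make the effect of these three constraints easier to see, the following is a small standalone sketch that maps a TensorFlow version to the CUDA upper bound the directives above would impose. It is not Spack code; `max_cuda_for_tensorflow` is a hypothetical helper, and it assumes that only the major.minor part of the TensorFlow version matters, so that 2.7.x still falls inside `@2.4:2.7`.

```python
def max_cuda_for_tensorflow(tf_version):
    """Return the CUDA upper bound implied by the constraints above.

    Returns None when no upper bound applies, which is the case for
    TensorFlow 2.8 and newer under the proposed dependency list.
    """
    # Compare on major.minor only (assumption: patch level is irrelevant).
    tf = tuple(int(part) for part in tf_version.split("."))[:2]
    if tf <= (2, 3):
        return "10.2"   # depends_on('cuda@:10.2', when='+cuda @:2.3')
    if (2, 4) <= tf <= (2, 7):
        return "11.4"   # depends_on('cuda@:11.4', when='+cuda @2.4:2.7')
    return None         # only depends_on('cuda', when='+cuda') applies


for version in ("2.3.4", "2.7.0", "2.9.1"):
    print(version, max_cuda_for_tensorflow(version))
```

Capping the second range at 2.7 is what leaves the new 2.8 and 2.9 releases free to pick up a newer CUDA.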

@adamjstewart adamjstewart requested a review from glennpj August 5, 2022 21:55
Contributor

@aweits aweits left a comment

Looks good.

[aweits@skl-a-00 spack]$ spack find -lv py-tensorflow
==> 1 installed package
-- linux-rhel7-skylake_avx512 / [email protected] -----------------------
oic7arz [email protected]~android~aws~computecpp+cuda~dynamic_kernels~gcp~gdr~hdfs~ignite~ios~jemalloc~kafka~mkl~monolithic~mpi+nccl~ngraph~numa~opencl~rocm~tensorrt~verbs~xla cuda_arch=60,61,70,75,80 patches=2017b3e

@adamjstewart adamjstewart merged commit c0493f9 into spack:develop Aug 8, 2022
@adamjstewart adamjstewart deleted the packages/py-tensorflow branch August 8, 2022 18:30