* FIXME fix cuda and test gcp build variants before merging
The way you need to think about this is that you can't assume Spack will always have internet access. Some clusters are behind firewalls or are completely offline. Spack allows users to create a mirror of fetched tarballs, copy that mirror to a different cluster, and install those packages from it. So all dependencies need to be added to Spack so that Bazel doesn't need to download anything.

Also so that Spack can actually control what they're built with. Spack wants to be able to guarantee that things are built with a consistent compiler stack. If Bazel builds a bunch of stuff behind the scenes that Spack doesn't know about, Spack can't do this.
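The offline workflow described above can be sketched with Spack's mirror commands (a hedged illustration; the paths are placeholders and this obviously requires a Spack installation, so it is not meant to run as-is):

```
# On a machine with internet access: fetch the tarballs for tensorflow
# and all of its Spack-declared dependencies into a directory mirror.
$ spack mirror create -d /tmp/my-mirror tensorflow

# Copy the mirror to the offline cluster, then register it there:
$ spack mirror add local-mirror file:///shared/my-mirror

# Spack now resolves fetches from the mirror instead of the network.
$ spack install tensorflow
```

This only works if every artifact the build needs is known to Spack, which is exactly why downloads Bazel performs behind the scenes are a problem.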
@tgamblin Do you think this will be a problem?

Can Bazel instead be configured to depend on external versions of the dependencies? If so, then no. If not, then yeah, we probably have to think about that.
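For what it's worth, newer Bazel releases do expose mechanisms along these lines (hedged sketch; the repository name `zlib`, the target, and the paths are illustrative, and `--override_repository` expects a directory laid out as a Bazel repository, not a bare install prefix):

```
# Point an external repository at a local, pre-fetched checkout instead
# of letting Bazel download it:
$ bazel build --override_repository=zlib=/path/to/local/zlib-repo //tensorflow/...

# Newer Bazel versions can also consult a directory of pre-downloaded
# tarballs before touching the network:
$ bazel build --distdir=/path/to/spack/mirror //tensorflow/...
```

Whether either option existed for the Bazel version this PR targets would need checking.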
Bazel sounds very convenient, but it results in unrepeatable builds if we can't control or record how a package's dependencies are built. I think we'll have to add
ACK, however in this case (for this tensorflow package):
Have a look at this output:

```
# call ldd for each installed file, do some filtering on errors and count occurrences
$ find /path/to/spack/opt/linux-y/gcc-z/tensorflow-XYZ -exec ldd {} \; 2>&1 | egrep -v "not ((a dynamic executable)|(regular file))" | sed 's/(0x[0-9a-f]*)$//' | sort | uniq -c
     16 /lib64/ld-linux-x86-64.so.2
     16 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
      1 libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
     16 libgcc_s.so.1 => /path/to/spack/opt/linux-y/gcc-z/gcc-6.2.0-fhir7awimw3chugjsa25vrgn2xkf3lij/lib64/libgcc_s.so.1
     16 libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
     15 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
     16 libstdc++.so.6 => /path/to/spack/opt/linux-y/gcc-z/gcc-6.2.0-fhir7awimw3chugjsa25vrgn2xkf3lij/lib64/libstdc++.so.6
      1 libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1
     16 linux-vdso.so.1
```

This looks quite reasonable for a spack package. These are the shared objects that were installed:

```
$ find . -name "*.so"
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/factorization/python/ops/_factorization_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/factorization/python/ops/_clustering_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/layers/python/ops/_sparse_feature_cross_op.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/layers/python/ops/_bucketization_op.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/rnn/python/ops/_lstm_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/tensor_forest/python/ops/_inference_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/tensor_forest/python/ops/_training_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/tensor_forest/data/_data_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/ffmpeg/ffmpeg.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/metrics/python/ops/_set_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/linear_optimizer/python/ops/_sdca_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/quantization/kernels/_quantized_kernels.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/contrib/quantization/_quantized_ops.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/tensorflow/python/_pywrap_tensorflow.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/external/protobuf/internal/_api_implementation.so
./lib/python2.7/site-packages/tensorflow-0.10.0-py2.7.egg/external/protobuf/pyext/_message.so
```

In short: I think that the tensorflow package is being built in a reproducible/robust way.
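The manual `egrep | sed | sort | uniq -c` pipeline above can be mimicked in a few lines. The helper below is hypothetical (not part of the package); it classifies each library resolved in `ldd` output as Spack-provided or system-provided based on its path prefix:

```python
import re

def classify_ldd(output, spack_prefix="/path/to/spack/opt"):
    """Split resolved libraries from ldd output into Spack vs. system sets."""
    spack_libs, system_libs = set(), set()
    for line in output.splitlines():
        # Resolved lines look like: "libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x...)"
        # Lines without "=>" (e.g. the dynamic loader itself) are skipped.
        m = re.match(r"\s*(\S+)\s+=>\s+(\S+)", line)
        if not m:
            continue
        name, path = m.groups()
        (spack_libs if path.startswith(spack_prefix) else system_libs).add(name)
    return spack_libs, system_libs

sample = """\
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f00)
        libstdc++.so.6 => /path/to/spack/opt/linux-y/gcc-z/gcc-6.2.0/lib64/libstdc++.so.6 (0x00007f01)
"""
spack, system = classify_ldd(sample)
print(sorted(spack), sorted(system))  # → ['libstdc++.so.6'] ['libc.so.6']
```

Anything landing in the "system" set that is not a base glibc library would be a red flag for reproducibility.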
This would be a nice solution, but unfortunately I didn't find any options for this.
ACK, this is a problem.

I guess we could use

@tgamblin You mean after staging/extracting the tensorflow package? Except for the one above, all other dependencies are specified as plain URLs. If we can use the
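Since the remaining dependencies are plain URLs, one option (a hedged sketch, not part of this PR; the dependency name, URL, checksum, and destination are placeholders) would be to declare them with Spack's `resource()` directive in the tensorflow `package.py`, so that Spack fetches and mirrors them itself and stages them where Bazel expects them:

```
class Tensorflow(Package):
    ...
    # Hypothetical example: let Spack fetch an archive that Bazel would
    # otherwise download, and unpack it inside the stage directory.
    resource(name='some-bazel-dep',
             url='https://example.com/some-bazel-dep-1.0.tar.gz',
             sha256='<placeholder>',
             destination='bazel-deps')
```

Resources participate in `spack mirror create`, which would address the offline-cluster concern above.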
@tgamblin Alternatively we could patch tensorflow's

@muffgaga What is the status of this PR?

Are the dependencies rpath-linked? The dependencies are installed via Bazel.
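A quick way to answer the rpath question is to inspect the dynamic section of the installed `.so` files with `readelf -d` and look for `RPATH`/`RUNPATH` entries. The helper below is a hypothetical sketch (not from the PR) that pulls those entries out of `readelf` output:

```python
import re

def extract_rpath(readelf_output):
    """Return the RPATH/RUNPATH search directories found in `readelf -d` output."""
    rpaths = []
    for line in readelf_output.splitlines():
        # Matching lines look like:
        #  0x000000000000000f (RPATH)  Library rpath: [/some/dir:/other/dir]
        m = re.search(r"\((?:RPATH|RUNPATH)\)\s+Library r(?:un)?path:\s+\[([^\]]*)\]", line)
        if m:
            rpaths.extend(m.group(1).split(":"))
    return rpaths

sample = " 0x000000000000000f (RPATH)              Library rpath: [/path/to/spack/opt/lib:$ORIGIN/../lib]\n"
print(extract_rpath(sample))  # → ['/path/to/spack/opt/lib', '$ORIGIN/../lib']
```

If the Bazel-built `.so` files carry no rpath pointing into the Spack prefix, they only work through `LD_LIBRARY_PATH`, which Spack tries to avoid.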
On our last build (using spack from 2017-01-26) and
Current status: I'll provide an update for

Related to #3244

@muffgaga I added the 'up-for-grabs' tag as this was inactive for a while. Feel free to complete the PR and remove the tag. I bet you'll save the day for a lot of Spack users.

The TF guys keep adding packages that make this hard to package up.

Tensorflow now requires bazel >= 5.4, and the newer versions of bazel do not install with the existing template. I cannot figure out the problem; I think it has to do with these patches. I turned off these scripts, however that produces various different errors. With them all on: Has anyone tried to install the latest bazel?

I'd really like to see a version of this that doesn't require bazel, e.g. the one in #3244
Here's a preliminary PR for the tensorflow package (we're still working on the two non-default build variants).
Tensorflow is built using the bazel build tool.
Bazel pulls in dependencies (cf. WORKSPACE file) for the build process;
these dependencies are installed to the same prefix directory as tensorflow (i.e. similar to what python setuptools does).
Should we try to integrate all dependencies into spack first?
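For context, the WORKSPACE entries that make Bazel download dependencies look roughly like this (illustrative stanza only; the repository name, URL, checksum, and prefix are placeholders, and older TensorFlow versions used the built-in `native.http_archive` rather than the loaded rule):

```
# WORKSPACE (excerpt, illustrative)
http_archive(
    name = "some_dep",
    urls = ["https://example.com/some_dep-1.0.tar.gz"],
    sha256 = "<placeholder>",
    strip_prefix = "some_dep-1.0",
)
```

Each such stanza is a download that happens outside Spack's control, which is what the question about integrating all dependencies into Spack is getting at.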