generate-xml.sh fails to execute #12579

@Flamefire

Description of the problem / feature request:

I'm running some TensorFlow tests with Bazel, but on our multi-core POWER9 system they fail with e.g.

ERROR: /dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/TensorFlow/tensorflow-r2.4/tensorflow/core/platform/BUILD:1142:11: failed (Exit 1): generate-xml.sh failed: error executing command

I.e. there is no useful error message; Bazel simply failed to execute that script, which comes from the Bazel installation. I verified that the command Bazel printed (via bazel -s) runs correctly when executed by hand, so the script does exist.

I even modified that script in the Bazel sources to print something at the start, but that output doesn't show up. So it seems the script is not (yet?) created when Bazel tries to execute it. I therefore suspect a race condition or something similar, but am unable to verify this.
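To narrow down the "script not there yet" suspicion, a small check along these lines could be run from the execroot while the build is in progress. This is only a sketch: `check_script` is a hypothetical helper name, not part of Bazel, and it only covers three generic reasons a script can fail to launch (file missing, not executable, shebang interpreter not found). It cannot prove or rule out a race condition.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Bazel): report why a script might fail to launch.
check_script() {
  local script="$1" first="" interp=""

  if [ ! -e "$script" ]; then
    echo "missing"
    return 1
  fi
  if [ ! -x "$script" ]; then
    echo "not executable"
    return 1
  fi

  # Read the first line and, if it is a shebang, check that the interpreter exists.
  IFS= read -r first < "$script" || true
  case "$first" in
    '#!'*)
      interp="${first#??}"                      # drop the leading "#!"
      interp="${interp#"${interp%%[! ]*}"}"     # drop leading spaces
      interp="${interp%% *}"                    # keep only the interpreter path
      if [ ! -x "$interp" ]; then
        echo "interpreter not found: $interp"
        return 1
      fi
      ;;
  esac

  echo "ok"
}
```

From the execroot directory shown in the log, it would be called as `check_script external/bazel_tools/tools/test/generate-xml.sh`. Running it in a loop during the build might show whether the file is ever absent, assuming the window is long enough to catch.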

Any hints, ideas, ...?

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Sorry, the only thing I have is the command I use to test TF:

bazel \
  --output_base=/dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/tmptspeEg-bazel-tf/output_base \
  --install_base=/dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/tmptspeEg-bazel-tf/output_base/inst_base \
  --output_user_root=/dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/tmptspeEg-bazel-tf/output_user_root \
  --host_jvm_args=-Xms512m \
  --host_jvm_args=-Xmx4096m \
  test \
  --compilation_mode=opt \
  --config=opt \
  --subcommands \
  --verbose_failures \
  --config=noaws \
  --jobs=64 \
  --copt="-fPIC" \
  --distinct_host_configuration=false \
  --test_output=errors \
  --local_test_jobs=1 \
  --build_tests_only \
  --test_tag_filters='-no_gpu,-no_oss,-oss_serial,-benchmark-test,-no_oss_py37,-v1only' \
  --build_tag_filters='-no_gpu,-no_oss,-oss_serial,-benchmark-test,-no_oss_py37,-v1only' \
  -- \
  //tensorflow/core/... \
  //tensorflow/cc/... \
  //tensorflow/c/... \
  -//tensorflow/core:example_java_proto \
  -//tensorflow/core/example:example_protos_closure

What operating system are you running Bazel on?

RHEL 7.6

What's the output of bazel info release?

release 3.4.1- (@Non-Git)

If bazel info release returns "development version" or "(@Non-Git)", tell us how you built Bazel.

EXTRA_BAZEL_ARGS="--jobs=176 --host_javabase=@local_jdk//:jdk" ./compile.sh

Have you found anything relevant by searching the web?

No

Any other information, logs, or outputs that you want to share?

ERROR: /dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/TensorFlow/tensorflow-r2.4/tensorflow/core/platform/BUILD:1142:11:  failed (Exit 1): generate-xml.sh failed: error executing command 
  (cd /dev/shm/s3248973-EasyBuild/TensorFlow/2.4.0/fosscuda-2019b-Python-3.7.4/tmptspeEg-bazel-tf/output_base/execroot/org_tensorflow && \
  exec env - \
    PATH=/usr/bin:/bin \
    TEST_BINARY=tensorflow/core/platform/platform_strings_test \
    TEST_NAME=//tensorflow/core/platform:platform_strings_test \
    TEST_SHARD_INDEX=0 \
    TEST_TOTAL_SHARDS=0 \
  external/bazel_tools/tools/test/generate-xml.sh bazel-out/ppc-opt/testlogs/tensorflow/core/platform/platform_strings_test/test.log bazel-out/ppc-opt/testlogs/tensorflow/core/platform/platform_strings_test/test.xml 0 0)
Execution platform: @local_execution_config_platform//:platform

Labels

P1, team-OSS, type: bug
