Description of the bug:
When Bazel checks whether a local action cache entry is up-to-date, the check takes a long time for actions that have large tree artifacts among their inputs. The stack trace while Bazel is doing this work is:
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes([email protected]/Native Method)
at java.io.FileInputStream.read([email protected]/Unknown Source)
at com.google.common.io.ByteStreams.copy(ByteStreams.java:114)
at com.google.common.io.ByteSource.copyTo(ByteSource.java:257)
at com.google.common.io.ByteSource.hash(ByteSource.java:340)
at com.google.devtools.build.lib.vfs.FileSystem.getDigest(FileSystem.java:339)
at com.google.devtools.build.lib.unix.UnixFileSystem.getDigest(UnixFileSystem.java:452)
at com.google.devtools.build.lib.vfs.Path.getDigest(Path.java:690)
at com.google.devtools.build.lib.vfs.DigestUtils.manuallyComputeDigest(DigestUtils.java:194)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.constructFileArtifactValue(ActionMetadataHandler.java:564)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.constructFileArtifactValueFromFilesystem(ActionMetadataHandler.java:496)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.lambda$constructTreeArtifactValueFromFilesystem$0(ActionMetadataHandler.java:354)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler$$Lambda$1121/0x0000000800857040.visit(Unknown Source)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:411)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:414)
at com.google.devtools.build.lib.skyframe.TreeArtifactValue.visitTree(TreeArtifactValue.java:393)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.constructTreeArtifactValueFromFilesystem(ActionMetadataHandler.java:342)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.getTreeArtifactValue(ActionMetadataHandler.java:317)
at com.google.devtools.build.lib.skyframe.ActionMetadataHandler.getMetadata(ActionMetadataHandler.java:265)
at com.google.devtools.build.lib.actions.ActionCacheChecker.getMetadataOrConstant(ActionCacheChecker.java:566)
at com.google.devtools.build.lib.actions.ActionCacheChecker.getMetadataMaybe(ActionCacheChecker.java:579)
at com.google.devtools.build.lib.actions.ActionCacheChecker.validateArtifacts(ActionCacheChecker.java:207)
at com.google.devtools.build.lib.actions.ActionCacheChecker.mustExecute(ActionCacheChecker.java:541)
My theory is that this is slow because every file in the tree is visited and digested on a single thread: TreeArtifactValue.visitTree() runs sequentially when called from ActionMetadataHandler.constructTreeArtifactValueFromFilesystem().
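One rough way to gauge how much a parallel visitation could help, independent of Bazel, is to digest the same directory tree once sequentially and once with several workers. The sketch below is only a stand-in for that comparison: it uses `sha256sum` and `xargs -P` from GNU coreutils and findutils rather than Bazel's own digest code, and the function names `digest_tree_serial` and `digest_tree_parallel` are hypothetical.

```shell
#!/bin/bash
# Rough stand-in for tree-artifact digesting; this is NOT Bazel's code path.
# Assumes GNU find, xargs, sort, sha256sum and nproc are available.

# Digest every regular file under $1 with one sha256sum process at a time.
digest_tree_serial() {
  find "$1" -type f -print0 | xargs -0 -r sha256sum | LC_ALL=C sort -k 2
}

# Digest the same files with up to $2 concurrent sha256sum processes
# (defaults to the number of CPUs). One file per process keeps each
# output line a single small write; the final sort restores a stable order.
digest_tree_parallel() {
  local jobs="${2:-$(nproc)}"
  find "$1" -type f -print0 | xargs -0 -r -n 1 -P "$jobs" sha256sum | LC_ALL=C sort -k 2
}
```

Assuming the tree artifact from the reproduction ends up at `bazel-bin/r/d`, comparing `time digest_tree_serial bazel-bin/r/d >/dev/null` with `time digest_tree_parallel bazel-bin/r/d >/dev/null` gives a first impression. The numbers depend heavily on the page cache and the storage device, so they indicate only the raw hashing cost, not Bazel's actual behavior.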
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
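Run the following shell commands in an empty directory. They create a workspace with a BUILD file, a rule definition, and a generator script, then build the same target twice: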
touch WORKSPACE
mkdir -p r
cat > r/BUILD <<'EOF'
load(":r.bzl", "r")
r(name = "ta")
genrule(
    name = "c",
    srcs = [":ta"],
    outs = ["co"],
    cmd = "find $(location :ta) > $@",
)
sh_binary(
    name = "gen",
    srcs = ["gen.sh"],
)
EOF
cat > r/r.bzl << 'EOF'
def _r_impl(ctx):
    ta = ctx.actions.declare_directory("d")
    ctx.actions.run(
        outputs = [ta],
        inputs = [],
        executable = ctx.executable._gen,
        arguments = [ta.path],
    )
    return [DefaultInfo(files = depset([ta]))]

r = rule(
    implementation = _r_impl,
    attrs = {
        "_gen": attr.label(default = "//r:gen", executable = True, cfg = "exec"),
    },
)
EOF
cat > r/gen.sh <<'EOF'
#!/bin/bash
OUT="$1"
mkdir -p "$OUT"
# 10 x 10 x 100 = 10,000 files of 1 MiB each (10,000 MiB in total).
for i in $(seq 1 10); do
  for j in $(seq 1 10); do
    mkdir -p "$OUT/$i/$j"
    for k in $(seq 1 100); do
      #echo "$i $j $k" > "$OUT/$i/$j/$k"
      dd if=/dev/random of="$OUT/$i/$j/$k" bs=1024 count=1024
    done
  done
done
echo hello > "$OUT/hello"
EOF
chmod +x r/gen.sh
bazel build //r:c
# Shut down the server so that the next build has to validate the on-disk action cache.
bazel shutdown
bazel build //r:c # This is slow
Which operating system are you running Bazel on?
Linux @ Google
What is the output of bazel info release?
development version
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
Built from git commit de4746d.
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response