Description
Description of the problem / feature request:
I seem to have discovered a bug that is slowing down our remote execution builds. Our build produces some very large blobs that we don't want to upload to the remote cache because it would take far too long. This is an example of how we prevent those specific actions from running remotely and from uploading to the remote cache:
build --modify_execution_info=^(CppLink|ObjcLink)$=+no-remote
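To illustrate which actions the pattern above selects, here is a minimal Python sketch. The helper name `is_marked_no_remote` is hypothetical, and `CppCompile` and `CppLinkExtra` are sample inputs chosen only for illustration. Bazel evaluates the pattern with Java's regex engine, not Python's, but the two should behave the same for an expression this simple.

```python
import re

# Pattern copied from the --modify_execution_info flag above.
PATTERN = re.compile(r"^(CppLink|ObjcLink)$")


def is_marked_no_remote(mnemonic: str) -> bool:
    """Return True if the flag would add `no-remote` to this mnemonic.

    Hypothetical helper for illustration only; it is not part of Bazel.
    """
    return PATTERN.search(mnemonic) is not None


for name in ["CppLink", "ObjcLink", "CppCompile", "CppLinkExtra"]:
    print(name, is_marked_no_remote(name))
```

Because the pattern is anchored with `^` and `$`, only the two link mnemonics match; compile actions and mnemonics that merely start with `CppLink` are left untouched.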
With Bazel 5, we also enabled the use of a disk cache with our remote exec build. This is now causing long uploads for actions that are marked as no-remote.
What seems to be happening is the following:
- When using a combined cache, Bazel wrongly asks the remote cache whether it contains a certain blob, without respecting the no-remote tag.
- If the remote cache doesn't contain it, Bazel then uploads it even though it shouldn't.
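The expected behavior can be sketched as follows. This is not Bazel's actual implementation; it is a simplified Python model, and the function name `caches_to_consult` and its parameters are hypothetical. It assumes the documented meaning of the tags: `no-remote` and `no-remote-cache` keep an action out of the remote cache, while `no-cache` disables all caching, including the disk cache.

```python
def caches_to_consult(tags, disk_cache_enabled, remote_cache_enabled):
    """Return the set of caches an action should read from and write to.

    Simplified model of the expected behavior, not Bazel's real code.
    """
    tags = set(tags)
    caches = set()
    if "no-cache" in tags:
        # The action must not be cached anywhere.
        return caches
    if disk_cache_enabled:
        caches.add("disk")
    # Either of these tags should keep the action out of the remote cache.
    if remote_cache_enabled and not tags & {"no-remote", "no-remote-cache"}:
        caches.add("remote")
    return caches


print(caches_to_consult({"no-remote"}, True, True))
```

Under this model, an action tagged `no-remote` in a combined-cache build would only touch the disk cache, so no FindMissingBlobs call or upload would be issued against the remote cache for it.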
I've verified that disabling the disk cache or setting the action mnemonics to no-cache works around the issue.
Feature requests: what underlying problem are you trying to solve with this feature?
Building with remote execution and a disk cache should be efficient. This means being able to prevent the inputs/outputs of certain action mnemonics from being uploaded without disabling the disk cache completely.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
This needs a working remote exec cluster, but I believe it should be reproducible by building https://github.com/bazelbuild/rules_apple/blob/master/examples/ios/HelloWorld/BUILD with the following flags:
build --disk_cache=~/.cache/bazel_disk
build --modify_execution_info=^(CppLink|ObjcLink)$=+no-remote
build --remote_executor=your.remote.exec
build --remote_cache=your.remote.cache
What operating system are you running Bazel on?
macOS 12.1
What's the output of bazel info release?
release 5.0.0
Any other information, logs, or outputs that you want to share?
This is a snippet of our gRPC log with some of the interesting fields:
---------------------------------------------------------
metadata {
  tool_details {
    tool_name: "bazel"
    tool_version: "5.0.0"
  }
  action_mnemonic: "ObjcLink"
  target_id: "//:target"
}
status {
}
method_name: "build.bazel.remote.execution.v2.ContentAddressableStorage/FindMissingBlobs"
details {
  find_missing_blobs {
    request {
      blob_digests {
        hash: "1a0fe7ea9f46605fa721fd83d8498ccbae3b2bfa25c98420f429980076022c88"
        size_bytes: 324400280
      }
    }
    response {
      missing_blob_digests {
        hash: "1a0fe7ea9f46605fa721fd83d8498ccbae3b2bfa25c98420f429980076022c88"
        size_bytes: 324400280
      }
    }
  }
}
---------------------------------------------------------
metadata {
  tool_details {
    tool_name: "bazel"
    tool_version: "5.0.0"
  }
  action_mnemonic: "ObjcLink"
  target_id: "//:target"
}
status {
}
method_name: "google.bytestream.ByteStream/Write"
details {
  write {
    resource_names: "uploads/7c80b2be-c6fc-409a-accf-1715bc09417e/compressed-blobs/zstd/1a0fe7ea9f46605fa721fd83d8498ccbae3b2bfa25c98420f429980076022c88/324400280"
    resource_names: ""
    num_writes: 5084
    bytes_sent: 83291264
    response {
      committed_size: 83291264
    }
    offsets: 0
    finish_writes: 83291264
  }
}
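For a sense of scale, the numbers in the log above can be decoded with a short Python sketch. It assumes the resource name follows the Remote Execution API layout for compressed uploads (`uploads/{uuid}/compressed-blobs/{compressor}/{hash}/{size}`, where the size is the uncompressed one). The helper name `parse_upload_resource` is hypothetical.

```python
def parse_upload_resource(resource_name: str) -> dict:
    """Split a ByteStream upload resource name into its fields.

    Assumes the compressed-blobs layout; hypothetical helper for illustration.
    """
    parts = resource_name.split("/")
    # Locate "uploads" so that an optional instance-name prefix is tolerated.
    i = parts.index("uploads")
    uuid, kind, compressor, digest_hash, size = parts[i + 1:i + 6]
    if kind != "compressed-blobs":
        raise ValueError(f"unexpected resource kind: {kind}")
    return {
        "uuid": uuid,
        "compressor": compressor,
        "hash": digest_hash,
        "uncompressed_size": int(size),
    }


# Values copied from the ByteStream/Write log entry above.
RESOURCE = (
    "uploads/7c80b2be-c6fc-409a-accf-1715bc09417e/compressed-blobs/zstd/"
    "1a0fe7ea9f46605fa721fd83d8498ccbae3b2bfa25c98420f429980076022c88/324400280"
)
BYTES_SENT = 83291264

info = parse_upload_resource(RESOURCE)
ratio = BYTES_SENT / info["uncompressed_size"]
print(f"{info['uncompressed_size'] / 2**20:.0f} MiB uncompressed, "
      f"{BYTES_SENT / 2**20:.0f} MiB sent ({ratio:.0%} of the original)")
```

In other words, the ObjcLink output is a blob of roughly 309 MiB, of which about 79 MiB went over the wire after zstd compression, for an action that should not have been uploaded at all.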