[2.x] FileNotFoundException from corrupted/evicted remote CAS entry fails build instead of falling back to recompute #8889
Description
sbt version: 2.0.0-RC9
Scala version: 3.x
Problem
When using the built-in gRPC remote cache (Global / remoteCache := Some(uri("grpc://...")))
with bazel-remote as the CAS server, builds intermittently fail with:
[error] java.io.FileNotFoundException: .../target/out/value/sha256-/.json (No such file or directory)
22:58:06 [error] java.io.FileNotFoundException: /home/jenkins/agent/workspace/I_cap-commons_feature_sbt2_final/target/out/value/sha256-16b04cae19175fe44daff3c3be06dca3fa7af838a6f85add49fce6ae2afcb80d/48.json (No such file or directory)
failure_log_during_publish.txt
This happens when the remote cache server reports an AC (Action Cache) hit
but the corresponding CAS blob is missing, evicted, or possibly corrupted.
Rather than treating this as a cache miss and recomputing the task, SBT propagates the
exception and fails the build.
The failure is intermittent: it disappears on retry without any code changes,
which points to a transient infrastructure issue rather than a build logic error.
Steps to reproduce
Background:
Java Azul JVM 17.0.18
SBT version 2.13.16
Running on Jenkins CI environment
The project is a large monorepo of several projects, some of which are libraries and others microservices, with deep interdependencies between them.
- Configure a remote cache backed by bazel-remote (running on a pod - the specific settings can be provided if needed)
- Evict or corrupt a CAS entry while leaving the AC entry intact
- Run a build: java.io.FileNotFoundException is thrown from ActionCache
Expected behavior
SBT should treat a missing/unreadable CAS blob as a cache miss and recompute
the task locally, just as it does for other cache miss scenarios.
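The expected behavior can be sketched roughly as follows. This is a minimal illustration, not sbt's actual ActionCache code: `CacheFallbackSketch`, `readBlob`, `lookup`, and `cached` are hypothetical names, and a plain file read stands in for the CAS blob sync.

```scala
import java.nio.file.{Files, Path}
import scala.util.control.NonFatal

// Hypothetical stand-ins; these are NOT sbt's real ActionCache types.
object CacheFallbackSketch {
  // Reads a cached value from disk; throws an IOException if the file is missing.
  def readBlob(path: Path): String =
    new String(Files.readAllBytes(path), "UTF-8")

  // Guarded lookup: any non-fatal failure is reported as a cache miss (None).
  def lookup(path: Path): Option[String] =
    try Some(readBlob(path))
    catch { case NonFatal(_) => None }

  // Cache-or-compute: fall back to the local computation on a miss.
  def cached(path: Path)(compute: => String): String =
    lookup(path).getOrElse(compute)
}
```

With a path that does not exist, `cached(path)(...)` evaluates the by-name `compute` argument instead of throwing, which is the fallback this issue asks for. `NonFatal` deliberately lets fatal errors such as `OutOfMemoryError` propagate.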
Notes
I've identified three unguarded syncBlobs call sites in ActionCache.scala
and prepared a fix with regression tests. Will open a PR alongside this issue.
Issue flow
flowchart TD
A([SBT task evaluation]) --> B{Check local\nsymlink\nfast-path}
B -- symlink exists --> C[readFromSymlink\nRead value JSON]
B -- no symlink --> D[findActionResult\nQuery Action Cache]
C --> E{AC hit?}
E -- yes --> F[syncBlobs\nfast-path]
E -- no --> D
D --> G{AC hit?}
G -- no --> K([organicTask\nCompute locally])
G -- yes --> H{Value inline\nor via blob?}
H -- inline --> I[syncBlobs\noutput files only]
H -- blob --> J[syncBlobs\nread value from path]
subgraph BEFORE ["❌ Before fix — unguarded"]
F -- FileNotFoundException --> ERR1([💥 Build FAILS])
I -- FileNotFoundException --> ERR2([💥 Build FAILS])
J -- FileNotFoundException --> ERR3([💥 Build FAILS])
end
subgraph AFTER ["✅ After fix — NonFatal catch"]
F -- NonFatal catch --> MISS1[Returns None\ncache miss]
I -- NonFatal catch --> MISS2[Returns Left-None\ncache miss]
J -- NonFatal catch --> MISS2
MISS1 --> D
MISS2 --> K
end
K --> L[Run action\ncalled++]
L --> M[store.put\nWrite AC entry]
M --> N[syncBlobs\nWrite output files]
subgraph ORGANIC ["organicTask — also guarded"]
N -- NonFatal catch --> LOG[Debug log\nSkipping cache storage]
LOG --> OK
end
N -- success --> OK([✅ Return result])
style ERR1 fill:#ff4d4d,color:#fff
style ERR2 fill:#ff4d4d,color:#fff
style ERR3 fill:#ff4d4d,color:#fff
style BEFORE fill:#fff0f0,stroke:#ff4d4d
style AFTER fill:#f0fff0,stroke:#2ecc71
style ORGANIC fill:#f0f8ff,stroke:#3498db
style OK fill:#2ecc71,color:#fff
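The write-side guard in the ORGANIC subgraph above could look something like the sketch below. It assumes a simplified shape: `OrganicTaskSketch`, `runAndStore`, and its `action`, `store`, and `log` parameters are hypothetical and do not mirror the real `organicTask` signature.

```scala
import scala.util.control.NonFatal

// Hypothetical sketch; not the real organicTask in ActionCache.scala.
object OrganicTaskSketch {
  // Runs the action locally, then attempts to store the result in the cache.
  // A non-fatal storage failure is logged and swallowed, so the computed
  // result is still returned to the caller.
  def runAndStore[A](action: () => A, store: A => Unit, log: String => Unit): A = {
    val result = action()
    try store(result)
    catch {
      case NonFatal(e) => log(s"Skipping cache storage: ${e.getMessage}")
    }
    result
  }
}
```

The design choice here is that a cache write is an optimization: if it fails, the build already has a valid locally computed result, so failing the task would only turn a cache problem into a build problem.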