
[9.1.0] Support remote cache CDC#28903

Merged
iancha1992 merged 1 commit into bazelbuild:release-9.1.0 from tyler-french:tfrench/cdc-9.1.0
Mar 25, 2026

Conversation

@tyler-french
Contributor

This PR enables support for content-defined chunking (FastCDC) for large uploads/downloads to the remote cache. See #28437 for more details.

RELNOTES[NEW]: Added `--experimental_remote_cache_chunking` flag to read and write large blobs to/from the remote cache in chunks. Requires server support.

@tyler-french tyler-french requested a review from a team as a code owner March 5, 2026 18:42
@github-actions github-actions Bot added team-Remote-Exec Issues and PRs for the Execution (Remote) team awaiting-review PR is awaiting review from an assigned reviewer labels Mar 5, 2026


@iancha1992 iancha1992 added this to the 9.1.0 release blockers milestone Mar 5, 2026
@iancha1992 iancha1992 requested a review from tjgq March 5, 2026 21:43
@iancha1992
Member

Shouldn't we merge the changes to the master first? @tjgq @tyler-french

@tyler-french
Contributor Author

> Shouldn't we merge the changes to the master first? @tjgq @tyler-french

Yes. I just created these PRs because there were merge conflicts, and the main PR is already approved.

@iancha1992
Member

iancha1992 commented Mar 12, 2026

We'll merge these changes after #28437 is merged, and then cherry-pick it from there. You'll still be acknowledged in the 9.1.0 release notes even if the cherry-pick bot creates the cherry-pick PR. We appreciate your work on #28437. Thank you so much!

@iancha1992 iancha1992 closed this Mar 12, 2026
@github-actions github-actions Bot removed the awaiting-review PR is awaiting review from an assigned reviewer label Mar 12, 2026
@tyler-french
Contributor Author

> We'll merge these changes after #28437 is merged, and then cherry-pick it from there. You'll still be acknowledged in the 9.1.0 release notes even if the cherry-pick bot creates the cherry-pick PR. We appreciate your work on #28437. Thank you so much!

@iancha1992 Can the bot resolve merge conflicts? I only created the PRs because I expect the cherry-pick to fail without manual intervention.

@iancha1992
Member

> We'll merge these changes after #28437 is merged, and then cherry-pick it from there. You'll still be acknowledged in the 9.1.0 release notes even if the cherry-pick bot creates the cherry-pick PR. We appreciate your work on #28437. Thank you so much!

> @iancha1992 Can the bot resolve merge conflicts? I only created the PRs because I expect the cherry-pick to fail without manual intervention.

Nice! Then we'll reopen this and merge it if there's a conflict. @tyler-french

copybara-service Bot pushed a commit to bazelbuild/intellij that referenced this pull request Mar 17, 2026
**TLDR: This PR enables support for content-defined chunking (FastCDC) for large uploads/downloads to remote cache, saving ~40% on storage and upload bandwidth, and making builds faster by deduplicating similar artifacts across builds.**

RELNOTES[NEW]: Added `--experimental_remote_cache_chunking` flag to read and write large blobs to/from the remote cache in chunks. Requires server support.

## Motivation

Actions like `GoLink` and `CppLink` produce very large output files that are often similar between builds. A small source change can cause a cache miss, wasting storage, bandwidth, and time on nearly-identical artifacts.

Content-Defined Chunking (CDC) addresses this by splitting files at content-determined cut points. Because cut points are derived from the file content itself, small changes, even ones that shift bytes around, tend to affect only a few chunks. This makes action outputs effectively incremental: even though the action must re-run, the upload, download, and storage costs shrink dramatically.
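To illustrate the resync property described above, here is a minimal content-defined chunker using a gear rolling hash, loosely in the spirit of FastCDC. The gear table, size bounds, and mask are illustrative assumptions, not Bazel's actual parameters:

```python
# Minimal content-defined chunking sketch with a gear rolling hash.
# Parameters below are illustrative, not Bazel's real configuration.
import random

rng = random.Random(42)
GEAR = [rng.getrandbits(64) for _ in range(256)]  # fixed per-byte random table

MIN_SIZE, AVG_SIZE, MAX_SIZE = 2048, 8192, 65536
MASK = AVG_SIZE - 1  # power-of-two mask -> a cut roughly every AVG_SIZE bytes

def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data at cut points derived from the content itself."""
    chunks, start = [], 0
    while start < len(data):
        h, pos = 0, start
        while pos < len(data):
            h = ((h << 1) + GEAR[data[pos]]) & (2**64 - 1)
            pos += 1
            size = pos - start
            if size >= MAX_SIZE or (size >= MIN_SIZE and (h & MASK) == 0):
                break
        chunks.append(data[start:pos])
        start = pos
    return chunks

# Inserting a few bytes near the front shifts every later byte, yet only
# the chunk containing the edit changes; subsequent cut points resync.
data = bytes(rng.getrandbits(8) for _ in range(200_000))
edited = data[:100] + b"PATCH" + data[100:]
original, patched = cdc_chunks(data), cdc_chunks(edited)
reused = set(original) & set(patched)
```

Running this shows most chunks of the edited file are byte-identical to chunks of the original, which is exactly what lets the cache deduplicate them.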

## Results

Benchmarked across the last 50 commits of the BuildBuddy repo (server and client on the same host):

| Scenario | Upload | Download | RPCs | Disk Cache | Avg Build Time |
|---|---|---|---|---|---|
| chunking + disk cache | 52.0 GB | 0 B | 626K | 146.6 GB | 55s |
| chunking, no disk cache | 49.2 GB | 343.2 GB | 4.1M | — | 54s |
| no chunking + disk cache | 85.6 GB | 0 B | 273K | 246.5 GB | 100s |
| no chunking, no disk cache | 89.7 GB | 343.8 GB | 2.5M | — | 97s |

Key takeaways:
- **~40% less data uploaded** (52 GB vs 90 GB)
- **~40% smaller disk cache** (147 GB vs 247 GB)
- Download size is mostly unchanged (~0.2% increase) because we don't yet store downloaded chunks in the output base. Using a disk cache is recommended for full benefit; output-base chunk reuse is planned.
- RPC count increases as expected since requests become smaller and more granular.
- **Faster builds** (the speedup depends on factors such as asynchronous cache uploads, compression, and network speed)

Additional benefits: better load balancing across distributed clusters (fewer long-running RPCs) and more granular retries on unstable networks.

## Try It Out

Anyone can try chunking today using BuildBuddy:

1. Sign up for a free account at [buildbuddy.io](https://buildbuddy.io)
2. Get an API key with write access
3. Use the Bazel fork from bazelbuild/bazel#28903
4. Build!

```shell
USE_BAZEL_VERSION="tyler-french/9.1.0-cdc" bazel build //... \
  --experimental_remote_cache_chunking \
  --remote_header=x-buildbuddy-cdc-enabled=true \
  --remote_cache=grpcs://remote.buildbuddy.io
```

## How It Works

**Write path:**
1. Check if blob exceeds the chunking threshold.
2. Run FastCDC to compute chunk boundaries.
3. Call `FindMissingBlobs` to identify which chunks the server already has.
4. Upload only the missing chunks.
5. Call `SpliceBlob` to register the blob-to-chunks mapping on the server.

**Read path:**
1. Check if blob exceeds the chunking threshold.
2. Call `SplitBlob` to get the chunk list for this blob.
3. Download and reassemble the chunks.

If `--disk_cache` is enabled, previously downloaded chunks are served locally.
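The write and read paths above can be sketched end-to-end against an in-memory stand-in for the cache. `FakeCAS`, `fixed_chunks`, and the tiny threshold are hypothetical stand-ins for the real RPCs (`FindMissingBlobs`, `SpliceBlob`, `SplitBlob`) and the FastCDC chunker, not the actual API:

```python
# Hedged sketch of the chunked write/read paths against a fake CAS server.
# All names and parameters here are illustrative assumptions.
import hashlib

CHUNK_THRESHOLD = 4  # tiny threshold so this example always chunks

def digest(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

def fixed_chunks(blob: bytes, size: int = 8) -> list[bytes]:
    # Stand-in chunker; the real implementation uses FastCDC cut points.
    return [blob[i:i + size] for i in range(0, len(blob), size)]

class FakeCAS:
    """In-memory stand-in for a remote cache that supports splicing."""
    def __init__(self):
        self.store = {}    # chunk digest -> bytes
        self.splices = {}  # blob digest -> ordered list of chunk digests

    def find_missing_blobs(self, digests):
        return [d for d in digests if d not in self.store]

    def upload(self, chunk: bytes):
        self.store[digest(chunk)] = chunk

    def splice_blob(self, blob_digest, chunk_digests):
        self.splices[blob_digest] = list(chunk_digests)

    def split_blob(self, blob_digest):
        return self.splices[blob_digest]

def chunked_write(cas: FakeCAS, blob: bytes) -> int:
    """Upload only missing chunks; returns how many were uploaded."""
    chunks = [blob] if len(blob) <= CHUNK_THRESHOLD else fixed_chunks(blob)
    digests = [digest(c) for c in chunks]
    missing = set(cas.find_missing_blobs(digests))      # step 3
    for d, c in zip(digests, chunks):
        if d in missing:
            cas.upload(c)                               # step 4
    cas.splice_blob(digest(blob), digests)              # step 5
    return len(missing)

def chunked_read(cas: FakeCAS, blob_digest: str) -> bytes:
    chunk_digests = cas.split_blob(blob_digest)         # SplitBlob
    return b"".join(cas.store[d] for d in chunk_digests)
```

A second write of the same (or a similar) blob finds all its chunks already present and uploads nothing, which is where the bandwidth savings come from.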

Closes #28437.

PiperOrigin-RevId: 885108353
copybara-service Bot pushed a commit that referenced this pull request Mar 17, 2026
@iancha1992 iancha1992 reopened this Mar 17, 2026
@github-actions github-actions Bot added the awaiting-review PR is awaiting review from an assigned reviewer label Mar 17, 2026
@iancha1992
Member

@tyler-french @tjgq Do we need to make changes to this PR as per #28437 (comment)?

tyler-french added a commit to tyler-french/bazel that referenced this pull request Mar 17, 2026
@tyler-french tyler-french changed the title [9.1.0] Support remote cache CDC WIP: [9.1.0] Support remote cache CDC Mar 18, 2026
@tyler-french tyler-french changed the title WIP: [9.1.0] Support remote cache CDC [9.1.0] Support remote cache CDC Mar 18, 2026
@tyler-french tyler-french changed the title [9.1.0] Support remote cache CDC wip [9.1.0] Support remote cache CDC Mar 18, 2026
@tyler-french tyler-french marked this pull request as draft March 18, 2026 08:03
@tyler-french tyler-french marked this pull request as ready for review March 18, 2026 16:59
@tyler-french tyler-french changed the title wip [9.1.0] Support remote cache CDC [9.1.0] Support remote cache CDC Mar 18, 2026
@iancha1992
Member

@tjgq I think this is ready now. Could you please take a look? Thanks @tyler-french!

@iancha1992 iancha1992 enabled auto-merge March 18, 2026 17:37
@iancha1992 iancha1992 removed this from the 9.1.0 release blockers milestone Mar 18, 2026
@iancha1992 iancha1992 added this pull request to the merge queue Mar 25, 2026
Merged via the queue into bazelbuild:release-9.1.0 with commit 8b842c5 Mar 25, 2026
46 checks passed
@github-actions github-actions Bot removed the awaiting-review PR is awaiting review from an assigned reviewer label Mar 25, 2026
