Skip to content

Conversation

@AleksanderGondek
Copy link

This change allows the repo_contents_cache to be set to a path within the current WORKSPACE, as long as said path is excluded from Bazel evaluation via .bazelignore entry.

Closes #26384

@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Jul 10, 2025
@iancha1992 iancha1992 added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label Jul 10, 2025
@AleksanderGondek AleksanderGondek force-pushed the fix/allow-repo-contents-cache-in-workspace-if-bazelignored branch from 258d469 to f473eed Compare July 13, 2025 17:05
@Wyverald
Copy link
Member

Could you add a test for this? / Have you verified this works?

One way to make sure is to remove this line: https://cs.opensource.google/bazel/bazel/+/master:scripts/bootstrap/bootstrap.sh;drc=e6979d3f878ddd663f15a10b09407ecbb9380fee;l=42 (or set it to some other value), create a .bazelignore file around here to exclude the repo contents cache dir, and make sure //src/test/shell/bazel:bazel_bootstrap_distfile_test passes.

@AleksanderGondek AleksanderGondek force-pushed the fix/allow-repo-contents-cache-in-workspace-if-bazelignored branch 2 times, most recently from bef7328 to 0ee146d Compare July 15, 2025 13:40
@AleksanderGondek
Copy link
Author

AleksanderGondek commented Jul 15, 2025

I have given a shot at implementing dedicated test for this behavior (allowing repo_contents_cache when it's set to path inside WORKSPACE and mentioned in .bazelignore; bailing out if its set to same path but not .bazelignored.

I did not follow Your particular suggestion of testing, because I felt like it would not be as clean and direct as dedicated check. Of course, if there is something I missed / not considered, I will be very happy to fix / change that.

I have also tested the changes by running the binary built from this branch, on my WORKSPACES that use the pattern of output_user_root being placed within the repo itself (various flavors of Linux x86-64).

@AleksanderGondek AleksanderGondek force-pushed the fix/allow-repo-contents-cache-in-workspace-if-bazelignored branch from 0ee146d to 08e6c15 Compare July 18, 2025 07:05
This change allows the `repo_contents_cache` to be set to a path
within the current WORKSPACE, as long as said path is excluded
from Bazel evaluation via `.bazelignore` entry.
This changeset aims to validate the expected behavior
of `--repo_contents_cache` when set to path within
current WORKSPACE; namely to fail if it is not
.bazelignored and to work if it is.
@AleksanderGondek AleksanderGondek force-pushed the fix/allow-repo-contents-cache-in-workspace-if-bazelignored branch from 08e6c15 to e7aacc7 Compare July 18, 2025 08:56
@AleksanderGondek
Copy link
Author

Hey @Wyverald ,
I do not mean to be inconsiderate of Your time, so please treat this as a very soft ping :)
May You please help me with getting this change reviewed or would You recommend using [email protected] ?

@Wyverald
Copy link
Member

Sorry about the delay! Thanks for adding that test; however, what I'm worried about is not the code you added not functioning correctly, but rather whether --repo_contents_cache in a .bazelignored directory actually works. That's why I suggested the bootstrap test change -- that's actually where I first discovered that --repo_contents_cache in the main repo might not work.

@AleksanderGondek
Copy link
Author

Thanks for explanation - I will move the PR into draft mode and ensure things work as they should before re-opening.

@AleksanderGondek AleksanderGondek marked this pull request as draft July 26, 2025 16:57
rdesgroppes added a commit to DataDog/datadog-agent that referenced this pull request Sep 2, 2025
### What does this PR do?
Now `bazelisk` is available on all our CI executors
([macOS](DataDog/ci-platform-machine-images#395)
runners,
[Linux](DataDog/datadog-agent-buildimages#951) &
[Windows](DataDog/datadog-agent-buildimages#953)
containers), this change secures an initial part of our `bazel` setup
while providing reusable job templates to ease caching in CI.

Practically, it adds a handful of jobs focusing on **building `bazel`
dependencies** in the corresponding GitLab `deps_build` stage.

Overall, it consists in:
1. verifying **`bazelisk` properly bootstraps `bazel` across all
platforms**, (primary scope of the PR)
2. extracting initial `bazel:`-prefixed CI job templates and dogfooding
them, (i.e. by building deps)
3. binding `bazelisk`/`bazel` caches to GitLab caching capabilities
(runner-based as of now): installed binaries, "repository cache", "repo
contents cache", and "disk cache" are all saved/restored correctly
to/from GitLab while honoring OS/architecture boundaries,
4. marking in-workspace cache in both `.bazelignore` and `.gitignore`,
5. adjusting code/job ownership accordingly.

### Motivation
Securing some initial `bazel` configuration in CI :
- ensures all platforms are kept in the radar over next iterations,
- prevents regressions over next iterations,
- establishes foundations for future jobs.

### Possible Drawbacks / Trade-offs
- on Windows, compilation errors made be descope dependencies being
built to **only `bzip2`**. This will be of course addressed in a
subsequent PR,
- GitLab caching is still runner-based for the time being, which can be
revisited later. (like everything)

### Additional Notes
Main addressed challenges:
- cache location conflicts[^1]: externalized "repo contents cache"
through `tools/bazel*` wrappers:
  - bazelbuild/bazel#26384
  - bazelbuild/bazel#26522
  - bazelbuild/bazel#26773
  - bazelbuild/bazel#26802
- macOS runners:
  - `bzip2` dependency: fetch from a reachable source:
    - [x] #40219
- Windows:
- `bazel` spawn strategy: fallback to a permissive strategy since
`sandboxed` is unsupported
    - [x] #40328
- `tools/bazel` wrapper: fallback to batch (`tools/bazel.bat`), since
`bash` is discouraged by `bazel` in this case and `tools/bazel.ps1`
poses detection problems,
  - long paths were supported in containers but not on runners:
    - [x] DataDog/ci-platform-machine-images#429
  - recursive symlinks: use `robocopy` instead of `copy`/`move`/`xcopy`.
  
[^1]:
> ERROR: The repo contents cache
[/path/to/datadog-agent/.cache/repo_contents] is inside the workspace
[/path/to/datadog-agent]. This can cause spurious failures. Disable the
repo contents cache with `--repo_contents_cache=`, or specify
`--repo_contents_cache=<path outside the workspace>`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting-review PR is awaiting review from an assigned reviewer team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

In-project --output_user_root triggers the repo contents cache error, even though the target directory is excluded with .bazelignore

3 participants