Add strip_components to extract/download_and_extract `http_arch…#29281
willstranton wants to merge 3 commits into bazelbuild:master
If the source archive URL is deterministic, shouldn't the exact prefix already be known?
Yes, that's true, but it's inconvenient to have to examine an archive to determine that exact prefix. This pull request is a "quality of life" improvement. As you point out, it's not a "must have". Summarizing from the community:
I remember having to update dependencies manually before BCR. You had to update the tar archive AND the prefix that was stripped.
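To make the maintenance point above concrete, here is a minimal Python sketch (not Bazel code) that contrasts exact-prefix stripping with component-count stripping when an archive's top-level directory changes between versions. The function names and the `foo-1.2.x` paths are hypothetical. Returning `None` for a non-matching entry is a simplification made for this illustration, not a statement about how Bazel reports the problem.

```python
from pathlib import PurePosixPath
from typing import Optional


def strip_prefix(entry: str, prefix: str) -> Optional[str]:
    """Model of exact-prefix stripping: the caller must know the directory name."""
    try:
        return str(PurePosixPath(entry).relative_to(prefix))
    except ValueError:
        # The entry does not live under the given prefix.
        return None


def strip_components(entry: str, count: int) -> Optional[str]:
    """Model of count-based stripping, in the spirit of tar --strip-components."""
    parts = PurePosixPath(entry).parts
    if len(parts) <= count:
        # Nothing would remain; this sketch skips the entry (an assumption).
        return None
    return "/".join(parts[count:])


# A version bump renames the top-level directory from foo-1.2.3 to foo-1.2.4.
old_entry = "foo-1.2.3/src/main.c"
new_entry = "foo-1.2.4/src/main.c"

print(strip_prefix(old_entry, "foo-1.2.3"))  # prefix matches the old archive
print(strip_prefix(new_entry, "foo-1.2.3"))  # stale prefix no longer matches
print(strip_components(new_entry, 1))        # count-based stripping is unaffected
```

With a count instead of a name, only the archive URL (and checksum) would need to change on an update, which is the convenience the comment above describes.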
OK, thanks for the context! If we do this, we should also backport this to Bazel 8 & 9, so that modules can keep the compatibility with multiple LTS releases when using this feature.
Pull request overview
Adds a new `strip_components` integer attribute/parameter (similar to `tar --strip-components`) to `http_archive`, `repository_ctx.download_and_extract`, and `repository_ctx.extract`, enabling prefix stripping without knowing an exact directory name.
Changes:
- Introduces `strip_components` plumbing from Starlark (`http_archive`, `download_and_extract`, `extract`) down to the Java decompressor layer.
- Implements component stripping during extraction for `.zip`, `.7z`, and tar-based archives.
- Adds/updates integration + unit tests covering component stripping and rename-ordering behavior.
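As a rough illustration of what component stripping during extraction means for a tar-based archive, the following Python sketch builds a small `.tar.gz` in memory and returns its files with the leading path components removed. It uses only the standard `tarfile` module and is not the PR's Java implementation. The helper names and sample paths are hypothetical, and skipping entries that have too few components follows `tar`'s behavior as an assumption.

```python
import io
import tarfile
from pathlib import PurePosixPath
from typing import Dict


def build_sample_tar() -> bytes:
    """Create an in-memory .tar.gz with a single top-level directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in [("pkg-1.0/README", b"hello"),
                           ("pkg-1.0/src/lib.c", b"int x;")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_stripped(archive: bytes, count: int) -> Dict[str, bytes]:
    """Return {stripped path: contents} for every regular file in the archive."""
    out: Dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts
            if len(parts) <= count:
                # Entry has no path left after stripping; skip it (assumption).
                continue
            out["/".join(parts[count:])] = tar.extractfile(member).read()
    return out


files = extract_stripped(build_sample_tar(), 1)
print(sorted(files))
```

The same per-entry path rewrite would apply to `.zip` and `.7z` entries; only the archive reader differs.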
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tools/build_defs/repo/http.bzl | Adds strip_components attr, enforces mutual exclusivity with strip_prefix, passes through to download_and_extract. |
| src/main/java/com/google/devtools/build/lib/vfs/PathFragment.java | Adds PathFragment.stripComponents(int) utility used by decompressors. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/starlark/StarlarkBaseExternalContext.java | Adds strip_components params to download_and_extract/extract and wires into decompression. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/decompressor/DecompressorDescriptor.java | Adds stripComponents field + builder validation for mutual exclusivity with prefix. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/decompressor/ZipDecompressor.java | Applies component stripping to zip entry paths before extraction. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/decompressor/SevenZDecompressor.java | Applies component stripping to 7z entry paths before extraction. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/decompressor/CompressedTarFunction.java | Applies component stripping to tar entry paths before extraction. |
| src/main/java/com/google/devtools/build/lib/bazel/repository/decompressor/CompressedFunction.java | Updates docs to note stripComponents is ignored for single-file compressor formats. |
| src/test/shell/bazel/external_integration_test.sh | Adds http_archive integration coverage for strip_components (tar/zip + add_prefix). |
| src/test/java/com/google/devtools/build/lib/vfs/PathFragmentTest.java | Adds unit tests for PathFragment.stripComponents. |
| src/test/java/com/google/devtools/build/lib/bazel/repository/starlark/StarlarkBaseExternalContextTest.java | Updates test calls for new downloadAndExtract signature. |
| src/test/java/com/google/devtools/build/lib/bazel/repository/decompressor/ZipDecompressorTest.java | Adds zip decompression tests for strip_components (+ rename ordering). |
| src/test/java/com/google/devtools/build/lib/bazel/repository/decompressor/SevenZDecompressorTest.java | Adds 7z decompression tests for strip_components (+ rename ordering + strip-all). |
| src/test/java/com/google/devtools/build/lib/bazel/repository/decompressor/CompressedTarFunctionTest.java | Adds tar.gz decompression tests for strip_components (+ rename ordering). |
| src/test/tools/bzlmod/MODULE.bazel.lock | Updates lockfile digests due to test/module changes. |
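The table mentions that the two options are validated as mutually exclusive. A minimal Python sketch of that kind of check is shown here. The class name `ExtractOptions`, the defaults (`""` and `0` meaning "unset"), and the error text are all assumptions made for illustration; they are not taken from `DecompressorDescriptor` or `http.bzl`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractOptions:
    """Hypothetical option holder; empty string / zero are treated as unset."""
    strip_prefix: str = ""
    strip_components: int = 0

    def __post_init__(self) -> None:
        # Reject configurations that set both ways of stripping at once.
        if self.strip_prefix and self.strip_components:
            raise ValueError(
                "only one of strip_prefix and strip_components may be set")


# Either option alone is accepted.
ExtractOptions(strip_prefix="pkg-1.0")
ExtractOptions(strip_components=1)

# Setting both is rejected.
try:
    ExtractOptions(strip_prefix="pkg-1.0", strip_components=1)
except ValueError as err:
    print("rejected:", err)
```

Validating at construction time keeps the check in one place, so each decompressor can assume at most one of the two options is active.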
Thanks, please run
Let me know when this is fixed, and I will add the import label.
…ive`

The `strip_components` attribute functions similar to `tar --strip-components`:

> Strip NUMBER leading components from file names on extraction.

This is an alternative to the existing `strip_prefix` attribute, which required knowing the exact prefix to be stripped. Only one of the two attributes (`strip_prefix`, `strip_components`) can be set at one time.

Fixes bazelbuild#28879

RELNOTES[NEW]: Adds the `strip_components` attribute to `extract`/`download_and_extract`/`http_archive` to allow stripping of path components when extracting files.
CI now passing.
@bazel-io fork 8.7.0
@bazel-io fork 9.1.0
This PR updates the documentation for `http_archive` to include the new `strip_components` attribute and clarifies the mutual exclusivity between `strip_prefix` and `strip_components`. Fixes bazelbuild#29281
This PR updates the documentation for the `http_archive` rule to include the newly added `strip_components` attribute. This attribute allows users to strip a specified number of leading path components from extracted files, offering an alternative to `strip_prefix`. Additionally, the documentation for `strip_prefix` has been updated to clarify that only one of `strip_prefix` or `strip_components` can be used. Original PR: bazelbuild#29281
This PR documents the new `strip_components` attribute for `http_archive` and the `strip_components` parameter for `ctx.download_and_extract` and `ctx.extract`. It also clarifies that `strip_prefix` and `strip_components` are mutually exclusive. This documentation is sourced from the code changes in bazelbuild#29281.
… .extract (placeholder) This PR introduces `strip_components` to `repository_ctx.download_and_extract` and `repository_ctx.extract` functions, allowing users to specify the number of leading path components to strip during extraction. This new attribute is mutually exclusive with `strip_prefix`. Original PR: bazelbuild#29281 **Note**: Due to persistent issues in programmatically identifying the correct documentation files within the Bazel repository (404 errors on multiple plausible paths, search API rate limits, and incomplete `list_docs_in_repo` results), the changes have been applied to placeholder files. A manual review is required to integrate this content into the definitive documentation for `repository_ctx` methods.
This PR introduces the `strip_components` attribute for `http_archive` and similar repository rules, allowing users to strip a specified number of leading path segments from extracted archives. It also clarifies that `strip_components` and `strip_prefix` are mutually exclusive. See original PR: bazelbuild#29281
This PR adds documentation for the new `strip_components` attribute for `repository_ctx.download_and_extract` and `repository_ctx.extract`, introduced in bazelbuild#29281. Since the original documentation file for `repository_ctx` could not be located, this PR creates new, minimal documentation files at the location suggested by broken links in the existing documentation.
…ion methods This PR updates the documentation for repository rules to reflect the addition of the `strip_components` attribute to `http_archive`, `repository_ctx.extract()`, and `repository_ctx.download_and_extract()`. The `http_archive` example now includes `strip_components`, and a note has been added to the `repository_ctx` section clarifying the use of `strip_components` and its mutual exclusivity with `strip_prefix` for extraction functions. This update corresponds to the changes in bazelbuild#29281.
This PR updates the documentation for `http_archive` to include the new `strip_components` attribute, as introduced in bazelbuild#29281. It also clarifies that `strip_components` and `strip_prefix` are mutually exclusive.
@willstranton Can you look into backporting this to 9.x and perhaps 8.x? The auto cherry-pick process failed #29323 (comment)
bazelbuild#29281)

Add `strip_components` to `extract`/`download_and_extract` `http_archive`

### Description
The `strip_components` attribute functions similar to `tar --strip-components`:

> Strip NUMBER leading components from file names on extraction.

This is an alternative to the existing `strip_prefix` attribute, which required knowing the exact prefix to be stripped. Only one of the two attributes (`strip_prefix`, `strip_components`) can be set at one time.

### Motivation
See bazelbuild#28879

### Build API Changes
> 1. Has this been discussed in a design doc or issue? (Please link it)

See bazelbuild#28879

> 2. Is the change backward compatible?

Yes

> 3. If it's a breaking change, what is the migration plan?

N/A - this is not a breaking change.

### Checklist
- [X] I have added tests for the new use cases (if any).
- [X] I have updated the documentation (if applicable).

### Release Notes
RELNOTES[NEW]: Adds the `strip_components` attribute to `extract`/`download_and_extract`/`http_archive` to allow stripping of path components when extracting files.

Closes bazelbuild#29281

PiperOrigin-RevId: 902961227
Change-Id: I3fda77ec42c3d052f6655e42c8b57ec27667c758
…ttp_arch… (bazelbuild#29367)

…… (bazelbuild#29281)

Add `strip_components` to `extract`/`download_and_extract` `http_archive`

### Description
The `strip_components` attribute functions similar to `tar --strip-components`:

> Strip NUMBER leading components from file names on extraction.

This is an alternative to the existing `strip_prefix` attribute, which required knowing the exact prefix to be stripped. Only one of the two attributes (`strip_prefix`, `strip_components`) can be set at one time.

### Motivation
See bazelbuild#28879

### Build API Changes
> 1. Has this been discussed in a design doc or issue? (Please link it)

See bazelbuild#28879

> 2. Is the change backward compatible?

Yes

> 3. If it's a breaking change, what is the migration plan?

N/A - this is not a breaking change.

### Checklist
- [X] I have added tests for the new use cases (if any).
- [X] I have updated the documentation (if applicable).

### Release Notes
RELNOTES[NEW]: Adds the `strip_components` attribute to `extract`/`download_and_extract`/`http_archive` to allow stripping of path components when extracting files.

Closes bazelbuild#29281.

PiperOrigin-RevId: 902961227
Change-Id: I3fda77ec42c3d052f6655e42c8b57ec27667c758
…ttp_arch… (bazelbuild#29369)

…… (bazelbuild#29281)

Add `strip_components` to `extract`/`download_and_extract` `http_archive`

### Description
The `strip_components` attribute functions similar to `tar --strip-components`:

> Strip NUMBER leading components from file names on extraction.

This is an alternative to the existing `strip_prefix` attribute, which required knowing the exact prefix to be stripped. Only one of the two attributes (`strip_prefix`, `strip_components`) can be set at one time.

### Motivation
See bazelbuild#28879

### Build API Changes
> 1. Has this been discussed in a design doc or issue? (Please link it)

See bazelbuild#28879

> 2. Is the change backward compatible?

Yes

> 3. If it's a breaking change, what is the migration plan?

N/A - this is not a breaking change.

### Checklist
- [X] I have added tests for the new use cases (if any).
- [X] I have updated the documentation (if applicable).

### Release Notes
RELNOTES[NEW]: Adds the `strip_components` attribute to `extract`/`download_and_extract`/`http_archive` to allow stripping of path components when extracting files.

Closes bazelbuild#29281.

PiperOrigin-RevId: 902961227
Change-Id: I3fda77ec42c3d052f6655e42c8b57ec27667c758
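Add `strip_components` to `extract`/`download_and_extract` `http_archive`

Description
The `strip_components` attribute functions similar to `tar --strip-components`:

> Strip NUMBER leading components from file names on extraction.

This is an alternative to the existing `strip_prefix` attribute, which required knowing the exact prefix to be stripped. Only one of the two attributes (`strip_prefix`, `strip_components`) can be set at one time.

Motivation
See #28879

Build API Changes
> 1. Has this been discussed in a design doc or issue? (Please link it)

See #28879

> 2. Is the change backward compatible?

Yes

> 3. If it's a breaking change, what is the migration plan?

N/A - this is not a breaking change.

Checklist
- [X] I have added tests for the new use cases (if any).
- [X] I have updated the documentation (if applicable).

Release Notes
RELNOTES[NEW]: Adds the `strip_components` attribute to `extract`/`download_and_extract`/`http_archive` to allow stripping of path components when extracting files.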