Skip to content

Make chmod hermetic#2024

Closed
mattyclarkson wants to merge 1 commit intobazel-contrib:mainfrom
mattyclarkson:chmod-touch-id
Closed

Make chmod hermetic#2024
mattyclarkson wants to merge 1 commit intobazel-contrib:mainfrom
mattyclarkson:chmod-touch-id

Conversation

@mattyclarkson
Copy link
Contributor

Removes the need for system binaries to make the Python interpreter library directories/files read-only.

Fixes #2016

@mattyclarkson mattyclarkson force-pushed the chmod-touch-id branch 2 times, most recently from 5830642 to 7a71c56 Compare July 1, 2024 15:21
@mattyclarkson mattyclarkson changed the title fix: make Python repositories chmod/id/touch` hermetic fix: make Python repositories chmod/id/touch hermetic Jul 1, 2024
if "windows" not in platform:
lib_dir = "lib" if "windows" not in platform else "Lib"

python = _get_host_python_interpreter(rctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the python_bin var that is defined below instead of using the host_python. And we should do the read-only fix only if the host platform the same as the platform of the toolchain.

Could we recursively change this for windows as well? Right now we are calling this only if it is not Windows, but do the scripts work on Windows?

Copy link
Contributor Author

@mattyclarkson mattyclarkson Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.chmod says:

Although Windows supports chmod(), you can only set the file’s read-only flag with it (via the stat.S_IWRITE and stat.S_IREAD constants or a corresponding integer value). All other bits are ignored.

So, yes, it would remove write access on Windows.

os.getuid is not available on Windows but I can find a fallback for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chmod.py does set the folders and files to read-only on Windows, but does not prevent new files being created in the directories. So the same behaviour as Unix is not present. We would have to look into Windows permissions to restrict that.

The getuid could return the Window Security ID for the user but that doesn't really help much. I've removed the touch check by checking platform.

We now run the scripts on Windows, but they don't really do much and actually fail to prevent .pyc files being written. Which is no different to before this patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can prevent writing files into directories on Windows with security permissions. That could be done by calling out to icacls.exe (in leiu of the win32security dependency). That will likely take significant extra testing as I'm not a Windows expert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aignas would you like me to continue figuring out truely read-only directories on Windows or does this PR move the needle enough to be considered for merge? I'd be happy to raise follow up issues/PRs, if necessary.

@mattyclarkson mattyclarkson force-pushed the chmod-touch-id branch 2 times, most recently from e9b91f5 to 126abef Compare July 2, 2024 09:27
@aignas
Copy link
Collaborator

aignas commented Jul 17, 2024

@rickeylev, what do you think about this? How do we move with this forward? Should we just start ignoring the pyc files in the repo and not bother with making it read-only?. The making of the external repos read-only is something that we don't do in the whl_library (maybe we should) and I am wondering what is the overall strategy here...

I am thinking that having the chmod, id and touch be hermetic might be ignoring the overall picture of what we are trying to do here.

@rickeylev
Copy link
Collaborator

what is the overall strategy here

So, correctness wise, there are two things we're constrained by:

  1. The input files (both content and the list of them) have to be stable.
  2. The downloaded python binary may not be runnable by the host that downloaded it, such as in a cross-build with RBE situation.

(1) can be accomplished by any of:

  • a. Making everything read only.
  • b. Generating all the necessary files at repo install time.
  • c. Excluding files that might be created at runtime

I think we should do (a) regardless -- its just defense in depth. For the case of Windows (or other failures) we should just be OK with leaving things writable. It's the best we can do, and users will just bypass it entirely with ignore_root_error=True anyways (they are given the choice between "guaranteed pedantic failure" and "probably works, but flaky"; they'll always choose the latter).

I think (b) is better than (c), but it's out of scope for this PR and we already have (c) in place.

(2) throws a wrench into what this PR is doing -- using the downloaded runtime itself to run commands to make things read only. So I'm not really keen on that overall. I don't mind having an alternative implementation of chmod for OP's case where coreutils aren't available. Ideally, we'd either have a repo-phase reference to a host-compatible Python, or a repo-phase reference to a host-compatible chmod tool.

So, more concretely, I ask for the following changes:

  • Use the system chmod if available, otherwise use the downloaded-python's chmod. If neither is available, oh well; continue on.
  • Remove the fail() code paths. Just chmod and trust that things are OK.

mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
@mattyclarkson mattyclarkson changed the title fix: make Python repositories chmod/id/touch hermetic Make chmod hermetic Aug 1, 2024
@mattyclarkson mattyclarkson marked this pull request as draft August 1, 2024 11:35
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazel-contrib#2024 (comment)
@mattyclarkson mattyclarkson marked this pull request as ready for review August 1, 2024 13:14
@mattyclarkson
Copy link
Contributor Author

I've updated to use uutils/coreutils and removed the fail paths.

logger = logger,
).return_code == 0

def _chmod(rctx, platform = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speculation, that needs to be measured: The way this is implemented it would fetch and extract the chmod binary for each python version. It would use the bazel repository cache, but having the chmod in a separate repository might be better from the performance point of view?

In general, using the chmod that is hermetic is an interesting idea, but I wonder if we should use the system chmod if it exists and if it doesn't, download the hermetic chmod. That would also satisfy the ask to support your usecase where system chmod is not available. All of this could be done in a separate host_chmod repository, we could even switch the behaviour to default to the hermetic chmod if an env var is set to a particular value.

@mattyclarkson
Copy link
Contributor Author

I'm keen to work through getting this landed. I can rework the branch to current main.

Could I get some hints on how to implement this so that it has the best chance to get accepted, please?

@aignas
Copy link
Collaborator

aignas commented Oct 24, 2025

I think there are multiple options here:

option 1 - use this chmod

In this case I think it would be best to create a separate repository rule maybe something called host_coreutils and then it will be very clear that we are trying to download the host-specific coreutils version.

We register this somewhere in internal_deps.bzl and then use it from the hermetic toolchain. It should be "used" internally as a dep in MODULE.bazel

option 2 - trust our glob and remove chmod

Remove the chmoding. This would solve the following:

  • Delete the bazel cache requires a sudo, because the dirs are read-only
  • Difference in what we do on Windows vs Linux today
  • Need to chmod would be no-longer there

Drawback:

  • Theoretical polution with pyc files if we don't use our python helpers to execute Python scripts in the repository context. Should we document that users should use -B?

option 3 - do nothing

Requires on system chmod, which is undesirable as you have suggested by creating the PR.

proposal

I would actually propose option 2, @rickeylev, @mattyclarkson, what do you think?

cc: @arrdem

@mattyclarkson
Copy link
Contributor Author

Option 2 is fine with me. @rickeylev?

@rickeylev
Copy link
Collaborator

Option 2 SGTM

@arrdem
Copy link
Contributor

arrdem commented Nov 11, 2025

Option 1 seems bad, option 3 I don't follow the downsides of. I'm leaning 2 by default but I don't feel like I have an informed opinion here.

@aignas
Copy link
Collaborator

aignas commented Nov 12, 2025

So I'd propose us to go with the option 2 moving forward.

github-merge-queue bot pushed a commit that referenced this pull request Nov 23, 2025
As discussed in the #2024 messaged we decided to remove the chmoding and
simplify the setup for all of our users. This in effect unifies how the
toolchains are created across all of the platforms reducing the need for
some
of the integration tests.

Summary:
- `python_repository`: stop chmoding
- `python_repository`: make `ignore_root_user_error` noop
- `python(bzlmod)`: stop using `ignore_root_user_error`.
- `tests`: remove `ignore_root_user_error` tests.

Fixes #2016
Fixes #2053
Closes #2024
@aignas aignas closed this in #3421 Nov 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hermetic chmod/touch/id in python_repository

4 participants