Consider a tree artifact containing a symlink to a directory, as e.g. would be produced by the following action:
d = ctx.actions.declare_directory(...)
ctx.actions.run_shell(
outputs = [d],
command = "cd $1 && mkdir dir && touch dir/file.txt && ln -s dir sym",
arguments = [d.path],
)
Currently, this will result in ActionOutputMetadataStore#constructTreeArtifactValueFromFilesystem collecting a TreeArtifactValue with two entries: dir/file.txt, mapping to a FileArtifactValue, and sym, mapping to a DirectoryArtifactValue.
In addition to DirectoryArtifactValue being unsound, the mismatch between the "artifact type" and "metadata type" is a source of bugs (#20415 is a recent example). I think it would be better for the symlink to be transparently followed (i.e., collect a TreeFileArtifact+FileArtifactValue for sym/file.txt).
One thing we need to be careful about is that the symlink could point outside of the tree artifact (which is explicitly permitted), and in this case, while we still want to collect the metadata, we should avoid calling chmod on the subtree. (Alternatively, we could disallow that form of symlink, but I suspect our hands might already be tied by existing use cases.)
Consider a tree artifact containing a symlink to a directory, as e.g. would be produced by the following action:
Currently, this will result in
ActionOutputMetadataStore#constructTreeArtifactValueFromFilesystemcollecting aTreeArtifactValuewith two entries:dir/file.txt, mapping to aFileArtifactValue, andsym, mapping to aDirectoryArtifactValue.In addition to
DirectoryArtifactValuebeing unsound, the mismatch between the "artifact type" and "metadata type" is a source of bugs (#20415 is a recent example). I think it would be better for the symlink to be transparently followed (i.e., collect aTreeFileArtifact+FileArtifactValueforsym/file.txt).One thing we need to be careful about is that the symlink could point outside of the tree artifact (which is explicitly permitted), and in this case, while we still want to collect the metadata, we should avoid calling
chmodon the subtree. (Alternatively, we could disallow that form of symlink, but I suspect our hands might already be tied by existing use cases.)