Duplicate file paths not accepted by pkg_tar #849
Description
When tar archives that contain duplicate file paths are passed to the deps attribute of pkg_tar, the rule drops all but the first occurrence of each path and warns with "Duplicate file in archive picking first occurrence". This contrasts with the common GNU tar utility, where "tar allows you to have infinite number of files with the same name" according to the documentation for --append under The Five Advanced tar Operations, and where unpacking the archive yields the last occurrence of each file path.
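The difference between the two policies can be sketched with Python's standard tarfile module. This is only an illustration, not pkg_tar's implementation: the helper names build_archive and resolve and the path etc/app.conf are hypothetical, and tarfile stands in here for GNU tar.

```python
import io
import tarfile


def build_archive(entries):
    """Return tar bytes containing (path, payload) entries in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, payload in entries:
            info = tarfile.TarInfo(name=path)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def resolve(archive_bytes, keep="last"):
    """Map each file path to the payload of its first or last occurrence."""
    resolved = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            if keep == "first" and member.name in resolved:
                continue  # first occurrence wins, as reported for pkg_tar
            # Otherwise later entries overwrite earlier ones, as on extraction.
            resolved[member.name] = tar.extractfile(member).read()
    return resolved


# Hypothetical archive holding the same path twice.
data = build_archive([
    ("etc/app.conf", b"original\n"),
    ("etc/app.conf", b"override\n"),
])
last_wins = resolve(data, keep="last")
first_wins = resolve(data, keep="first")
print(last_wins, first_wins)
```

With keep="last" the override survives, which matches what unpacking produces; with keep="first" the original survives, which is the behavior described above.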
My use case involves augmenting tar archives through a series of build steps, one of which overwrites files in an existing archive. Unpacking the archives one over the other and then repacking the result is not an option, because repacking creates new entries for the parent directories of the files in the incoming archives, and those entries carry permissions. The directories on the extraction target must keep their existing state rather than take on the state of the environment where the augmentation took place. So when an archive with duplicate files is passed to pkg_tar and the most recently added file is meant to overwrite the earlier one, it is actually the original, or first, file that is retained, and the overwrite does not happen when the result is unpacked.
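As a rough sketch of this kind of augmentation, the following appends an overriding entry to an existing uncompressed archive without unpacking it, so no parent-directory entries are introduced. It uses Python's tarfile module in append mode as a stand-in for tar --append; append_override, demo, and the file names are hypothetical.

```python
import io
import os
import tarfile
import tempfile


def append_override(archive_path, member_path, payload):
    """Append a regular-file entry to an uncompressed tar without unpacking it."""
    with tarfile.open(archive_path, mode="a") as tar:
        info = tarfile.TarInfo(name=member_path)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))


def demo():
    """Create a one-file archive, append an override, and inspect the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "base.tar")
        original = b"original\n"
        with tarfile.open(path, mode="w") as tar:
            info = tarfile.TarInfo(name="etc/app.conf")
            info.size = len(original)
            tar.addfile(info, io.BytesIO(original))

        append_override(path, "etc/app.conf", b"override\n")

        with tarfile.open(path) as tar:
            # (name, is_directory) for every entry, in archive order.
            kinds = [(m.name, m.isdir()) for m in tar.getmembers()]
            # getmember() returns the last occurrence of a duplicated name.
            latest = tar.extractfile(tar.getmember("etc/app.conf")).read()
    return kinds, latest


kinds, latest = demo()
print(kinds, latest)
```

The archive ends up with two file entries for the same path and no directory entry, so extracting it leaves the permissions of etc on the target untouched.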
I feel that pkg_tar's behavior in this case should be compatible with GNU tar, both to support interoperability and to build on the common understanding that GNU tar has probably established.