Duplicate file paths not accepted by pkg_tar #849

@eejayes

Description

When tar archives that contain duplicate file paths are passed in the deps of pkg_tar, it keeps only the first occurrence of each path and warns with "Duplicate file in archive picking first occurrence". This contrasts with the common GNU tar utility, where "tar allows you to have infinite number of files with the same name" according to the documentation for --append under The Five Advanced tar Operations, and where unpacking the archive yields the last occurrence of each file path.
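To illustrate the last-occurrence convention outside of GNU tar, here is a minimal sketch using Python's standard tarfile module, whose getmember() is documented to return the last occurrence when a name appears more than once. The function names and the path "etc/app.conf" are made up for the example; this is not pkg_tar's code.

```python
import io
import tarfile


def build_archive_with_duplicate(path_in_archive, first, second):
    """Return the bytes of a tar archive holding two entries with the same path."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for payload in (first, second):
            info = tarfile.TarInfo(name=path_in_archive)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_effective_content(archive_bytes, path_in_archive):
    """Read the content a consumer sees for a path that may be duplicated."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        # getmember() resolves a duplicated name to its last occurrence.
        member = tar.getmember(path_in_archive)
        return tar.extractfile(member).read()


archive = build_archive_with_duplicate("etc/app.conf", b"old\n", b"new\n")
print(read_effective_content(archive, "etc/app.conf"))
```

Both entries remain in the archive; only the lookup (and, with GNU tar, the extraction order) makes the later one win.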

My use case involves augmenting tar archives through a series of build steps, one of which overwrites files in an existing archive. Unpacking the archives one over the other and then repacking the result is not an option, because that would create new entries for the parent directories of the files in the incoming archives, and those entries would record the directory permissions of the environment where the augmentation took place. The state of the directories on the extraction target must be preserved instead. So when an archive with duplicate files is passed to pkg_tar and the most recently added file is meant to overwrite the earlier one, pkg_tar retains the original (first) file, and unpacking the result does not achieve the overwrite.
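As a rough sketch of the behavior being asked for, the following copies an archive entry by entry and keeps only the last occurrence of each path, without unpacking anything to disk, so no parent directory entries are created and existing entry metadata is carried over. It uses Python's standard tarfile module; keep_last_occurrence is a hypothetical helper, not part of pkg_tar or rules_pkg.

```python
import io
import tarfile


def keep_last_occurrence(src_bytes):
    """Copy a tar archive, keeping only the last entry for each path."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src_bytes), mode="r") as src:
        members = src.getmembers()
        # Map each path to the index of its final entry; later entries
        # overwrite earlier ones in the dict.
        last_index = {m.name: i for i, m in enumerate(members)}
        with tarfile.open(fileobj=out, mode="w") as dst:
            for i, member in enumerate(members):
                if last_index[member.name] != i:
                    continue  # superseded by a later entry with the same path
                # Only regular files carry data; other entry types are
                # copied as headers only.
                data = src.extractfile(member) if member.isreg() else None
                dst.addfile(member, data)
    return out.getvalue()
```

The surviving entry stays at the position of its last occurrence, so the relative order of the remaining entries matches what extraction of the original archive would have produced.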

I feel that pkg_tar's behavior in this case should be compatible with GNU tar, both to support interoperability and to build on the common understanding that GNU tar's behavior has probably established.
