Skip to content

Filenames containing commas lead to malformed RECORD files in wheels, and consequent "more than three elements" warnings on install #2052

@gimbo

Description

@gimbo

Issue

Summary: the wheel format's RECORD file (see PEP 376) is in CSV format; when poetry build builds the wheel for some package having a file with a comma in its name, the filename is not quoted, meaning the wheel's RECORD file contains a line with too many fields; this leads to a warning when pip-installing the package (pip then seems to add a fixed line to the version of RECORD written upon installation).

(I get that this is minor in that the only negative outcome — so far at least — is a warning when installing the package, but since I noticed it, I thought I'd bring it up...)

Detail

Here is a minimal repo demonstrating the issue.

Note the second line of RECORD:

poetry_record_bug_demo/a,b.txt,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0

This should be a 3-field CSV line, but due to the unquoted filename, it looks like 4 fields.

Here's what happens when we try to pip install this wheel, using pip v19.0.3:

Processing ./poetry_record_bug_demo-0.1.0-py3-none-any.whl
Installing collected packages: poetry-record-bug-demo
  RECORD line has more than three elements: ['poetry_record_bug_demo/a', 'b.txt', 'sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU', '0']
Successfully installed poetry-record-bug-demo-0.1.0

Note the warning; it does seem to install OK, however. We end up, in our venv, with a RECORD file looking like this (note final line):

poetry_record_bug_demo-0.1.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
poetry_record_bug_demo-0.1.0.dist-info/METADATA,sha256=2VyGnjGKozAicEGXFxpHD-PGu0mCDbB852S4wC6hD8c,480
poetry_record_bug_demo-0.1.0.dist-info/RECORD,,
poetry_record_bug_demo-0.1.0.dist-info/WHEEL,sha256=VN2eEg4PHbTOniXb-cCYQPc6zUTS4dAcATjlgU3q-Lo,83
poetry_record_bug_demo/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
poetry_record_bug_demo/__pycache__/__init__.cpython-37.pyc,,
poetry_record_bug_demo/a,b.txt,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
"poetry_record_bug_demo/a,b.txt",,

Location of bug in poetry source

The problematic code is here (in both v1.0.3 and current master):

    def _write_record(self, wheel):
        # Write a record of the files in the wheel
        with self._write_to_zip(wheel, self.dist_info + "/RECORD") as f:
            for path, hash, size in self._records:
                f.write("{},sha256={},{}\n".format(path, hash, size))
            # RECORD itself is recorded with no hash or size
            f.write(self.dist_info + "/RECORD,,\n")

If path contains more than zero commas, the first f.write() call results in a line containing more than three commas.

Similar bug already solved in wheel

Issue 280 in the wheel package is about exactly this; there's some interesting discussion in that issue. In particular, this comment refers to the RECORD section of PEP 376, which says, among other things:

Each record is composed of three elements:

It was solved for wheel in this commit, which uses an appropriately configured csv.Writer to do the heavy lifting (and in particular to quote filenames where necessary).

I suppose a similar solution could be implemented for poetry. (Also, perhaps displaying my ignorance about python packaging machinery, but I was actually surprised poetry doesn't use wheel to write wheels... but no doubt I'm missing something.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/bugSomething isn't working as expected

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions