feat: perform a single hash validation of wheel files #8027
ralbertazzi wants to merge 1 commit into python-poetry:master
Conversation
I do not like the amount of pypa/installer that is being duplicated by Poetry already, and this isn't helping! Are you sure you wouldn't prefer to fix up your MR in that project (I see that the pipeline is currently failing) and wait for it to be released? Then just use pypa/installer instead of re-implementing it?
I have copied my comments from the other PR so they will not be forgotten.
I agree that we should prefer a solution in installer if possible. Another approach to avoid calculating the hashes twice might be to add a "use hashes from RECORD file" parameter to the install() function that can be set if validation was successful.
By the way, what's the performance penalty for calculating the hashes twice for a big wheel like torch?
computed_record = next(
    record for _, record in records if record.path == item.filename
)
This doesn't always work. I tested a large project and got a StopIteration for some packages.
You can reproduce it by running poetry add fonttools for example.
What's even worse: the package is installed, but the RECORD file is not written.
poetry run pip uninstall fonttools now throws "ERROR: Cannot uninstall fonttools 4.39.4, RECORD file not found. Hint: The package was installed by Poetry 1.6.0.dev0." and poetry run pip install fonttools gives "Requirement already satisfied".
You can only "manually" uninstall the package.
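The StopIteration comes from calling next() on a generator that finds no match. One way to fail loudly instead of crashing mid-install is to give next() a default and raise an explicit error. A minimal sketch with stand-in record tuples, not Poetry's actual data structures:

```python
# Hypothetical RECORD entries: (hash_name, hash_value, path).
# These names are illustrative, not Poetry's API.
records = [
    ("sha256", "abc123", "pkg/__init__.py"),
    ("sha256", "def456", "pkg/util.py"),
]

def find_record(records, filename):
    # next() with a default avoids StopIteration when nothing matches,
    # so we can raise a descriptive error instead
    match = next((r for r in records if r[2] == filename), None)
    if match is None:
        raise ValueError(f"no RECORD entry for {filename}")
    return match

print(find_record(records, "pkg/util.py"))
```

With a default in place, a file that is missing from RECORD surfaces as a clear ValueError rather than an unhandled StopIteration that leaves the install half-finished.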
self._validate_hash_and_size(records)
return super().finalize_installation(scheme, record_file_path, records)
-    self._validate_hash_and_size(records)
-    return super().finalize_installation(scheme, record_file_path, records)
+    try:
+        self._validate_hash_and_size(records)
+    finally:
+        return super().finalize_installation(scheme, record_file_path, records)
I wonder if we should do something like that to make sure the RECORD file is written despite bugs in _validate_hash_and_size(). However, when I tried this, the exception was swallowed, which is not good either. Thus, we would have to put more effort in here...
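The swallowed exception is standard Python behavior: a return statement inside a finally block discards any in-flight exception. A small demonstration of the problem and one way around it (hold the exception, do the cleanup, then re-raise):

```python
def swallowing():
    # return in finally discards the RuntimeError raised in the try body
    try:
        raise RuntimeError("validation failed")
    finally:
        return "record written"

def propagating():
    # capture the error, run the unconditional step, then re-raise
    error = None
    try:
        raise RuntimeError("validation failed")
    except RuntimeError as exc:
        error = exc
    result = "record written"  # runs regardless, like writing RECORD
    if error is not None:
        raise error
    return result

print(swallowing())  # prints "record written"; the exception is gone
try:
    propagating()
except RuntimeError as exc:
    print(exc)  # prints "validation failed"
```

So the try/finally suggestion above would indeed always write the RECORD file, but it would also hide validation failures; the capture-and-re-raise shape keeps both behaviors.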
It's almost double. Here's the benchmark while installing https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp39-cp39-linux_x86_64.whl (2.1 GB), performed on Poetry 1.5.1 and pypa/installer@5f281b8 (i.e. with the in-memory hash computation fix).
cc @Secrus as now maintainer of
I agree this is not the best place to fix it; it was a quick hack in case we wanted to keep the feature in 1.5.1. I'll open an issue on
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Re-introduce wheel content file validation in a more performant way, i.e. computing (and validating) the hash once while installing the wheel (and verifying the wheel at the end of the installation process).
Related to #7983 and #7987
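The "hash once" idea described above amounts to computing each file's digest while its bytes are being read for installation, instead of reading the file a second time just to validate. A hedged sketch of that pattern, using a hypothetical wrapper class rather than the actual PR code:

```python
import hashlib
import io

class HashingReader:
    """Wraps a binary stream and hashes/counts bytes as they are read,
    so a single pass can both install and validate. Illustrative only."""

    def __init__(self, stream, algorithm="sha256"):
        self._stream = stream
        self._hasher = hashlib.new(algorithm)
        self.size = 0

    def read(self, n=-1):
        data = self._stream.read(n)
        self._hasher.update(data)
        self.size += len(data)
        return data

    def hexdigest(self):
        return self._hasher.hexdigest()

payload = b"wheel file contents"
reader = HashingReader(io.BytesIO(payload))
copied = reader.read()  # one pass: the bytes are "installed" and hashed together
assert copied == payload
assert reader.hexdigest() == hashlib.sha256(payload).hexdigest()
```

After the copy, the accumulated digest and size can be checked against the RECORD entry, which is what makes the separate validation pass (and its second full read of a multi-gigabyte wheel like torch) unnecessary.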