persistent cache for link parsing and interpreter compatibility #12258

Draft

cosmicexplorer wants to merge 9 commits into pypa:main from cosmicexplorer:interpreter-compatibility-cache
Conversation
When performing `install --dry-run` and PEP 658 `.metadata` files are available to guide the resolve, do not download the associated wheels. Rather, use the distribution information directly from the `.metadata` files when reporting the results on the CLI and in the `--report` file.

- describe the new `--dry-run` behavior
- finalize linked requirements immediately after resolve
- introduce `is_concrete`
- funnel `InstalledDistribution` through `_get_prepared_distribution()` too
- add test for new `install --dry-run` functionality (no downloading)
- catch an exception when parsing metadata which only occurs in CI
- handle `--no-cache-dir`
- call `os.makedirs()` before writing to cache too
- catch `InvalidSchema` when attempting git urls with `BatchDownloader`
- fix other test failures
- reuse `should_cache(req)` logic
- gzip compress link metadata for a slight reduction in disk space
- only cache built sdists
- don't check `should_cache()` when fetching
- cache lazy wheel dists
- add news
- turn debug logs in fetching from cache into exceptions
- use `scandir` over `listdir` when searching the normal wheel cache
- handle metadata email parsing errors
- correctly handle mutable cached requirement
- use bz2 over gzip for an extremely slight improvement in disk usage
- handle new `google_paste` encoding breakage
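One bullet above swaps gzip for bz2 when compressing the cached link metadata. The trade-off can be seen with a quick self-contained sketch; the sample data below is made up for illustration and is not pip's actual cache schema:

```python
import bz2
import gzip
import json

# Hypothetical serialized link metadata; illustrative only, not pip's schema.
links = [
    {
        "url": f"https://files.example.org/pkg-{i}.0-py3-none-any.whl",
        "requires_python": ">=3.7",
    }
    for i in range(200)
]
raw = json.dumps(links).encode("utf-8")

gz = gzip.compress(raw)
bz = bz2.compress(raw)
print(f"raw={len(raw)} gzip={len(gz)} bz2={len(bz)}")

# Both compressors round-trip losslessly; the choice only affects disk usage.
assert gzip.decompress(gz) == raw
assert bz2.decompress(bz) == raw
```

On repetitive JSON like an index page's link list, bz2 often yields a slightly smaller file, consistent with the "extremely slight improvement" the commit message describes; either way the cached data round-trips unchanged.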
- pipe in headers arg
- provide full context in `Link.comes_from`
- pull in etag and date and cache the outputs
- handle `--no-cache-dir`
- add NEWS
- remove quotes from etag and use binary checksum to save a few bytes
- parse http modified date to compress the cached representation
- fix cache-control clobbering
- also compress the link parsing
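The etag/date bullets above shrink the stored HTTP validators before caching them. A hedged sketch of that idea, using a hypothetical helper name (pip's actual representation may differ): strip the ETag's quotes and, when it is a hex checksum, store its raw bytes; parse the RFC 2822 modified-date text into a compact numeric timestamp.

```python
import binascii
from email.utils import parsedate_to_datetime


def compact_validators(etag, date):
    """Illustrative sketch: shrink HTTP cache validators before storing them.

    Assumes (as the commits describe) that the quoted ETag is really a hex
    checksum, so stripping the quotes and decoding to raw bytes halves it,
    and that an RFC 2822 date header packs down to a numeric timestamp.
    """
    packed_etag = None
    if etag:
        stripped = etag.strip('"')
        try:
            packed_etag = binascii.unhexlify(stripped)  # hex text -> raw bytes
        except binascii.Error:
            packed_etag = stripped.encode("utf-8")      # fall back to the text
    timestamp = None
    if date:
        timestamp = parsedate_to_datetime(date).timestamp()
    return packed_etag, timestamp


etag, ts = compact_validators('"65f0d1c2"', "Wed, 21 Oct 2015 07:28:00 GMT")
print(etag, ts)
```

Here the 10-character quoted ETag becomes 4 raw bytes, and the 29-character date string becomes a float that serializes in far less space.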
Closes #12184.
This change is on top of #12257; see the +264/-116 diff against it at https://github.com/cosmicexplorer/pip/compare/link-parsing-cache...cosmicexplorer:pip:interpreter-compatibility-cache?expand=1.

## Background: Caching Link Parsing In-Memory

In #7729, we cached the result of parsing an index page for `Link`s in memory, to avoid redoing it more than once per run. Given that this result changes as packages upload new versions to PyPI, it didn't seem feasible to cache it persistently the way we can with #12256.

## Proposal: Cache Link Parsing, then Interpreter Compatibility

See #12184 for an investigation into caching strategies.

If (as per #12257) we recognize that we have a cached HTTP response (which means no new versions of a package have been uploaded since we last checked), we can then implement two more levels of caching:

1. Parse the response into `Link`s, and write that to a JSON file in the cache directory.
2. After filtering those `Link`s to find the ones that match the current interpreter/platform, we can write that result to a cache file as well (we have to ensure these are all invalidated when the HTTP response changes).

## Result: Some Performance Improvement
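The two cache levels just described can be sketched as follows. This is a minimal illustration under assumed names and a made-up on-disk layout, not pip's real code; the key point is that both levels share the response checksum in their keys, so a changed HTTP response invalidates them together.

```python
import hashlib
import json
import tempfile
from pathlib import Path

# Invented cache location for the demo; pip would use its own cache directory.
CACHE_DIR = Path(tempfile.mkdtemp())


def _entry(index_url, response_checksum, *extra):
    # Hashing the response checksum into every key means a changed HTTP
    # response misses both cache levels at once.
    digest = hashlib.sha256(
        "\x00".join((index_url, response_checksum) + extra).encode()
    ).hexdigest()
    return CACHE_DIR / f"{digest}.json"


def store(index_url, response_checksum, links, interpreter_tag=None):
    # Level 1 (no tag): all parsed links for the page, as JSON.
    # Level 2 (with tag): the subset compatible with one interpreter/platform.
    extra = (interpreter_tag,) if interpreter_tag else ()
    _entry(index_url, response_checksum, *extra).write_text(json.dumps(links))


def load(index_url, response_checksum, interpreter_tag=None):
    extra = (interpreter_tag,) if interpreter_tag else ()
    path = _entry(index_url, response_checksum, *extra)
    return json.loads(path.read_text()) if path.exists() else None


url = "https://pypi.org/simple/example/"
store(url, "abc123", ["example-1.0.tar.gz", "example-1.0-py3-none-any.whl"])
store(url, "abc123", ["example-1.0-py3-none-any.whl"], interpreter_tag="cp311")

hit = load(url, "abc123", interpreter_tag="cp311")
miss = load(url, "def456")  # new response checksum -> both levels miss
```

A fresh response checksum (`"def456"` above) misses every entry written under the old one, which is exactly the invalidation property the proposal requires.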
This all turns out to be possible, but with some caveats.

## Discussion
I've made this a draft because I'd like to gather opinions on whether this would be worth merging.