versions: fast path for --bare --skip-aliases#3411
Conversation
There was a problem hiding this comment.
If we are to inline pyenv-versions, it probably has to be a shared function, to ensure that the logic will stay the same going forward.
Since the pyenv launcher is not actually involved as per below, you probably rather want to streamline pyenv-versions logic for the specific call during rehash.
E.g.
- we can skip calling
sort, and --skip-aliasesmay also be unneeded depending on what takes longer -- filtering them out or traversing those directories twice during rehash- note that there may be dozens of virtualenv directories with 1-2 dozen of executables in each (e.g. popular package suites and their dependencies)
- alternatively, we may simplify link checking logic for this case to
readlink(without switches) and checking that it's a relative link matching a pattern
- we may skip the command line parsing logic if the command line is a specific value
If you wish to optimize rehash, it may be most productive to set PS4 in the launcher to something featuring a timestamp -- to see what actually takes the most time in realistic scenarios.
|
Thanks for the detailed review. Both inline corrections are right — let me address them and share some profiling data. Dispatcher claim: You're correct that
Profiling results I followed your PS4 suggestion and profiled Phase breakdown (standard fixture, median of 3 runs, no xtrace): Large fixture scales to 640 ms total, with Inside
Revised approach I'd like to replace the current diff with an inline version that:
This avoids both the shared-function complexity and the full Measured savings with correct semantics: Does this direction look reasonable, or would you prefer I focus elsewhere first? |
1abc95e to
fbb2c03
Compare
|
Revised to streamline (Apple M2, zsh, 3 versions + 12 virtualenvs + 14 symlinks, native ext not compiled, 20 iterations) The diff adds a fast path in |
|
Good work! I'm impressed! So, for starters, could you share the data for all 3 fixtures and ideally, some code (at least, the key parts) that got you these results (e.g. as an archive attachment to a message)? That way, not only can we relatively quickly compare it to other approaches, we could also see how the time distribution compares to Linux with Bash 5. Besides the current issue, that code will be of great help to tackle our other performance-related issues. I really don't like the idea of having 2 distant pieces of code that implement the same business logic:
So if we are going that way, we want to be really sure there is no comparable alternative without that big drawback.
|
|
(My last message doesn't take yours into account. I see that you've already addressed some of the concerns expressed there.) |
|
Thanks — your suggestion to move the fast path inside Here's the full data across all 3 fixtures. Wall-clock timing (
Phase breakdown (from timestamp-annotated xtrace, percentages applied to wall-clock totals): Empty — 71.9 ms: Standard — 168.2 ms: Large — 632.7 ms: The large fixture is interesting — Profiling tools pyenv-profiling-tools.tar.gz — three self-contained scripts:
They expect a pyenv checkout path as the first argument. |
Skip sort, native extension probe, and per-symlink realpath when called with --bare --skip-aliases (the rehash case). Use readlink to distinguish internal aliases (relative target) from external installs (absolute target).
fbb2c03 to
27e216f
Compare
|
Thank you very much! |
|
Glad it's useful beyond this PR. Looking forward to the next one. |
pyenv-rehashcallspyenv-versions --bare --skip-aliasesto list installed versions. This code path resolves every symlink in$versions_dirusingrealpath(a subshell withcd+readlinkloop when the native extension isn't compiled) and sorts the result — neither of which rehash needs.This adds a fast path at the top of
pyenv-versionsthat activates when--bareand--skip-aliasesare both set. It skips sort, the native extension probe, and replaces per-symlinkrealpathwith a singlereadlinkcall — relative target = internal alias (skip), absolute target = external install (keep).pyenv-rehashis unchanged.Benchmark (
eval "$(pyenv init - zsh)", Apple M2, 3 versions + 12 virtualenvs + 14 symlinks, native ext not compiled, 20 iterations):All 233 tests pass.