
versions: fast path for --bare --skip-aliases #3411

Merged
native-api merged 1 commit into pyenv:master from jakelodwick:pr/02-inline-versions
Mar 4, 2026

Conversation

@jakelodwick
Contributor

@jakelodwick jakelodwick commented Mar 1, 2026

pyenv-rehash calls pyenv-versions --bare --skip-aliases to list installed versions. This code path resolves every symlink in $versions_dir using realpath (a subshell with cd + readlink loop when the native extension isn't compiled) and sorts the result — neither of which rehash needs.

This adds a fast path at the top of pyenv-versions that activates when --bare and --skip-aliases are both set. It skips sort and the native extension probe, and replaces per-symlink realpath with a single readlink call: a relative target is an internal alias (skip), an absolute target is an external install (keep).
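
The relative-versus-absolute rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged diff: the function name `list_bare_skip_aliases` is hypothetical, it only handles top-level entries (not `envs/*`), and it assumes `readlink` without switches prints the link's stored target.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the fast-path rule: list entries in a versions
# directory, skipping symlinks whose stored target is relative.
list_bare_skip_aliases() {
  local versions_dir="$1" path target
  for path in "$versions_dir"/*; do
    [ -d "$path" ] || continue          # skips plain files, dangling links, unmatched glob
    if [ -L "$path" ]; then
      target="$(readlink "$path")"      # one call per symlink, no resolution loop
      case "$target" in
        /*) ;;                          # absolute target: external install, keep
        *)  continue ;;                 # relative target: internal alias, skip
      esac
    fi
    echo "${path##*/}"
  done
}
```

Called as `list_bare_skip_aliases "$PYENV_ROOT/versions"`, it prints one name per line in glob order, without a separate sort step.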

pyenv-rehash is unchanged.

Benchmark (eval "$(pyenv init - zsh)", Apple M2, 3 versions + 12 virtualenvs + 14 symlinks, native ext not compiled, 20 iterations):

  • Before: 176.8 ms median
  • After: 123.1 ms median (−53.7 ms, 30%)

All 233 tests pass.

@jakelodwick jakelodwick requested review from a team as code owners March 1, 2026 23:36
Member

@native-api native-api left a comment

If we are to inline pyenv-versions, it probably has to be a shared function, to ensure that the logic will stay the same going forward.

Since the pyenv launcher is not actually involved as per below, you probably rather want to streamline pyenv-versions logic for the specific call during rehash.

E.g.

  • we can skip calling sort, and
  • --skip-aliases may also be unneeded depending on what takes longer -- filtering them out or traversing those directories twice during rehash
    • note that there may be dozens of virtualenv directories with 1-2 dozen of executables in each (e.g. popular package suites and their dependencies)
    • alternatively, we may simplify link checking logic for this case to readlink (without switches) and checking that it's a relative link matching a pattern
  • we may skip the command line parsing logic if the command line is a specific value

If you wish to optimize rehash, it may be most productive to set PS4 in the launcher to something featuring a timestamp -- to see what actually takes the most time in realistic scenarios.
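
A minimal sketch of timestamped xtrace, for illustration only. The helper name `trace_with_timestamps` is hypothetical, `$EPOCHREALTIME` needs Bash 5 or later (the sketch falls back to the whole-second `$SECONDS` counter otherwise), and it only traces commands run in the current shell, such as functions, not separately launched scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command with xtrace enabled and a timestamp
# in front of every traced line, writing the trace to a file.
trace_with_timestamps() {
  local trace_file="$1"; shift
  (
    # Single quotes delay expansion until each traced command is printed.
    PS4='+ ${EPOCHREALTIME:-$SECONDS} ${LINENO} '
    exec 2>"$trace_file"    # xtrace goes to stderr; the command's own stderr lands here too
    set -x
    "$@"
  )
}
```

Subtracting consecutive timestamps in the trace file gives a rough per-command cost, with the caveat that xtrace itself adds overhead.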

Comment thread libexec/pyenv-rehash
Comment thread libexec/pyenv-rehash Outdated
@jakelodwick
Contributor Author

Thanks for the detailed review. Both inline corrections are right — let me address them and share some profiling data.

Dispatcher claim: You're correct that pyenv-versions (with dash) is called directly, not through the bin/pyenv launcher. The overhead is one subprocess, not two. I'll fix the PR description.

--skip-aliases semantics: Also correct — my code skips all symlinks, but --skip-aliases only skips those whose realpath resolves within $versions_dir or $versions_dir/*/envs/*. A symlink pointing to an external path (e.g. a system build linked into versions/) should be kept.

Profiling results

I followed your PS4 suggestion and profiled pyenv-rehash with timestamp-annotated xtrace, plus instrumented phase timing without xtrace overhead. Three fixtures: empty (0 versions), standard (3 versions, 12 virtualenvs, 320 executables), and large (6 versions, 60 virtualenvs, 1466 executables).

Phase breakdown (standard fixture, median of 3 runs, no xtrace):

Phase                      Time     %
──────────────────────────────────────
pyenv-versions + echo loop   77 ms   44%
make_shims (register)        14 ms    8%
install_registered_shims     63 ms   36%
hooks (pyenv-hooks)           5 ms    3%
remove_stale_shims            8 ms    5%
prototype + setup            10 ms    6%
──────────────────────────────────────
Total                       177 ms

Large fixture scales to 640 ms total, with make_shims jumping to 150 ms (bash 3.2 string concatenation is O(n²) on the registered_shims string at 1466 names).
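
One way to avoid growing a single long string is sketched below. It is an assumption-laden illustration, not the code in pyenv-rehash: the names `register_shim` and `unique_shims` are used hypothetically here, and whether this is actually faster on bash 3.2 has not been measured in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: accumulate shim names in an indexed array and
# de-duplicate once at the end, rather than appending to one long string.
registered_shims=()

register_shim() {
  registered_shims+=("$1")    # array append with +=, available since Bash 3.1
}

unique_shims() {
  # Nothing registered: print nothing (also avoids expanding an empty array).
  [ "${#registered_shims[@]}" -gt 0 ] || return 0
  # A single sort -u pass replaces per-name duplicate checks.
  printf '%s\n' "${registered_shims[@]}" | LC_ALL=C sort -u
}
```

The trade-off is one extra `sort` process per rehash in exchange for constant-size appends.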

Inside pyenv-versions, the costs are:

  • realpath calls: 15 calls × ~0.3 ms each (the shell resolve_link fallback, since native ext isn't compiled)
  • sort --version-sort: a few ms (irrelevant for rehash)
  • Subprocess overhead: ~20 ms even with 0 versions

Revised approach

I'd like to replace the current diff with an inline version that:

  1. Uses readlink (single call, no loop) instead of realpath to check aliases — relative target = alias/virtualenv link (skip), absolute target = external (keep). This matches what --skip-aliases does in practice since pyenv-virtualenv creates relative symlinks.
  2. Enumerates envs/*/bin/* alongside bin/* — the current code does this because pyenv-versions --bare lists virtualenv envs.

This avoids both the shared-function complexity and the full realpath resolution cost. The rehash call doesn't need sort, display formatting, or argument parsing — just the directory walk with correct filtering.

Measured savings with correct semantics:

               Standard    Large
Current          175 ms    640 ms
Inline           138 ms    343 ms
Saved             37 ms    297 ms

Does this direction look reasonable, or would you prefer I focus elsewhere first?

@jakelodwick jakelodwick force-pushed the pr/02-inline-versions branch from 1abc95e to fbb2c03 Compare March 2, 2026 07:44
@jakelodwick
Contributor Author

Revised to streamline pyenv-versions rather than inlining in pyenv-rehash, following your suggestions.

Before: 176.8 ms median
After:  123.1 ms median  (−53.7 ms, 30%)

(Apple M2, zsh, 3 versions + 12 virtualenvs + 14 symlinks, native ext not compiled, 20 iterations)

The diff adds a fast path in pyenv-versions when --bare --skip-aliases are both set: skip sort, replace realpath with readlink (relative = alias → skip, absolute = external → keep), skip native ext probe. pyenv-rehash is unchanged. All 233 tests pass.

@jakelodwick jakelodwick changed the title from "rehash: Replace pyenv-versions subprocess with inline glob" to "versions: fast path for --bare --skip-aliases" Mar 2, 2026
@native-api
Member

Good work! I'm impressed!
The data presented is valuable, but it is partial and does not quite support the suggestion (details follow).
It is nevertheless a foundation!


So, for starters, could you share the data for all 3 fixtures and ideally, some code (at least, the key parts) that got you these results (e.g. as an archive attachment to a message)?

That way, not only can we relatively quickly compare it to other approaches, we could also see how the time distribution compares to Linux with Bash 5.

Besides the current issue, that code will be of great help to tackle our other performance-related issues.
I did some research of my own when dealing with another such issue and did not readily find any existing Bash profiling tools or even polished techniques, so anything helps!


I really don't like the idea of having 2 distant pieces of code that implement the same business logic:

  • that's an anti-pattern ("Intersperser"), they are bound to desynchronize at some point and cause subtle bugs
  • this way, any optimizations we do will not carry over to other uses of pyenv-versions (several other subcommands call it internally)

So if we are going that way, we want to be really sure there is no comparable alternative without that big drawback.


  • by your own admission, in the longest phase (pyenv-versions + echo loop), the pyenv-versions call appears to take ~25 ms out of 77 ms (it's actually unclear how much it takes; that's one way to interpret the figures given). So comparable time could potentially be saved by eliminating the "echo loop" instead, e.g. by reading everything into an array at once, or maybe even one by one (because that would still eliminate extra I/O via pipes).
    • you did not give the figures for other fixtures -- especially the "large" one where you claim by far the largest gain
  • other figures given also suggest other comparable if not greater hot spots, e.g.:
    • repeated string concatenation -- can e.g. be replaced with an array for the Bash 3 path
    • realpath can be replaced with pwd -P
      • in the use case for rehash, we can indeed get away with a "quick and dirty" way without it
      • maybe we can get away even without filtering versions at all: all we need is a deduped list of executable names from everywhere. Globbing and piping within C code may be faster than looping and resolving links in Bash code.
      • even without all this, as per your data, realpath doesn't seem to be a hotspot in the first place
  • even if we go with inlining, now that we have $_PYENV_INSTALL_PREFIX, we can put shared code into $_PYENV_INSTALL_PREFIX/lib like RVM does.
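
The "deduped list of executable names from everywhere" idea from the list above might look roughly like this. It is a sketch under assumptions, not a proposal for the actual rehash code: the function name `list_executable_names` is borrowed from the profiling output, the layout `*/bin/*` and `*/envs/*/bin/*` is taken from this discussion, and no executable-bit check is done.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: collect executable names from every version and
# virtualenv directory without resolving or filtering any symlinks, then
# de-duplicate in a single sort -u pass.
list_executable_names() (
  # The body runs in a subshell so nullglob does not leak to the caller.
  shopt -s nullglob
  versions_dir="$1"
  for file in "$versions_dir"/*/bin/* "$versions_dir"/*/envs/*/bin/*; do
    echo "${file##*/}"
  done | LC_ALL=C sort -u
)
```

Aliased directories get traversed more than once here, which is exactly the trade-off raised earlier in the review: filtering links versus walking some directories twice.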

@native-api
Member

(My last message doesn't take yours into account. I see that you've already addressed some of the concerns expressed there.)

@jakelodwick
Contributor Author

Thanks — your suggestion to move the fast path inside pyenv-versions rather than inlining into rehash made the v2 approach much cleaner.

Here's the full data across all 3 fixtures.

Wall-clock timing (eval "$(pyenv init - zsh)", Apple M2, native ext not compiled, median of 10):

Fixture                                Upstream   With fast path   Saved
─────────────────────────────────────────────────────────────────────────────────
Empty (0 versions)                      71.9 ms    66.7 ms          5.2 ms (7%)
Standard (3 ver, 12 venvs, 320 bins)   168.2 ms   112.3 ms         55.9 ms (33%)
Large (6 ver, 60 venvs, 1466 bins)     632.7 ms   539.0 ms         93.7 ms (15%)

Phase breakdown (from timestamp-annotated xtrace, percentages applied to wall-clock totals):

Empty — 71.9 ms:

list_executable_names    ~26 ms  (36%)   — pyenv-versions subprocess + echo loop
hooks (pyenv-hooks)       ~5 ms   (7%)
init overhead            ~41 ms  (57%)   — outside rehash

Standard — 168.2 ms:

list_executable_names    ~74 ms  (44%)   — pyenv-versions (14 realpath calls) + echo loop
install_registered_shims ~61 ms  (37%)   — cp per shim file
make_shims                ~8 ms   (5%)
hooks                     ~5 ms   (3%)
init overhead            ~20 ms  (12%)

Large — 632.7 ms:

list_executable_names   ~134 ms  (21%)   — pyenv-versions (30 realpath calls) + echo loop
make_shims              ~414 ms  (65%)   — O(n²) string concat with 1466 names (bash 3.2)
install_registered_shims ~63 ms  (10%)
everything else          ~22 ms   (3%)

The large fixture is interesting — make_shims completely dominates at that scale. The echo loop elimination and array replacement you mentioned would hit that directly. Good follow-up targets.
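
Reading a command's output into an array in one step, rather than through a pipe-fed loop, could be sketched as below. The helper name `load_versions` and the global `versions` array are hypothetical, and the approach is written to work on bash 3.2, which lacks `mapfile`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: capture a command's output lines into the global
# array "versions" with a single command substitution.
load_versions() {
  local IFS=$'\n'         # split on newlines only, so names with spaces survive
  set -f                  # disable globbing so * or ? in a name is not expanded
  versions=($("$@"))
  set +f                  # note: re-enables globbing even if the caller had it off
}
```

Usage would be along the lines of `load_versions pyenv-versions --bare --skip-aliases`, followed by `for v in "${versions[@]}"`.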

Profiling tools

pyenv-profiling-tools.tar.gz — three self-contained scripts:

  • bench.sh — wall-clock timing of eval "$(pyenv init - <shell>)" with comparison mode and fork counting (zsh)
  • profile-rehash.sh — timestamp-annotated xtrace with phase breakdown and external command summary (zsh)
  • make-fixture.sh — creates realistic versions/ layouts, standard and --large (bash)

They expect a pyenv checkout path as the first argument. bench.sh uses $EPOCHREALTIME — on Bash 5 that's native, so the python3 fallback for macOS bash 3.2 won't be needed. Hope these are useful beyond this PR.
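
For readers without the archive, a median-of-N wall-clock timer in the same spirit might look like this. It is not the contents of bench.sh; the function names `now_us` and `median_us` are hypothetical, and the sketch assumes `$EPOCHREALTIME` always carries six fractional digits, with a `python3` fallback on older shells.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a median-of-N timer. Values are in microseconds.
now_us() {
  if [ -n "${EPOCHREALTIME:-}" ]; then
    # Bash 5+: drop the decimal separator (. or , depending on locale).
    echo "${EPOCHREALTIME/[.,]/}"
  else
    python3 -c 'import time; print(int(time.time() * 1000000))'
  fi
}

median_us() {
  # Usage: median_us RUNS command [args...]
  local runs="$1" i start end; shift
  for ((i = 0; i < runs; i++)); do
    start="$(now_us)"
    "$@" >/dev/null 2>&1
    end="$(now_us)"
    echo $((end - start))
  done | sort -n | awk '
    { v[NR] = $1 }
    END { print (NR % 2) ? v[(NR + 1) / 2] : int((v[NR / 2] + v[NR / 2 + 1]) / 2) }'
}
```

Each `now_us` call forks a subshell, so the reported numbers include a small constant overhead; that matters little when comparing two runs measured the same way.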

Comment thread libexec/pyenv-versions
Skip sort, native extension probe, and per-symlink realpath
when called with --bare --skip-aliases (the rehash case).
Use readlink to distinguish internal aliases (relative target)
from external installs (absolute target).
@jakelodwick jakelodwick force-pushed the pr/02-inline-versions branch from fbb2c03 to 27e216f Compare March 4, 2026 05:56
@native-api
Member

native-api commented Mar 4, 2026

Thank you very much!
I referenced this PR in a number of performance-related issues. This should help us a lot moving forward!

@jakelodwick
Contributor Author

Glad it's useful beyond this PR. Looking forward to the next one.
