Skip to content

fix: terminate worker pool on short-circuit returns from pnpm/main#11571

Merged
zkochan merged 2 commits into
mainfrom
fix-hang
May 11, 2026
Merged

fix: terminate worker pool on short-circuit returns from pnpm/main#11571
zkochan merged 2 commits into
mainfrom
fix-hang

Conversation

@zkochan
Copy link
Copy Markdown
Member

@zkochan zkochan commented May 11, 2026

Summary

pnpm could hang for the lifetime of the worker pool after a command logically finished whenever the code path returned through main without process.exit(...).

main.ts only ran finishWorkers() inside the command-handler finally block. Short-circuit returns that came earlier — --version, --help, fatal config errors that surface before a command runs — bypassed it and left the integrity-resolution worker pool active.

The CLI entry (pnpm/src/pnpm.ts) now runs finishWorkers() in its own finally, so every exit path tears down the pool. Cleanup rejections are swallowed so the finally never masks the real command result.

Reproducer

echo '{"devEngines":{"packageManager":{"name":"pnpm","version":"11.0.9","onFail":"download"}}}' > package.json
time pnpm --version
# before (Linux): prints 11.0.9, then hangs for the worker-pool lifetime
# after:          prints 11.0.9, exits in ~1-3 s

switchCliVersion resolves the integrity (spawning workers), finds nothing to swap (range/version already matches running pnpm), and returns through main's --version short-circuit. The unterminated workers used to keep the event loop alive.

Surfaced in pnpm/action-setup#254 — bumping the action's bootstrap to 11.0.9 made every CI job using devEngines.packageManager with onFail: "download" hang.

Test plan

  • New regression test in pnpm/test/packageManagerCheck.test.ts (pnpm --version exits promptly when devEngines.packageManager matches the running pnpm). With finishWorkers() removed from pnpm.ts the test times out and fails on Linux; with the fix it passes in ~6 s.
  • Existing pnpm/test/switchingVersions + packageManagerCheck suites still pass (27 + 28 tests).
  • Reproducer above exits in ~1-3 s in a node:24 linux/amd64 container.

Summary by CodeRabbit

  • Bug Fixes
    • Fixed a CLI hang so commands like --version and --help exit promptly by ensuring worker cleanup runs on all exit paths.
  • Tests
    • Added a regression test to verify pnpm --version exits quickly when workspace settings match the running pnpm version.
  • Chores
    • Added a release changeset documenting the fix.

Review Change Stack

Review Change Stack

Copilot AI review requested due to automatic review settings May 11, 2026 00:31
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c14eb22a-19ac-4b51-aa0e-a5bc176b39d6

📥 Commits

Reviewing files that changed from the base of the PR and between ed81b5a and c4310d6.

📒 Files selected for processing (1)
  • pnpm/test/packageManagerCheck.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • pnpm/test/packageManagerCheck.test.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Agent
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Compile & Lint

📝 Walkthrough

Walkthrough

The PR ensures runPnpm() always awaits finishWorkers() in a finally block so worker pools are terminated on all exit paths. Adds a regression test for pnpm --version and a changeset describing the fix.

Changes

Worker Pool Cleanup on All Exit Paths

Layer / File(s) Summary
CLI Entry Point Worker Cleanup
pnpm/src/pnpm.ts
runPnpm() dynamically imports finishWorkers, runs main(argv) with error handling, and guarantees await finishWorkers() in a finally block (cleanup errors ignored).
Regression Test
pnpm/test/packageManagerCheck.test.ts
New Jest test captures the current version, calls prepareEmpty(), sets devEngines.packageManager to that version, then runs pnpm --version with a 30s timeout and asserts exit code 0 and exact stdout.
Release Notes
.changeset/version-exit-finishes-workers.md
Changeset for a patch release describing the CLI hang fix and a repro scenario when early exits previously bypassed cleanup.
sequenceDiagram
  participant runPnpm
  participant main as main(argv)
  participant errorHandler
  participant finishWorkers as finishWorkers
  runPnpm->>main: call main(argv)
  alt main throws
    main-->>runPnpm: throw
    runPnpm->>errorHandler: call errorHandler(err)
  else main returns
    main-->>runPnpm: return
  end
  runPnpm->>finishWorkers: finally -> await finishWorkers()
  finishWorkers-->>runPnpm: resolved
Loading

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 I hopped to close the loop so neat,
workers finished, no more heartbeat,
version prints and then we’re done,
a tidy run — hop! — fix well spun. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically describes the main change: fixing worker pool termination on short-circuit returns from pnpm/main, which is the core issue addressed across all modified files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-hang

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes pnpm processes hanging after completing work by scoping undici’s optimized global dispatcher and the worker pool lifecycle to the CLI entrypoint, ensuring they’re always torn down even on short-circuit returns (e.g. pnpm --version, --help, config errors).

Changes:

  • Remove @pnpm/network.fetch’s module-load side effect that installed a global undici dispatcher; replace with explicit installGlobalDispatcher() / closeGlobalDispatcher() exports.
  • Call installGlobalDispatcher() at CLI startup and run finishWorkers() + closeGlobalDispatcher() in the CLI entry’s finally block.
  • Wire @pnpm/network.fetch into pnpm’s workspace refs/deps and add a changeset documenting the fix.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pnpm/tsconfig.json Adds TS project reference so the CLI can consume @pnpm/network.fetch.
pnpm/src/pnpm.ts Installs dispatcher at startup and tears down dispatcher + workers in finally.
pnpm/package.json Adds @pnpm/network.fetch dependency to the pnpm package.
pnpm-lock.yaml Records the new workspace dependency link for @pnpm/network.fetch.
network/fetch/src/index.ts Re-exports new dispatcher lifecycle functions from the package entry.
network/fetch/src/dispatcher.ts Introduces explicit install/close APIs and removes global-dispatcher side effect.
.changeset/dispatcher-lifecycle.md Publishes patch release notes for pnpm + @pnpm/network.fetch.
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pnpm/src/pnpm.ts Outdated
Comment thread network/fetch/src/dispatcher.ts Outdated
Comment thread network/fetch/src/dispatcher.ts Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@network/fetch/src/dispatcher.ts`:
- Around line 65-75: The dispose hook for DISPATCHER_CACHE must await dispatcher
shutdown instead of fire-and-forgetting it; update the hook implementation so it
is async and returns/awaits the Promise from installed.close() (which calls
Agent.close()/ProxyAgent.close()) rather than using "void installed.close()";
ensure the hook awaits close() before removing entries from DISPATCHER_CACHE so
closeGlobalDispatcher() (and its call to DISPATCHER_CACHE.clear()) does not
resolve while keep-alive sockets are still open—refer to the DISPATCHER_CACHE
dispose hook, installed.close(), closeGlobalDispatcher,
installedGlobalDispatcher, previousGlobalDispatcher and setGlobalDispatcher when
making the change.

In `@pnpm/src/pnpm.ts`:
- Around line 31-33: The finally block currently awaits finishWorkers() and
closeGlobalDispatcher() directly which can overwrite the original error from
main(); change the flow to capture the primary outcome (store any error thrown
by main() in a variable, e.g. mainError) and in finally run finishWorkers() and
closeGlobalDispatcher() inside their own try/catch so their rejections are
logged/handled but do not replace mainError; after teardown, rethrow mainError
if present, otherwise if teardown produced an error throw that one—ensure you
reference the existing main()/errorHandler() flow and the functions
finishWorkers() and closeGlobalDispatcher() when implementing this preservation
of the original outcome.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d2954ba1-ed57-4138-bd5a-1461de17beb1

📥 Commits

Reviewing files that changed from the base of the PR and between e1e29c1 and 3db132b.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (6)
  • .changeset/dispatcher-lifecycle.md
  • network/fetch/src/dispatcher.ts
  • network/fetch/src/index.ts
  • pnpm/package.json
  • pnpm/src/pnpm.ts
  • pnpm/tsconfig.json
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Agent
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,tsx,jsx}: Use trailing commas in code following Standard Style with modifications
Prefer functions over classes
Declare functions after they are used, relying on hoisting
Functions should have no more than two or three arguments; use a single options object for functions needing more parameters
Organize imports in order: standard libraries, external dependencies (sorted alphabetically), then relative imports

Files:

  • network/fetch/src/index.ts
  • pnpm/src/pnpm.ts
  • network/fetch/src/dispatcher.ts
🧠 Learnings (1)
📚 Learning: 2026-05-05T23:03:04.286Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11479
File: __utils__/scripts/package.json:6-9
Timestamp: 2026-05-05T23:03:04.286Z
Learning: The pattern cross-env NODE_OPTIONS="$NODE_OPTIONS ..." in package.json scripts is an established convention in the pnpm/pnpm repository and is used across many packages (e.g., fs/hard-link-dir, worker, __utils__/scripts). Do not flag this as a cross-platform issue in individual files; if a change is needed, apply it as a repo-wide change in a separate PR. Scope this guidance to all package.json files in the repo; use the minimatch pattern '**/package.json' to identify relevant files and review changes at the repository level rather than per-file.

Applied to files:

  • pnpm/package.json

Comment thread network/fetch/src/dispatcher.ts Outdated
Comment thread pnpm/src/pnpm.ts Outdated
@github-actions
Copy link
Copy Markdown
Contributor

Benchmark Results

# Scenario main branch
1 Headless (warm store+cache) 2.210s ± 0.028s 2.187s ± 0.025s
2 Re-resolution (add dep, warm) 2.864s ± 0.038s 2.877s ± 0.034s
3 Full resolution (warm, no lockfile) 4.923s ± 0.038s 4.929s ± 0.041s
4 Headless (cold store+cache) 4.494s ± 0.217s 4.629s ± 0.789s
5 Cold install (nothing warm) 8.222s ± 0.100s 8.252s ± 0.106s
6 GVS warm reinstall (warm global store) 1.053s ± 0.031s 1.041s ± 0.016s

Run 25644260272 · 30 runs per scenario · triggered by @zkochan

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
pnpm/src/pnpm.ts (1)

30-32: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cleanup failure in finally can still replace the main command outcome.

If finishWorkers() rejects here, it can override whatever happened in main()/errorHandler() and change the final failure surface. This was raised earlier and still applies.

Suggested minimal safeguard
 async function runPnpm (): Promise<void> {
   const { finishWorkers } = await import('@pnpm/worker')
   const { errorHandler } = await import('./errorHandler.js')
+  let primaryError: unknown
   try {
     const { main } = await import('./main.js')
     await main(argv)
   } catch (err: any) { // eslint-disable-line
+    primaryError = err
     await errorHandler(err)
   } finally {
-    await finishWorkers()
+    try {
+      await finishWorkers()
+    } catch (cleanupError) {
+      if (primaryError == null) throw cleanupError
+    }
   }
 }
#!/bin/bash
# Verify the current flow still awaits finishWorkers() directly in finally.
rg -n -C3 'catch \(err: any\)|finally \{|await finishWorkers\(\)' pnpm/src/pnpm.ts
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pnpm/src/pnpm.ts` around lines 30 - 32, The finally block currently does
`await finishWorkers()` which can throw and overwrite the original
main()/errorHandler() outcome; wrap the call in its own try/catch inside the
finally so cleanup errors are caught and logged but not rethrown (e.g., in the
finally of the top-level function that calls main()/errorHandler(), catch errors
from finishWorkers() and call the logger/errorHandler but do not throw),
referencing the existing finishWorkers(), main(), and errorHandler() symbols so
you locate and replace the direct await finishWorkers() with a guarded try {
await finishWorkers() } catch (err) { processLogger.error('finishWorkers
failed', err) } (or equivalent) to preserve the original exit semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@pnpm/src/pnpm.ts`:
- Around line 30-32: The finally block currently does `await finishWorkers()`
which can throw and overwrite the original main()/errorHandler() outcome; wrap
the call in its own try/catch inside the finally so cleanup errors are caught
and logged but not rethrown (e.g., in the finally of the top-level function that
calls main()/errorHandler(), catch errors from finishWorkers() and call the
logger/errorHandler but do not throw), referencing the existing finishWorkers(),
main(), and errorHandler() symbols so you locate and replace the direct await
finishWorkers() with a guarded try { await finishWorkers() } catch (err) {
processLogger.error('finishWorkers failed', err) } (or equivalent) to preserve
the original exit semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6b2f87b1-2f01-466b-a971-b379b3af2845

📥 Commits

Reviewing files that changed from the base of the PR and between 3db132b and c27d551.

📒 Files selected for processing (3)
  • .changeset/version-exit-finishes-workers.md
  • pnpm/src/pnpm.ts
  • pnpm/test/packageManagerCheck.test.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/version-exit-finishes-workers.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{js,ts,tsx,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{js,ts,tsx,jsx}: Use trailing commas in code following Standard Style with modifications
Prefer functions over classes
Declare functions after they are used, relying on hoisting
Functions should have no more than two or three arguments; use a single options object for functions needing more parameters
Organize imports in order: standard libraries, external dependencies (sorted alphabetically), then relative imports

Files:

  • pnpm/test/packageManagerCheck.test.ts
  • pnpm/src/pnpm.ts
**/*.test.{js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Do not use instanceof Error for error type checking in Jest tests; use util.types.isNativeError() instead to ensure compatibility across realms

Files:

  • pnpm/test/packageManagerCheck.test.ts
🔇 Additional comments (1)
pnpm/test/packageManagerCheck.test.ts (1)

179-205: Strong regression coverage for the short-circuit hang path.

This test is well-targeted: it reproduces the previously hanging --version flow and asserts both timely exit and correct output.

Copilot AI review requested due to automatic review settings May 11, 2026 10:09
@zkochan zkochan changed the title fix: scope global undici dispatcher to the CLI entry lifecycle fix: terminate worker pool on short-circuit returns from pnpm/main May 11, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment thread pnpm/test/packageManagerCheck.test.ts Outdated
Comment thread pnpm/src/pnpm.ts Outdated
Comment thread .changeset/version-exit-finishes-workers.md Outdated
Comment thread pnpm/src/pnpm.ts
`pnpm --version` short-circuited `main.ts` after `switchCliVersion` had
already spawned the integrity-resolution worker pool, then returned
before reaching the command-handler `finally` block that calls
`finishWorkers()`. The pool stayed alive and kept the Node event loop
open long after the intended exit.

Reproducer: `pnpm --version` in a workspace whose
`devEngines.packageManager.version` already matches the running pnpm +
`onFail: "download"`. `switchCliVersion` resolves the integrity (spawning
workers), finds nothing to swap, returns through main's `--version`
short-circuit. Workers stay alive.

The CLI entry now runs `finishWorkers()` in its own `finally`, covering
this and any other early-return path that bypasses the in-main cleanup.
Cleanup rejections are swallowed so the `finally` never masks the real
command result.
The "--version exits promptly" test ran `execPnpmSync(['--version'])`
from the previous test's prepared dir (`devEngines.packageManager: yarn,
onFail: error`). That manifest's pm check runs before main.ts's
`--version` short-circuit, so the call exited 1 with empty stdout and
`pnpmVersion` was the empty string. Read the version from a fresh empty
dir instead.
Copilot AI review requested due to automatic review settings May 11, 2026 10:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@zkochan zkochan merged commit 91b0e64 into main May 11, 2026
17 checks passed
@zkochan zkochan deleted the fix-hang branch May 11, 2026 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants