Skip to content

fix(core): optimize pnpm lockfile parsing with pre-built indexes#33750

Merged
leosvelperez merged 2 commits intonrwl:masterfrom
adwait1290:perf/pnpm-parser-optimizations
Dec 10, 2025
Merged

fix(core): optimize pnpm lockfile parsing with pre-built indexes#33750
leosvelperez merged 2 commits intonrwl:masterfrom
adwait1290:perf/pnpm-parser-optimizations

Conversation

@adwait1290
Copy link
Copy Markdown
Contributor

@adwait1290 adwait1290 commented Dec 8, 2025

Current Behavior

When parsing pnpm lockfiles to build the project graph:

matchPropValue function

  • Uses separate Object.values() + Object.keys() iterations
  • Two passes over the same data to find a value and get its key

Hoisted Dependencies Lookup

  • For each package, calls Object.keys(hoistedDeps).find(k => k.startsWith(...))
  • O(n) search performed m times = O(n*m) total

Expected Behavior

Single-pass Iteration

BEFORE: matchPropValue
┌─────────────────────────────────────────────────────────────┐
│  const index = Object.values(record).findIndex(v === key)   │
│  if (index > -1)                                            │
│    return Object.keys(record)[index]  ← Another iteration!  │
│                                                             │
│  = 2 iterations over the same data                          │
└─────────────────────────────────────────────────────────────┘

AFTER: Object.entries() single pass
┌─────────────────────────────────────────────────────────────┐
│  for (const [name, version] of Object.entries(record)) {    │
│    if (version === key) return name  ← Early exit           │
│  }                                                          │
│                                                             │
│  = 1 iteration with early exit on match                     │
└─────────────────────────────────────────────────────────────┘

Pre-built Index for Hoisted Dependencies

BEFORE: O(n*m) Hoisted Lookup
┌─────────────────────────────────────────────────────────────┐
│  for each package (n packages):                             │
│    Object.keys(hoistedDeps).find(k => k.startsWith(...))    │
│    ← O(m) search where m = hoisted deps                     │
│                                                             │
│  Total: O(n * m)                                            │
│                                                             │
│  Example: 500 packages × 200 hoisted deps                   │
│         = 100,000 string comparisons                        │
└─────────────────────────────────────────────────────────────┘

AFTER: Pre-built Index Map O(n+m)
┌─────────────────────────────────────────────────────────────┐
│  Build hoistedKeysByPackage Map once:  O(m)                 │
│    Map<packageName, hoistedKey>                             │
│                                                             │
│  for each package (n packages):                             │
│    hoistedKeysByPackage.get(packageName)  O(1)              │
│                                                             │
│  Total: O(n + m)                                            │
│                                                             │
│  Example: 500 packages + 200 hoisted deps                   │
│         = 700 operations (vs 100,000)                       │
└─────────────────────────────────────────────────────────────┘

Impact

For large pnpm monorepos:

Metric Before After Improvement
matchPropValue iterations 2 1 50% fewer iterations
Hoisted lookup complexity O(n×m) O(n+m) ~100x for large repos
String comparisons n×m n+m Dramatic reduction

Related Issue(s)

Contributes to #32669, #32254

Merge Dependencies

This PR has no dependencies and can be merged independently.

Must be merged BEFORE: #33751


PNPM Parser Optimizations:
- Use Object.entries() for single-pass iteration in matchPropValue
  instead of separate Object.values() + Object.keys() calls
- Pre-build packageName -> key index Map for hoisted dependencies
  lookup (O(n*m) -> O(n+m) where n=packages, m=hoisted deps)

Impact: Faster project graph creation for pnpm monorepos, especially
those with many packages and hoisted dependencies.
@adwait1290 adwait1290 requested review from a team and meeroslav as code owners December 8, 2025 06:25
@netlify
Copy link
Copy Markdown

netlify bot commented Dec 8, 2025

👷 Deploy request for nx-docs pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 4156b4f

@vercel
Copy link
Copy Markdown

vercel bot commented Dec 8, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Updated (UTC)
nx-dev Ready Ready Preview Dec 9, 2025 5:03pm

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copy link
Copy Markdown
Member

@leosvelperez leosvelperez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! Please make hoistedKeysByPackage a required parameter.

@nx-cloud
Copy link
Copy Markdown
Contributor

nx-cloud bot commented Dec 10, 2025

View your CI Pipeline Execution ↗ for commit 4156b4f

Command Status Duration Result
nx affected --targets=lint,test,test-kt,build,e... ✅ Succeeded 8m 38s View ↗
nx run-many -t check-imports check-lock-files c... ✅ Succeeded 2m 32s View ↗
nx-cloud record -- nx-cloud conformance:check ✅ Succeeded 12s View ↗
nx-cloud record -- nx format:check ✅ Succeeded 2s View ↗
nx-cloud record -- nx sync:check ✅ Succeeded <1s View ↗

☁️ Nx Cloud last updated this comment at 2025-12-10 09:07:23 UTC

Copy link
Copy Markdown
Member

@leosvelperez leosvelperez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@leosvelperez leosvelperez changed the title perf(nx): optimize pnpm lockfile parsing with pre-built indexes fix(core): optimize pnpm lockfile parsing with pre-built indexes Dec 10, 2025
@leosvelperez leosvelperez merged commit 58a69f9 into nrwl:master Dec 10, 2025
16 of 20 checks passed
@meeroslav
Copy link
Copy Markdown
Contributor

The hoisted calculation improvement was a good catch. On a test repo we measured 5x improvement on that code section.

Sadly, the matchPropValue is now performing worse. On test repo it brought the total duration of matching versions from ~650 to 900+ milliseconds.

Here are the reasons why:

  • Native methods are heavily optimized: findIndex() is implemented in optimized C++ code in the JavaScript engine, while for...of runs in the JavaScript layer with more overhead per iteration.
  • Object.entries() is expensive: It creates an array of arrays (tuples), which involves more object allocation than Object.values(), which creates a flat array. Even though you call both Object.values() and Object.keys(), creating two flat arrays can be faster than creating one array of tuple objects.
  • Destructuring has overhead: The [name, version] destructuring happens on every loop iteration, adding cost.
  • Array access is extremely fast: Object.keys(record)[index] is a simple array index access, which is highly optimized. Modern engine makes this nearly free.
  • The record is small, so the "double iteration" penalty is negligible due to native method optimizations.

FrozenPandaz pushed a commit that referenced this pull request Dec 15, 2025
)

## Current Behavior

When parsing pnpm lockfiles to build the project graph:

### matchPropValue function
- Uses separate `Object.values()` + `Object.keys()` iterations
- Two passes over the same data to find a value and get its key

### Hoisted Dependencies Lookup
- For each package, calls `Object.keys(hoistedDeps).find(k =>
k.startsWith(...))`
- O(n) search performed m times = O(n*m) total

## Expected Behavior

### Single-pass Iteration

```
BEFORE: matchPropValue
┌─────────────────────────────────────────────────────────────┐
│  const index = Object.values(record).findIndex(v === key)   │
│  if (index > -1)                                            │
│    return Object.keys(record)[index]  ← Another iteration!  │
│                                                             │
│  = 2 iterations over the same data                          │
└─────────────────────────────────────────────────────────────┘

AFTER: Object.entries() single pass
┌─────────────────────────────────────────────────────────────┐
│  for (const [name, version] of Object.entries(record)) {    │
│    if (version === key) return name  ← Early exit           │
│  }                                                          │
│                                                             │
│  = 1 iteration with early exit on match                     │
└─────────────────────────────────────────────────────────────┘
```

### Pre-built Index for Hoisted Dependencies

```
BEFORE: O(n*m) Hoisted Lookup
┌─────────────────────────────────────────────────────────────┐
│  for each package (n packages):                             │
│    Object.keys(hoistedDeps).find(k => k.startsWith(...))    │
│    ← O(m) search where m = hoisted deps                     │
│                                                             │
│  Total: O(n * m)                                            │
│                                                             │
│  Example: 500 packages × 200 hoisted deps                   │
│         = 100,000 string comparisons                        │
└─────────────────────────────────────────────────────────────┘

AFTER: Pre-built Index Map O(n+m)
┌─────────────────────────────────────────────────────────────┐
│  Build hoistedKeysByPackage Map once:  O(m)                 │
│    Map<packageName, hoistedKey>                             │
│                                                             │
│  for each package (n packages):                             │
│    hoistedKeysByPackage.get(packageName)  O(1)              │
│                                                             │
│  Total: O(n + m)                                            │
│                                                             │
│  Example: 500 packages + 200 hoisted deps                   │
│         = 700 operations (vs 100,000)                       │
└─────────────────────────────────────────────────────────────┘
```

## Impact

For large pnpm monorepos:

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| matchPropValue iterations | 2 | 1 | 50% fewer iterations |
| Hoisted lookup complexity | O(n×m) | O(n+m) | ~100x for large repos |
| String comparisons | n×m | n+m | Dramatic reduction |

## Related Issue(s)

Contributes to #32669, #32254

## Merge Dependencies

This PR has no dependencies and can be merged independently.

**Must be merged BEFORE:** #33751

---
@github-actions
Copy link
Copy Markdown
Contributor

This pull request has already been merged/closed. If you experience issues related to these changes, please open a new issue referencing this pull request.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 16, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants