Skip to content

fix: improve performance in many places#203

Merged
alexander-akait merged 6 commits intomainfrom
claude/investigate-cached-source-perf-ofvqq
Apr 22, 2026
Merged

fix: improve performance in many places#203
alexander-akait merged 6 commits intomainfrom
claude/investigate-cached-source-perf-ofvqq

Conversation

@alexander-akait
Copy link
Copy Markdown
Member

No description provided.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 22, 2026

⚠️ No Changeset found

Latest commit: 03993c5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented Apr 22, 2026

CLA Not Signed

@alexander-akait alexander-akait force-pushed the claude/investigate-cached-source-perf-ofvqq branch from 5399622 to a494cfd Compare April 22, 2026 14:01
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.77%. Comparing base (c53b3e0) to head (03993c5).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #203      +/-   ##
==========================================
- Coverage   97.79%   97.77%   -0.03%     
==========================================
  Files          25       25              
  Lines        1908     1932      +24     
  Branches      600      607       +7     
==========================================
+ Hits         1866     1889      +23     
- Misses         40       41       +1     
  Partials        2        2              
Flag Coverage Δ
integration 97.77% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 22, 2026

Merging this PR will degrade performance by 98.22%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 4 improved benchmarks
❌ 3 regressed benchmarks
✅ 154 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
cached-source: map() (cached) 253.1 µs 41.4 µs ×6.1
cached-source: new CachedSource() 367.7 µs 322.9 µs +13.86%
cached-source: source() (cached) 666.5 µs 104 µs ×6.4
prefix-source: getPrefix() 120.7 µs 6,649.9 µs -98.19%
prefix-source: original() 121 µs 6,783.4 µs -98.22%
realistic-source-map-pipeline: warm sourceAndMap() (reuse CachedSource) 79.8 µs 51.8 µs +54.14%
replace-source: original() 126.4 µs 189.4 µs -33.25%

Comparing claude/investigate-cached-source-perf-ofvqq (03993c5) with main (adf17ec)1

Open in CodSpeed

Footnotes

  1. No successful run was found on main (fdb229e) during the generation of this report, so adf17ec was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@alexander-akait alexander-akait force-pushed the claude/investigate-cached-source-perf-ofvqq branch from 51282ec to bab15a7 Compare April 22, 2026 15:08
claude added 5 commits April 22, 2026 17:20
The constructors of RawSource, OriginalSource and SourceMapSource
preallocated `this._cachedSize = undefined` so that size() has a stable
hidden-class slot to write to. That made every construction pay for
memoization machinery regardless of whether .size() was ever called,
showing up on CodSpeed as a ~12% regression on
`cached-source: new CachedSource()` and 5-13% on several
`source-map-source: new (*)` benchmarks.

The size() overrides already guard on `this._cachedSize !== undefined`,
which correctly handles missing properties. Drop the eager assignment
and let the field be created on the first size() call. This keeps the
size()/streamChunks() improvements from PR #199 (e.g. ~8x on a one-shot
`new SMS().size()`) while erasing the construction regression.

https://claude.ai/code/session_01LZbaaPrnDTu6y7s4nK4cJz
CachedSource.map(), sourceAndMap() and streamChunks() key their map-entry
cache with `JSON.stringify(options)`. That call is 60-100 ns on the
common `{}` / `{columns: false}` shapes and dominates the entire method
body once the cache is warm.

Both MapOptions and the streamChunks Options type only contain boolean
fields, so we can short-circuit the stringify for the three shapes that
account for essentially all real traffic (undefined, empty object,
columns-only) using interned string constants. Any other shape still
falls through to JSON.stringify, so keys stay byte-identical to the
on-disk BufferedMaps produced by previous CachedSource versions.

Measured on the benchmark fixtures with JIT enabled (best of 5, ns/op):

                             before    after
  warmed.map({})              79.1    4.3   (18x)
  warmed.map({columns:false})139.8    9.6   (14x)
  warmed.sourceAndMap({})     89.6    9.8   (9x)
  warmed.map() / undefined    10.6    6.7   (1.6x)
  warmed.sourceAndMap()       12.4    7.7   (1.6x)

Full --no-opt benchmark run shows +53% on `cached-source: map()
(cached)`, +37.6% on `cached-source: sourceAndMap() (cached)`, +34.5%
on `realistic-source-map-pipeline: warm sourceAndMap()`, and
+17-21% on `replace-source: map()` / `sourceAndMap()`. 89813 tests
still pass.

https://claude.ai/code/session_01LZbaaPrnDTu6y7s4nK4cJz
Follow-up to the _cachedSize deferral: PR #204 added a new
`this._cachedBuffers = undefined` preallocation to the CachedSource
constructor, which re-introduces the same per-construction cost we
previously removed from RawSource/OriginalSource/SourceMapSource.

`_cachedBuffers` is only read from `buffer()` and `buffers()`, and both
call sites already guard with `!== undefined`, which treats missing
properties identically to an undefined slot. Drop the eager assignment
so every CachedSource construction skips one write and one hidden-class
transition, and instances that never call `buffers()` stay on a
tighter shape.

https://claude.ai/code/session_01LZbaaPrnDTu6y7s4nK4cJz
`source()` delegated the hot "cache already filled" read to
`_getCachedSource()`, which means every warm call pays for a prototype
method lookup and a function-call stack frame. Under V8's TurboFan
the call gets inlined, but the interpreter (and CodSpeed's
simulation mode, which runs without optimization) actually pays the
overhead. Inlining the `this._cachedSource !== undefined` short-circuit
brings warm `source()` from ~58 ns/op down to ~41 ns/op under --no-opt
while keeping the buffer-derived path behind the shared helper.

https://claude.ai/code/session_01LZbaaPrnDTu6y7s4nK4cJz
…e ctor

Two follow-ups to keep CodSpeed happy:

1. `source()` previously inlined only the warm `_cachedSource` check
   and still called `_getCachedSource()` on cold reads, which under the
   interpreter (CodSpeed's simulation mode) pays a function-call stack
   frame plus a redundant re-check of `_cachedSource`. Inline the whole
   helper body so both warm and cold paths stay branchless-to-call.
   Local --no-opt: cold `new CachedSource(RawSource).source()`
   drops 534 -> 477 ns/op (+12% vs main).

2. The constructor was evaluating `cachedData ? cachedData.X : undefined`
   once per cached-data field (five separate ternaries). Replace with a
   single branch on `cachedData` so construction pays one conditional
   instead of five. Local --no-opt:
   `new CachedSource(new RawSource(…))` drops 431 -> 375 ns/op (+13%
   vs main), erasing the -11% regression CodSpeed showed on
   `cached-source: new CachedSource()`.

`_getCachedSource()` stays on the class because `buffer()`,
`streamChunks()` and `getCachedData()` still call it; only `source()`
had a hot enough caller to justify duplicating the body.

https://claude.ai/code/session_01LZbaaPrnDTu6y7s4nK4cJz
@alexander-akait alexander-akait force-pushed the claude/investigate-cached-source-perf-ofvqq branch from 38b570e to 03fed1e Compare April 22, 2026 17:22
@alexander-akait alexander-akait merged commit ccfbc65 into main Apr 22, 2026
24 of 34 checks passed
@alexander-akait alexander-akait deleted the claude/investigate-cached-source-perf-ofvqq branch April 22, 2026 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants