perf(binding): enable mimalloc v3 to reduce idle memory #9349
Merged
shulaoda merged 2 commits on May 13, 2026
Member
Can you add a design doc explaining why we chose mimalloc v3?
hyf0 approved these changes on May 13, 2026
Member
Also, can you run a benchmark to check whether rolldown is affected differently by the two versions of mimalloc?
…ease_dev-server_memory_to_the_os
Merged
shulaoda added a commit that referenced this pull request on May 13, 2026
## [1.0.1] - 2026-05-13

### 🚀 Features

- experimental/lazy-barrel: advice on oversized barrel modules (#9236) by @shulaoda
- rolldown: inline optional-chain enum access (#9379) by @Dunqing
- chunk-optimization: dedupe already-loaded dynamic deps (#9305) by @IWANABETHATGUY
- binding: call moduleParsed hook in ParallelJsPlugin (#9318) by @jaehafe

### 🐛 Bug Fixes

- transform: enable `enum_eval` for `transformSync` and vite TS transform (#9325) by @Dunqing
- error: remove severity prefix from diagnostic messages (#9262) by @Kyujenius
- deps: pin pnpm to 10.23.0 to work around catalog mismatch on Netlify (#9364) by @shulaoda
- ci: pin mimalloc-safe to 0.1.58 (#9361) by @shulaoda
- dev/lazy: fix exports of lazy requests in lazy chunks (#9249) by @h-a-n-a
- rolldown_plugin_vite_resolve: handle errors in `resolveSubpathImports` callback (#9355) by @sapphi-red
- rolldown_plugin_lazy_compilation: use loadExports for fetched proxy to preserve original export names (#9132) by @h-a-n-a
- common: include offending index in HybridIndexVec panic message (#9296) by @SAY-5

### 🚜 Refactor

- ecmascript: extract semantic_builder_for_transform helper (#9326) by @Dunqing
- test: extract reusable static-import-cycle helper (#9332) by @IWANABETHATGUY

### 📚 Documentation

- clarify scope of `topLevelVar` (#9380) by @IWANABETHATGUY
- meta/design: add ast-mutation design doc (#9338) by @hyf0
- feat: add ai policy in contribution guide (#9315) by @mdong1909

### ⚡ Performance

- binding: enable mimalloc v3 to reduce idle memory (#9349) by @shulaoda

### 🧪 Testing

- mcs: cover require() in `$initial` group (#9376) by @hyf0
- add regression for CJS facade chunk merge into entry (#9351) by @IWANABETHATGUY

### ⚙️ Miscellaneous Tasks

- switch prepare-release to manual dispatch with version input (#9383) by @shulaoda
- migrate `@rolldown/pluginutils` to `rolldown/plugins` (#9317) by @shulaoda
- deps: pin libmimalloc-sys2 to 0.1.54 (#9372) by @shulaoda
- replace `igorskyflyer/action-readfile` with `cat` (#9369) by @sapphi-red
- deps: update test262 submodule for tests (#9371) by @rolldown-guard[bot]
- use app token for test dep update PRs (#9368) by @sapphi-red
- replace some actions with gh commands (#9367) by @sapphi-red
- replace action-semantic-pull-request with inline regex (#9366) by @sapphi-red
- remove pull_request_target workflows (#9188) by @Boshen
- deps: upgrade oxc to 0.130.0 (#9360) by @shulaoda
- deps: update github actions (major) (#9348) by @renovate[bot]
- deps: update github actions (#9341) by @renovate[bot]
- deps: update rust crates (#9344) by @renovate[bot]
- deps: update crate-ci/typos action to v1.46.1 (#9357) by @renovate[bot]
- deps: update npm packages (#9343) by @renovate[bot]
- deps: update pnpm to v10.33.4 (#9347) by @renovate[bot]
- deps: update dependency rolldown-plugin-dts to ^0.25.0 (#9346) by @renovate[bot]
- .claude: add rolldown-repl encoder, rename decode skill (#9352) by @IWANABETHATGUY
- deps: update crate-ci/typos action to v1.46.0 (#9345) by @renovate[bot]
- deps: update napi to v3.8.6 (#9342) by @renovate[bot]
- deps: update dependency vite-plus to v0.1.20 (#9340) by @renovate[bot]
- enable rollup chunking-form test (#9335) by @IWANABETHATGUY
- typo: fix typo in watcher options comment (#9324) by @thescripted

### ❤️ New Contributors

* @Kyujenius made their first contribution in [#9262](#9262)
* @SAY-5 made their first contribution in [#9296](#9296)
* @thescripted made their first contribution in [#9324](#9324)

Co-authored-by: shulaoda <[email protected]>
IWANABETHATGUY pushed a commit that referenced this pull request on May 18, 2026
## Summary

- Upgrades `mimalloc-safe` from 0.1.52 to 0.1.59 at the workspace level.
- Enables the `v3` feature on every `mimalloc-safe` dependency block inside `crates/rolldown_binding/Cargo.toml` so the binding ships against mimalloc v3.
- No Rust source changes: the feature is purely a `libmimalloc-sys2/build.rs` switch between `c_src/mimalloc/` (v2) and `c_src/mimalloc3/` (v3).

## Why

Per the investigation on #9330, the steady-state memory of `vite dev` on a non-trivial workload (lobe-chat, lobehub-ui) is dominated by mimalloc retaining pages it has already freed. v2 on macOS effectively never returns them: a single live object pins a 32 MiB segment, and the purge-delay path is conservative. v3 reworks segments into smaller sub-pages and simplifies the purge timer, so empty regions are actually released via `madvise(MADV_FREE_REUSABLE)` within seconds.

## Measured impact

> **Process model**
> - `vite 7 + esbuild` runs as **two processes**: the Vite Node process **and** an esbuild Go child process (used for dep prebundling). Both must be accounted for. `measure.mjs` recursively walks every descendant of the Vite server and sums `Physical footprint` / RSS at each sample, so both processes are included in every row below.
> - `vite 8 + [email protected]` and `vite 8 + local + v3` run as a **single Node process**: rolldown is an in-process Rust napi addon, so no child process is spawned.
>
> **`Σ per-proc peak` caveat**
> The `Σ per-proc peak` row sums each process's *lifetime* `Physical footprint (peak)` field reported by `vmmap`. It is a **mathematical upper bound**, not a real instantaneous peak, because different processes may reach their per-process peaks at different moments. This **especially inflates the `vite 7 + esbuild` column**: esbuild typically peaks during dep prebundling while the Vite Node process peaks later during request handling, so the two peaks never co-occur but still get added together.
> Single-process columns (`[email protected]`, `local + v3`) are not affected: for them `Σ per-proc peak` equals the real instantaneous peak.

[`cijiugechu/rolldown-9330-repro`](https://github.com/cijiugechu/rolldown-9330-repro)

| Metric | vite 7 + esbuild | vite 8 + [email protected] | vite 8 + local + `v3` |
|--------|------------------|--------------------------|------------------------|
| Σ per-proc peak (upper bound) | ~2.058 G | ~2.100 G | ~1.800 G |
| Physical footprint (after browser close) | ~1.971 G | ~2.000 G | ~0.971 G |
| Physical footprint (90s idle) | ~1.876 G | ~1.800 G | ~0.780 G |
| Physical footprint (180s idle) | ~1.876 G | ~1.800 G | ~0.780 G |
| Physical footprint (270s idle) | ~0.281 G | ~1.800 G | ~0.780 G |
| Physical footprint (360s idle) | ~0.281 G | ~1.800 G | ~0.780 G |
| RSS (after browser close) | ~2.018 G | ~2.182 G | ~2.040 G |
| RSS (90s idle) | ~1.932 G | ~2.005 G | ~1.857 G |
| RSS (180s idle) | ~1.932 G | ~2.003 G | ~1.857 G |
| RSS (270s idle) | ~1.932 G | ~2.003 G | ~1.857 G |
| RSS (360s idle) | ~1.932 G | ~2.003 G | ~1.857 G |

[`lobehub/lobe-chat`](https://github.com/lobehub/lobe-chat)

| Metric | vite 7 + esbuild | vite 8 + [email protected] | vite 8 + local + `v3` |
|--------|------------------|--------------------------|------------------------|
| Σ per-proc peak (upper bound) | ~6.900 G | ~7.200 G | ~7.800 G |
| Physical footprint (after browser close) | ~6.800 G | ~7.000 G | ~3.700 G |
| Physical footprint (90s idle) | ~5.887 G | ~5.700 G | ~2.400 G |
| Physical footprint (180s idle) | ~5.860 G | ~5.700 G | ~2.300 G |
| Physical footprint (270s idle) | ~0.882 G | ~5.700 G | ~2.300 G |
| Physical footprint (360s idle) | ~0.882 G | ~5.700 G | ~2.300 G |
| RSS (after browser close) | ~7.057 G | ~8.642 G | ~8.385 G |
| RSS (90s idle) | ~6.187 G | ~7.505 G | ~7.311 G |
| RSS (180s idle) | ~6.156 G | ~7.505 G | ~7.241 G |
| RSS (270s idle) | ~6.156 G | ~7.462 G | ~7.241 G |
| RSS (360s idle) | ~6.156 G | ~7.462 G | ~7.241 G |
[`lobehub/lobehub`](https://github.com/lobehub/lobehub)

| Metric | vite 7 + esbuild | vite 8 + [email protected] | vite 8 + local + `v3` |
|--------|------------------|--------------------------|------------------------|
| Σ per-proc peak (upper bound) | ~7.100 G | ~7.500 G | ~7.700 G |
| Physical footprint (after browser close) | ~7.100 G | ~7.300 G | ~3.300 G |
| Physical footprint (90s idle) | ~3.236 G | ~6.000 G | ~2.200 G |
| Physical footprint (180s idle) | ~3.236 G | ~6.000 G | ~2.200 G |
| Physical footprint (270s idle) | ~0.884 G | ~6.000 G | ~2.200 G |
| Physical footprint (360s idle) | ~0.884 G | ~6.000 G | ~2.200 G |
| RSS (after browser close) | ~7.252 G | ~8.767 G | ~8.183 G |
| RSS (90s idle) | ~6.367 G | ~7.548 G | ~7.220 G |
| RSS (180s idle) | ~6.367 G | ~7.548 G | ~7.220 G |
| RSS (270s idle) | ~6.313 G | ~7.505 G | ~7.178 G |
| RSS (360s idle) | ~6.313 G | ~7.505 G | ~7.178 G |

## Caveats

- **Windows is not covered by this PR.** `libmimalloc-sys2/build.rs` has a separate `build_mimalloc_win()` path that hard-codes `./c_src/mimalloc/` (v2) regardless of the `v3` feature. The Cargo manifest still requests `v3` so it activates automatically once upstream fixes the Windows branch; until then Windows users continue on v2.
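The `Σ per-proc peak` caveat from the measurement notes can be illustrated with a small sketch. The sample values and the helper names `sumOfPerProcPeaks` and `realInstantPeak` are hypothetical and are not part of `measure.mjs`; they only show why summing per-process lifetime peaks can overstate the true instantaneous peak when processes peak at different times.

```javascript
// Hypothetical footprint samples (MiB) for two processes, taken at the same instants.
const samples = [
  { node: 300, esbuild: 900 }, // dep prebundling: esbuild is at its peak
  { node: 700, esbuild: 200 },
  { node: 1100, esbuild: 50 }, // request handling: the Node process is at its peak
];

// Sum of each process's lifetime peak (what the "Σ per-proc peak" row reports).
const sumOfPerProcPeaks = (rows) =>
  Object.keys(rows[0]).reduce(
    (total, name) => total + Math.max(...rows.map((row) => row[name])),
    0,
  );

// Largest combined footprint observed at any single instant.
const realInstantPeak = (rows) =>
  Math.max(...rows.map((row) => Object.values(row).reduce((a, b) => a + b, 0)));

console.log(sumOfPerProcPeaks(samples)); // 2000 (1100 + 900)
console.log(realInstantPeak(samples)); // 1200 (300 + 900)
```

With a single process the two helpers return the same number, which matches the note that the single-process columns are unaffected.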
<details>
<summary><code>scripts/measure.mjs</code></summary>

```js
import { chromium } from '@playwright/test';
import { execFile, spawn } from 'node:child_process';
import { existsSync } from 'node:fs';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

const SAMPLE_INTERVAL_MS = 250;
const IDLE_AFTER_CLOSE_MS = 90_000;
const IDLE_SAMPLES = 4;

const server = spawn(
  'pnpm',
  ['exec', 'vite', '--port', '9876', '--host', '127.0.0.1', '--force'],
  { stdio: ['ignore', 'pipe', 'pipe'] },
);

const waitReady = new Promise((resolve, reject) => {
  const timeout = setTimeout(() => reject(new Error('vite ready timeout')), 90_000);
  server.stdout.on('data', (data) => {
    const text = data.toString();
    process.stdout.write(text);
    if (text.includes('ready in')) {
      clearTimeout(timeout);
      resolve();
    }
  });
  server.stderr.on('data', (data) => process.stderr.write(data));
  server.on('exit', (code) => reject(new Error(`vite exited before ready: ${code}`)));
});

const getVitePid = async () => {
  const { stdout } = await execFileAsync('lsof', ['-tiTCP:9876', '-sTCP:LISTEN']);
  const pid = Number(stdout.trim().split('\n')[0]);
  if (!pid) throw new Error('cannot find vite pid listening on 9876');
  return pid;
};

const allDescendants = async (rootPid) => {
  const result = [rootPid];
  const stack = [rootPid];
  while (stack.length) {
    const p = stack.pop();
    try {
      const { stdout } = await execFileAsync('pgrep', ['-P', String(p)]);
      const children = stdout.trim().split('\n').filter(Boolean).map(Number);
      for (const c of children) {
        result.push(c);
        stack.push(c);
      }
    } catch {
      /* no children */
    }
  }
  return result;
};

const parseSizeMB = (text) => {
  const match = text.match(/([\d.]+)\s*([KMG])?/);
  if (!match) return 0;
  const num = parseFloat(match[1]);
  const unit = match[2];
  if (unit === 'G') return num * 1024;
  if (unit === 'K') return num / 1024;
  if (unit === 'M') return num;
  return num / (1024 * 1024);
};

const procInfo = async (pid) => {
  try {
    const [vm, psOut] = await Promise.all([
      execFileAsync('vmmap', ['-summary', String(pid)]),
      execFileAsync('ps', ['-o', 'ppid=,rss=,command=', '-p', String(pid)]),
    ]);
    const [, ppidStr, rssStr, command] = psOut.stdout.trim().match(/^\s*(\d+)\s+(\d+)\s+(.*)$/);
    const ppid = Number(ppidStr);
    const rss = Number(rssStr);
    const lines = vm.stdout.split('\n');
    const fpLine = lines.find((l) => l.trim().startsWith('Physical footprint:')) || '';
    const peakLine = lines.find((l) => l.trim().startsWith('Physical footprint (peak):')) || '';
    return {
      pid,
      ppid,
      alive: true,
      cmd: (command.split(/\s+/)[0] || '').split('/').pop(),
      command,
      rssMB: rss / 1024,
      footprintMB: parseSizeMB(fpLine.split(':')[1] || ''),
      peakMB: parseSizeMB(peakLine.split(':')[1] || ''),
    };
  } catch {
    return { pid, alive: false };
  }
};

const sampleOnce = async (rootPid) => {
  const pids = await allDescendants(rootPid);
  const procs = (await Promise.all(pids.map(procInfo))).filter((p) => p.alive);
  const total = procs.reduce((s, p) => s + p.footprintMB, 0);
  return { ts: Date.now(), procs, total };
};

let peakSample = null;
let stop = false;
let vitePid = null;

const samplerLoop = async () => {
  while (!stop) {
    if (vitePid) {
      const s = await sampleOnce(vitePid).catch(() => null);
      if (s && (!peakSample || s.total > peakSample.total)) {
        peakSample = s;
      }
    }
    if (!stop) await sleep(SAMPLE_INTERVAL_MS);
  }
};

const fmtGB = (mb) => `${(mb / 1024).toFixed(3).padStart(8)} GB`;

const dumpSample = (label, sample) => {
  console.log(`\n=== ${label} ===`);
  console.log(` ts=${new Date(sample.ts).toISOString()} procs=${sample.procs.length}`);
  for (const p of sample.procs) {
    console.log(`\n pid=${p.pid} ppid=${p.ppid} (${p.cmd})`);
    console.log(` command : ${p.command}`);
    console.log(` rss : ${fmtGB(p.rssMB)}`);
    console.log(` Physical footprint : ${fmtGB(p.footprintMB)}`);
    console.log(` Physical footprint^ : ${fmtGB(p.peakMB)}`);
  }
  const totalRss = sample.procs.reduce((s, p) => s + p.rssMB, 0);
  const totalPeak = sample.procs.reduce((s, p) => s + p.peakMB, 0);
  console.log(`\n TOTAL`);
  console.log(` rss : ${fmtGB(totalRss)} (shared pages double-counted)`);
  console.log(` Physical footprint : ${fmtGB(sample.total)} <-- accurate`);
  console.log(` Σ per-proc peak : ${fmtGB(totalPeak)} (upper bound, not a real instant)`);
};

const samplerPromise = samplerLoop();

try {
  await waitReady;
  vitePid = await getVitePid();

  const chromePath = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome';
  const browser = await chromium.launch({
    args: ['--no-sandbox', '--disable-gpu'],
    executablePath: existsSync(chromePath) ? chromePath : undefined,
    headless: true,
  });
  const page = await browser.newPage({ viewport: { width: 1280, height: 800 } });
  await page.goto('http://127.0.0.1:9876/', { waitUntil: 'load', timeout: 120_000 });
  await page.waitForTimeout(25_000);
  await browser.close();

  console.log(`\nsample interval: ${SAMPLE_INTERVAL_MS} ms`);
  dumpSample('after browser.close()', await sampleOnce(vitePid));

  for (let i = 1; i <= IDLE_SAMPLES; i++) {
    await sleep(IDLE_AFTER_CLOSE_MS);
    const label = `steady state (after ${(IDLE_AFTER_CLOSE_MS * i) / 1000}s idle)`;
    dumpSample(label, await sampleOnce(vitePid));
  }

  stop = true;
  await samplerPromise;
} finally {
  stop = true;
  server.kill('SIGTERM');
}
```

</details>

---

# Why switching to mimalloc v3 helps Rolldown

## TL;DR

Switching `mimalloc-safe` to the `v3` feature drops Physical footprint by roughly **47%** in our dev-server reproduction (1946 MiB → 1024 MiB), with no observable downside on the same workload.

The improvement is not specific to dev-server. Any Rolldown workload that involves multiple worker threads, multiple build phases, or repeated builds in one process benefits.

## Why v3 wins, in one paragraph

mimalloc v2 keeps every mimalloc page strictly thread-private. A page can only ever be reused by the thread that allocated it.
The only cross-thread sharing path is "thread exits, abandon the whole 32 MiB segment to a global list", which almost never fires in long-running processes that use a worker pool (tokio, rayon).

mimalloc v3 changes this. As soon as a page becomes full, the default code path (`mi_page_to_full` calls `_mi_page_abandon`) puts the page into a global `pages_abandoned[bin]` bitmap, where any other thread looking for that size class can claim it via an atomic CAS. The upstream comment in `page.c:382` says it directly: *"this is the usual case in order to allow for sharing of memory between theaps"*.

The end result is that the per-process committed memory converges from `sum(historical peak per thread)` (v2) toward `max(global concurrent active working set)` (v3). The reduction factor approaches the worker thread count when workloads are imbalanced.

## Rolldown workloads that benefit

Every Rolldown invocation has these traits to varying degrees:

1. **Multi-threaded by default**: tokio + rayon worker pools sized around CPU cores
2. **Phased allocation**: parser, AST, transforms, codegen each touch different size classes
3. **Wide size class coverage**: from 16-byte tokens to 64+ KiB chunks
4. **Sparse long-lived survivors**: module cache, Arc'd info, interned strings
5. **Multiple builds per process** in dev-server, watch, test suites, library bundling pipelines

Whether it is the Rolldown CLI, Vite production builds, the dev-server with HMR, or watch mode, these traits hold. Dev-server simply pushes every dimension to the extreme, which is why the improvement is most dramatic there.
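The `sum(historical peak per thread)` versus `max(global concurrent active working set)` claim can be made concrete with a toy model. This is only a sketch under stated assumptions: the `demand` matrix, the numbers in it, and the helpers `threadPrivateBound` and `sharedBound` are hypothetical, and the model ignores fragmentation, size classes, and purge timing, so it does not reproduce mimalloc's real behavior.

```javascript
// demand[phase][thread]: hypothetical live memory (MiB) held by each of four
// worker threads during each build phase. Exactly one thread is busy per phase,
// which is the imbalanced case described in the text.
const demand = [
  [400, 10, 10, 10],
  [10, 400, 10, 10],
  [10, 10, 400, 10],
  [10, 10, 10, 400],
];

// Thread-private pages (v2-like): every thread keeps its own historical peak,
// so committed memory tends toward the sum of per-thread maxima.
const threadPrivateBound = (d) =>
  d[0]
    .map((_, thread) => Math.max(...d.map((phase) => phase[thread])))
    .reduce((a, b) => a + b, 0);

// Shareable pages (v3-like): freed pages can be claimed by other threads,
// so committed memory tends toward the largest concurrent working set.
const sharedBound = (d) =>
  Math.max(...d.map((phase) => phase.reduce((a, b) => a + b, 0)));

const v2Like = threadPrivateBound(demand); // 1600
const v3Like = sharedBound(demand); // 430
console.log(v2Like, v3Like, (v2Like / v3Like).toFixed(2)); // 1600 430 3.72
```

In this toy case the ratio is about 3.72 with four threads, which is the sense in which the reduction factor approaches the worker thread count. When every thread is equally busy in every phase, the two bounds coincide and the model predicts no saving.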