feat(rust/mcs): support entriesAwareMergeThreshold #8312
Pull request overview
This PR adds support for the `entriesAwareMergeThreshold` option in the manual code splitting feature. The option works together with `entriesAware: true`: entry-aware chunk groups smaller than the threshold are automatically merged into larger neighboring groups, which reduces the number of output chunks while keeping chunk sizes reasonable.
Changes:
- Added the `entriesAwareMergeThreshold` option to the manual code splitting configuration
- Implemented a merge algorithm that uses a min-heap to process unqualified groups and merges them into best-fit neighboring groups based on the symmetric difference of entry bits
- Added comprehensive test case demonstrating the feature with three shared modules
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `crates/rolldown_testing/_config.schema.json` | Added JSON schema definition for the new `entriesAwareMergeThreshold` field |
| `crates/rolldown_common/src/inner_bundler_options/types/manual_code_splitting_options.rs` | Added `entries_aware_merge_threshold` field to `MatchGroup` struct |
| `crates/rolldown_binding/src/options/binding_output_options/binding_manual_code_splitting_options.rs` | Added binding field for TypeScript/JavaScript interop |
| `crates/rolldown_binding/src/utils/normalize_binding_options.rs` | Updated normalization logic to pass through the new field |
| `crates/rolldown/src/stages/generate_stage/manual_code_splitting.rs` | Core implementation: tracking, merging logic, `OrderedSize` wrapper, and helper functions |
| `crates/rolldown/tests/rolldown/function/advanced_chunks/entries_aware_merge_threshold_basic/*` | Test case with three shared modules and expected output snapshots |
## Summary
This PR adds `entriesAwareMergeThreshold` for advanced chunk groups (`manualCodeSplitting.groups[]`).
It is designed specifically for `entriesAware` mode, where we want to keep the correctness and fetch-precision benefits of entry-aware splitting, but avoid exploding into too many tiny chunks.
---
## What `entriesAware` does
When `entriesAware: true` is enabled on a group, matched modules are split by **entry reachability** (entry-bitset).
That means:
- modules reached by exactly the same entry set go to the same subgroup/chunk
- modules with different reachability bits go to different subgroups/chunks
Example:
- `{A,B,C}`-reachable modules -> one subgroup
- `{A,B}`-reachable modules -> another subgroup
- `{A}`-reachable modules -> another subgroup
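The grouping rule above can be sketched in a few lines. This is only an illustration, not rolldown's implementation (which works on entry bitsets in Rust): the `groupByEntrySet` helper, the input shape, and the module names are all hypothetical.

```js
// Hypothetical input shape: module id -> names of the entries that can reach it.
function groupByEntrySet(reachability) {
  const subgroups = new Map();
  for (const [moduleId, entries] of Object.entries(reachability)) {
    // Deduplicate and sort so that {A,B} and {B,A} map to the same subgroup key.
    const key = [...new Set(entries)].sort().join(',');
    if (!subgroups.has(key)) subgroups.set(key, []);
    subgroups.get(key).push(moduleId);
  }
  return subgroups;
}

// Illustrative module names only.
const subgroups = groupByEntrySet({
  'shared-abc.js': ['A', 'B', 'C'],
  'shared-ab.js': ['A', 'B'],
  'shared-ab-2.js': ['B', 'A'],
  'only-a.js': ['A'],
});

for (const [key, modules] of subgroups) {
  console.log(`{${key}} -> ${modules.join(', ')}`);
}
```

Here the two `{A,B}`-reachable modules land in one subgroup, and the `{A,B,C}` and `{A}` modules each get their own.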
### Benefits
- **Most precise loading behavior**: each entry loads only the shared pieces it actually needs.
- **Less over-fetching** compared with coarse “one shared vendor chunk” strategies.
- Works naturally with rolldown’s singleton module model.
### Downsides
- Can generate many small subgroups/chunks, especially in multi-entry apps.
- Too many tiny chunks increase request overhead and scheduling overhead.
- Chunk graph may become noisy/harder to reason about for users.
---
## What `entriesAwareMergeThreshold` does
New per-group option:
```js
manualCodeSplitting: {
  groups: [
    {
      name: 'vendor',
      test: /.../,
      entriesAware: true,
      entriesAwareMergeThreshold: 28000
    }
  ]
}
```
Behavior:
- only applies to `entriesAware` subgroups
- if subgroup `size < entriesAwareMergeThreshold`, it is considered **unqualified**
- unqualified subgroups are **merged** into a better sibling subgroup (same origin group), instead of being left as standalone tiny chunks
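As a rough sketch of the size check described above, assuming each subgroup carries a `size` in bytes (the object shape and the `partitionByThreshold` helper are hypothetical, not part of rolldown's API):

```js
// Splits subgroups into those that stay as-is and those that become merge candidates.
// The comparison is strict: a subgroup exactly at the threshold is still qualified.
function partitionByThreshold(subgroupList, threshold) {
  const qualified = [];
  const unqualified = [];
  for (const subgroup of subgroupList) {
    if (subgroup.size < threshold) {
      unqualified.push(subgroup);
    } else {
      qualified.push(subgroup);
    }
  }
  return { qualified, unqualified };
}

// Illustrative sizes only.
const parts = partitionByThreshold(
  [
    { key: 'A', size: 5000 },
    { key: 'A,B', size: 28000 },
    { key: 'A,B,C', size: 60000 },
  ],
  28000,
);

console.log(parts.unqualified.map((s) => s.key)); // [ 'A' ]
console.log(parts.qualified.map((s) => s.key)); // [ 'A,B', 'A,B,C' ]
```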
---
## How merge chooses a target
For each unqualified subgroup candidate, choose target by:
1. **lowest extra-entry count**: the size of the symmetric difference of the two entry sets (`|candidate \ target| + |target \ candidate|` on entry bits)
2. if tie: **smaller target size**
3. if tie: stable deterministic key order
This is intentional: the merge heuristic is not only about reducing chunk count, but also about keeping **unnecessary extra download** as low as possible after merging.
By preferring targets with fewer extra entries, we try to merge into the closest reachability neighbor so fewer entries pay for modules they do not strictly need.
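The three-step ordering can be illustrated with a small sketch. It assumes subgroups are plain objects with `key`, `entries`, and `size` fields and uses `Set`s instead of bitsets. The names `extraEntryCount`, `compareRank`, and `pickMergeTarget` are made up for this example and do not come from the PR.

```js
// Size of the symmetric difference between two entry sets.
function extraEntryCount(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  let count = 0;
  for (const entry of setA) if (!setB.has(entry)) count++;
  for (const entry of setB) if (!setA.has(entry)) count++;
  return count;
}

// Compares [extraEntries, targetSize, targetKey] tuples in that priority order.
function compareRank(x, y) {
  if (x[0] !== y[0]) return x[0] - y[0];
  if (x[1] !== y[1]) return x[1] - y[1];
  return x[2] < y[2] ? -1 : x[2] > y[2] ? 1 : 0;
}

// Returns the best sibling to merge `candidate` into, or null if there is none.
function pickMergeTarget(candidate, siblings) {
  let best = null;
  let bestRank = null;
  for (const target of siblings) {
    if (target === candidate) continue;
    const rank = [
      extraEntryCount(candidate.entries, target.entries),
      target.size,
      target.key,
    ];
    if (best === null || compareRank(rank, bestRank) < 0) {
      best = target;
      bestRank = rank;
    }
  }
  return best;
}

// Illustrative data: {A} is tiny; {A,B} and {A,C} are both one entry away.
const candidate = { key: 'A', entries: ['A'], size: 2000 };
const siblings = [
  candidate,
  { key: 'A,B', entries: ['A', 'B'], size: 40000 },
  { key: 'A,B,C', entries: ['A', 'B', 'C'], size: 30000 },
  { key: 'A,C', entries: ['A', 'C'], size: 35000 },
];

const target = pickMergeTarget(candidate, siblings);
console.log(target.key); // 'A,C'
```

In this example `{A,B}` and `{A,C}` tie on extra-entry count (1 each), so the smaller of the two, `{A,C}`, wins. `{A,B,C}` is smaller still but loses on the first criterion because it is two entries away.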
---
## Recommended splitting pattern with `maxSize`
`entriesAwareMergeThreshold` and `maxSize` are intended to work together:
1. Use `entriesAwareMergeThreshold` to merge tiny entries-aware subgroups and reduce micro-chunk fragmentation.
2. Use `maxSize` to split any merged result that becomes too large.
So even if merging creates a big group, `maxSize` can re-split it into smaller chunks in the existing splitting phase.
This is the recommended chunk-splitting pattern in practice:
- **merge tiny chunks first** (reduce request overhead)
- **then cap large chunks** (control payload size)
- while still minimizing additional over-fetch / unnecessary download introduced by merge
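A configuration following this pattern might look like the sketch below. The `maxSize` value is an arbitrary placeholder chosen for illustration, not a recommendation from the PR; the other values mirror the examples elsewhere in this description.

```js
// Group options combining both settings; sizes are in bytes.
const manualCodeSplitting = {
  groups: [
    {
      name: 'vendor',
      test: /node_modules/,
      entriesAware: true,
      // Subgroups smaller than this are merged into a sibling subgroup.
      entriesAwareMergeThreshold: 28000,
      // Placeholder cap: merged results larger than this are split again.
      maxSize: 250000,
    },
  ],
};

console.log(manualCodeSplitting.groups[0]);
```

Keeping `maxSize` well above `entriesAwareMergeThreshold` seems sensible. Otherwise the split step could recreate the small chunks that the merge step just removed.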
---
## Relationship with existing size options
- `minSize` and `maxSize` semantics remain unchanged.
- `entriesAwareMergeThreshold` runs as an entries-aware pre-merge step.
- `maxSize` is not a hard constraint in merge target selection.
So this is additive behavior targeted at entries-aware subgroup consolidation.
---
## Why this helps
`entriesAware` gives the best fetch precision but can be too fragmented.
`entriesAwareMergeThreshold` provides a practical middle ground:
- keep entry-aware structure
- reduce tiny subgroup count
- improve chunk graph practicality for real-world apps
In short: **preserve most of entries-aware correctness/precision while reducing micro-chunk overhead**.
## [1.0.0-rc.5] - 2026-02-18
💡 Smarter `entriesAware` Manual Code Splitting
New `entriesAware` and `entriesAwareMergeThreshold` options for `manualCodeSplitting.groups[]` enable
entry-reachability-based chunk splitting with automatic small chunk merging.
- `entriesAware: true` splits matched modules by entry reachability — modules reached by the same set of entries are grouped together, providing the most precise loading behavior with less over-fetching
- Chunks now get more readable names reflecting their entry associations (e.g. `vendor-entry-a-entry-b.js`) instead of opaque hashes
- `entriesAwareMergeThreshold` sets a byte-size threshold to merge tiny subgroups into the closest sibling with the fewest extra entries, reducing micro-chunk fragmentation while preserving precision
- Recommended to use together with `maxSize`: merge tiny chunks first to reduce request overhead, then cap large chunks to control payload size
```js
manualCodeSplitting: {
  groups: [{
    name: 'vendor',
    test: /node_modules/,
    entriesAware: true,
    entriesAwareMergeThreshold: 28000, // bytes
  }]
}
```
### 🚀 Features
- add `Visitor` to `rolldown/utils` (#8373) by @sapphi-red
- module-info: add `inputFormat` property to `ModuleInfo` (#8329) by @shulaoda
- default `treeshake.invalid_import_side_effects` to `false` (#8357) by @sapphi-red
- rolldown_utils: add `IndexBitSet` (#8343) by @sapphi-red
- rolldown_utils: add more methods and trait impls to BitSet (#8342) by @sapphi-red
- rolldown_plugin_vite_build_import_analysis: add support for `await import().then((m) => m.prop)` (#8328) by @sapphi-red
- rolldown_plugin_vite_reporter: support custom logger for build infos (#7652) by @shulaoda
- rust/mcs: support `entriesAwareMergeThreshold` (#8312) by @hyf0
- mcs: `maxSize` will split the oversized chunk with taking file relevance into account (#8277) by @hyf0
- rolldown_plugin_vite_import_glob: support template literal in glob import patterns (#8298) by @shulaoda
- rolldown_plugin_chunk_import_map: output importmap without spaces (#8297) by @sapphi-red
- add INEFFECTIVE_DYNAMIC_IMPORT warning in core (#8284) by @shulaoda
- mcs: generate more readable name for `entriesAware` chunks (#8275) by @hyf0
- mcs: support `entriesAware` (#8274) by @hyf0
### 🐛 Bug Fixes
- improve circular dependency detection in chunk optimizer (#8371) by @IWANABETHATGUY
- align `minify.compress: true` and `minify.mangle: true` with `minify: true` (#8367) by @sapphi-red
- rolldown_plugin_esm_external_require: apply conversion to UMD and IIFE outputs (#8359) by @sapphi-red
- cjs: bailout treeshaking on cjs modules that have multiple re-exports (#8348) by @hyf0
- handle member expression and this expression in JSX element name rewriting (#8323) by @IWANABETHATGUY
- pad `encode_hash_with_base` output to fixed length to prevent slice panics (#8320) by @shulaoda
- `xxhash_with_base` skips hashing when input is exactly 16 bytes (#8319) by @shulaoda
- complete `ImportKind::try_from` with missing variants and correct `url-import` to `url-token` (#8310) by @shulaoda
- mark Node.js builtin modules as side-effect-free when resolved via `external` config (#8304) by @IWANABETHATGUY
- mcs: `maxSize` should split chunks correctly based on sizes (#8289) by @hyf0
### 🚜 Refactor
- introduce `RawMangleOptions` and `RawCompressOptions` (#8366) by @sapphi-red
- mcs: refactor `apply_manual_code_splitting` into `ManualSplitter` (#8346) by @hyf0
- rolldown_plugin_vite_reporter: simplify hook registration and remove redundant state (#8322) by @shulaoda
- use set to store user defined entry modules (#8315) by @IWANABETHATGUY
- rust/mcs: collect groups into map at first for having clean and performant operations (#8313) by @hyf0
- mcs: introduce newtype `ModuleGroupOrigin` and `ModuleGroupId` (#8311) by @hyf0
- remove unnecessary `FinalizerMutableState` struct (#8303) by @shulaoda
- move module finalization into `finalize_modules` (#8302) by @shulaoda
- extract `apply_transfer_parts_mutation` into its own module (#8301) by @shulaoda
- move ESM format check into `determine_export_mode` (#8294) by @shulaoda
- remove `warnings` field from `GenerateContext` (#8293) by @shulaoda
- extract util function remove clippy supression (#8290) by @IWANABETHATGUY
- move `is_in_node_modules` to `PathExt` trait in `rolldown_std_utils` (#8286) by @shulaoda
- rolldown_plugin_vite_reporter: remove unnecessary ineffective dynamic import detection logic (#8285) by @shulaoda
- dev: inject hmr runtime to `\0rolldown/runtime.js` (#8234) by @hyf0
- improve naming in chunk_optimizer (#8287) by @IWANABETHATGUY
- simplify PostChunkOptimizationOperation from bitflags to enum (#8283) by @IWANABETHATGUY
- optimize BitSet.index_of_one to return iterator instead of Vec (#8282) by @IWANABETHATGUY
### 📚 Documentation
- change default value in `format` JSDoc from `'esm'` to `'es'` (#8372) by @shulaoda
- in-depth: remove `invalidImportSideEffects` option mention from lazy barrel optimization doc (#8355) by @sapphi-red
- mcs: clarify `minSize` constraints (#8279) by @ShroXd
### ⚡ Performance
- use IndexVec for chunk TLA detection (#8341) by @sapphi-red
- only invoke single resolve call for the same specifier and import kind (#8332) by @sapphi-red
- rolldown_plugin_vite_reporter: skip gzip computation when `report_compressed_size` is disabled (#8321) by @shulaoda
### 🧪 Testing
- use `vi.waitFor` and `expect.poll` instead of custom `waitUtil` function (#8369) by @sapphi-red
- rolldown_plugin_esm_external_require_plugin: add tests (#8358) by @sapphi-red
- add watch file tests (#8330) by @sapphi-red
- rolldown_plugin_vite_build_import_analysis: add test for dynamic import treeshaking (#8327) by @sapphi-red
### ⚙️ Miscellaneous Tasks
- prepare-release: skip workflow on forked repositories (#8368) by @shulaoda
- format more files (#8360) by @sapphi-red
- deps: update oxc to v0.114.0 (#8347) by @camc314
- deps: update test262 submodule for tests (#8354) by @sapphi-red
- deps: update crate-ci/typos action to v1.43.5 (#8350) by @renovate[bot]
- deps: update oxc apps (#8351) by @renovate[bot]
- rolldown_plugin_vite_reporter: remove unnecessary README.md (#8334) by @shulaoda
- deps: update npm packages (#8338) by @renovate[bot]
- deps: update rust crates (#8339) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.13.0 (#8337) by @renovate[bot]
- deps: update github-actions (#8336) by @renovate[bot]
- deps: update napi to v3.8.3 (#8331) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.12.2 (#8325) by @renovate[bot]
- remove unnecessary transform.decorator (#8314) by @IWANABETHATGUY
- deps: update dependency rust to v1.93.1 (#8305) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.12.1 (#8300) by @renovate[bot]
- deps: update oxc apps (#8296) by @renovate[bot]
- docs: don't skip for build runs without cache (#8281) by @sapphi-red
