feat(rust/mcs): support entriesAwareMergeThreshold#8312

Merged
graphite-app[bot] merged 1 commit into main from 02-13-feat_rust_mcs_support_entriesawaremergethreahold_
Feb 13, 2026

Conversation

@hyf0
Member

@hyf0 hyf0 commented Feb 13, 2026

Summary

This PR adds entriesAwareMergeThreshold for advanced chunk groups (manualCodeSplitting.groups[]).

It targets entriesAware mode, where we want to keep the correctness and fetch precision of entry-aware splitting without fragmenting the output into too many tiny chunks.


What entriesAware does

When entriesAware: true is enabled on a group, matched modules are split by entry reachability, tracked as a bitset of the entries that can reach each module.

That means:

  • modules reached by exactly the same entry set go to the same subgroup/chunk
  • modules with different reachability bits go to different subgroups/chunks

Example:

  • {A,B,C}-reachable modules -> one subgroup
  • {A,B}-reachable modules -> another subgroup
  • {A}-reachable modules -> another subgroup
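
As a rough illustration of the grouping rule above, the following sketch groups modules by their exact entry set. The module names, the `entries` field, and the helper functions are hypothetical and only stand in for rolldown's internal entry bitsets; this is not the actual implementation.

```js
// Hypothetical input: each module lists the entries that can reach it.
const modules = [
  { id: 'react.js', entries: ['A', 'B', 'C'] },
  { id: 'lodash.js', entries: ['A', 'B'] },
  { id: 'dayjs.js', entries: ['A'] },
  { id: 'react-dom.js', entries: ['C', 'B', 'A'] },
];

// Build a canonical key for an entry set so that ordering does not matter.
function entrySetKey(entries) {
  return [...new Set(entries)].sort().join(',');
}

// Group modules whose reachability sets are exactly equal.
function groupByEntrySet(mods) {
  const subgroups = new Map();
  for (const mod of mods) {
    const key = entrySetKey(mod.entries);
    if (!subgroups.has(key)) subgroups.set(key, []);
    subgroups.get(key).push(mod.id);
  }
  return subgroups;
}

const subgroups = groupByEntrySet(modules);
console.log([...subgroups.keys()]);
```

Here `react.js` and `react-dom.js` share the `A,B,C` subgroup, while `lodash.js` and `dayjs.js` each get their own.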

Benefits

  • Most precise loading behavior: each entry loads only the shared pieces it actually needs.
  • Less over-fetching compared with coarse “one shared vendor chunk” strategies.
  • Works naturally with rolldown’s singleton module model.

Downsides

  • Can generate many small subgroups/chunks, especially in multi-entry apps.
  • Too many tiny chunks increase request and scheduling overhead.
  • The chunk graph may become noisy and harder for users to reason about.

What entriesAwareMergeThreshold does

New per-group option:

manualCodeSplitting: {
  groups: [
    {
      name: 'vendor',
      test: /.../,
      entriesAware: true,
      entriesAwareMergeThreshold: 28000 // bytes
    }
  ]
}

Behavior:

  • only applies to entriesAware subgroups
  • a subgroup whose size is below the threshold (size < entriesAwareMergeThreshold) is considered unqualified
  • unqualified subgroups are merged into the best-matching sibling subgroup (from the same origin group) instead of being left as standalone tiny chunks
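
A minimal sketch of the qualification check, assuming each subgroup already carries a precomputed `size` in bytes. The subgroup objects and their sizes are made up for illustration.

```js
const entriesAwareMergeThreshold = 28000; // bytes, as in the config above

// Hypothetical subgroups produced by entriesAware splitting.
const subgroups = [
  { key: 'A,B,C', size: 120000 },
  { key: 'A,B', size: 9000 },
  { key: 'A', size: 28000 },
];

// The comparison is strict: a subgroup exactly at the threshold is kept as-is.
const unqualified = subgroups.filter((g) => g.size < entriesAwareMergeThreshold);
const qualified = subgroups.filter((g) => g.size >= entriesAwareMergeThreshold);

console.log(unqualified.map((g) => g.key));
console.log(qualified.map((g) => g.key));
```

Only the `A,B` subgroup falls below the threshold here, so it is the only merge candidate.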

How merge chooses a target

For each unqualified candidate subgroup, the target is chosen by:

  1. lowest extra-entry count, i.e. the size of the symmetric difference of the entry bits (|candidate-target| + |target-candidate|)
  2. if tie: smaller target size
  3. if tie: stable deterministic key order
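
The three ranking rules can be sketched as a lexicographic comparison. This is an illustration under assumptions, not the Rust implementation: entry sets are modeled as `Set` objects instead of bitsets, and `pickMergeTarget`, `compareRank`, and the sample sizes are hypothetical.

```js
// Number of entries that are in exactly one of the two sets.
function symmetricDifferenceSize(a, b) {
  let count = 0;
  for (const e of a) if (!b.has(e)) count++;
  for (const e of b) if (!a.has(e)) count++;
  return count;
}

// Compare [extraEntries, targetSize, targetKey] tuples lexicographically.
function compareRank(x, y) {
  if (x[0] !== y[0]) return x[0] - y[0];
  if (x[1] !== y[1]) return x[1] - y[1];
  return x[2] < y[2] ? -1 : x[2] > y[2] ? 1 : 0;
}

// Pick the sibling with the lowest rank; returns null if there is no sibling.
function pickMergeTarget(candidate, siblings) {
  let best = null;
  for (const target of siblings) {
    if (target.key === candidate.key) continue;
    const cost = symmetricDifferenceSize(candidate.entries, target.entries);
    const rank = [cost, target.size, target.key];
    if (best === null || compareRank(rank, best.rank) < 0) best = { target, rank };
  }
  return best ? best.target : null;
}

const candidate = { key: 'A,B', entries: new Set(['A', 'B']), size: 9000 };
const siblings = [
  { key: 'A,B,C', entries: new Set(['A', 'B', 'C']), size: 120000 },
  { key: 'A', entries: new Set(['A']), size: 40000 },
  { key: 'C', entries: new Set(['C']), size: 5000 },
];

const chosen = pickMergeTarget(candidate, siblings);
console.log(chosen.key);
```

In this example `A,B,C` and `A` both differ from the candidate by one entry, so the tie is broken by size and the smaller `A` subgroup wins; `C` is ruled out because it differs by three entries.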

This is intentional: the merge heuristic is not only about reducing chunk count, but also about keeping unnecessary extra download as low as possible after merging.

By preferring targets with fewer extra entries, we try to merge into the closest reachability neighbor so fewer entries pay for modules they do not strictly need.


Recommended splitting pattern with maxSize

entriesAwareMergeThreshold and maxSize are intended to work together:

  1. Use entriesAwareMergeThreshold to merge tiny entries-aware subgroups and reduce micro-chunk fragmentation.
  2. Use maxSize to split any merged result that becomes too large.

So even if merging creates a big group, maxSize can re-split it into smaller chunks in the existing splitting phase.

This is the recommended chunk-splitting pattern in practice:

  • merge tiny chunks first (reduce request overhead)
  • then cap large chunks (control payload size)
  • while still minimizing additional over-fetch / unnecessary download introduced by merge
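
A sketch of a group that combines both options, assuming `maxSize` is set on the same group. The `maxSize` value is a hypothetical placeholder, and where this object sits inside the full rolldown config is omitted here.

```js
// Only the manualCodeSplitting part of the config is shown.
const manualCodeSplitting = {
  groups: [
    {
      name: 'vendor',
      test: /node_modules/,
      entriesAware: true,
      // Merge entries-aware subgroups smaller than this many bytes.
      entriesAwareMergeThreshold: 28000,
      // Hypothetical cap: split merged results that grow beyond this size.
      maxSize: 250000,
    },
  ],
};

console.log(manualCodeSplitting.groups[0]);
```

Keeping `maxSize` well above the merge threshold leaves room for merged subgroups to exist without being split again immediately.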

Relationship with existing size options

  • minSize and maxSize semantics remain unchanged.
  • entriesAwareMergeThreshold runs as an entries-aware pre-merge step.
  • maxSize is not a hard constraint during merge target selection: a merged subgroup may exceed it and be split again in the existing splitting phase.

So this is additive behavior targeted at entries-aware subgroup consolidation.


Why this helps

entriesAware gives the best fetch precision but can be too fragmented.

entriesAwareMergeThreshold provides a practical middle ground:

  • keep entry-aware structure
  • reduce tiny subgroup count
  • improve chunk graph practicality for real-world apps

In short: preserve most of entries-aware correctness/precision while reducing micro-chunk overhead.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds support for entriesAwareMergeThreshold option in the manual code splitting feature. This option works in conjunction with entriesAware: true to automatically merge small entry-aware chunk groups into larger neighboring groups based on a size threshold, reducing the number of output chunks while maintaining reasonable chunk sizes.

Changes:

  • Added entriesAwareMergeThreshold option to manual code splitting configuration
  • Implemented merge algorithm that uses a min-heap to process unqualified groups and merges them into best-fit neighboring groups based on symmetric difference of entry bits
  • Added comprehensive test case demonstrating the feature with three shared modules
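
To make the min-heap mention concrete, here is a small sketch of a binary min-heap that yields subgroups in ascending size order. It is not the PR's Rust code: the `MinHeap` class, the smallest-first processing order, and the tie-break on `key` are assumptions made for illustration.

```js
// Minimal binary min-heap ordered by a user-supplied comparator.
class MinHeap {
  constructor(compare) {
    this.items = [];
    this.compare = compare;
  }
  get size() {
    return this.items.length;
  }
  push(item) {
    const items = this.items;
    items.push(item);
    let i = items.length - 1;
    // Sift the new item up until its parent is not larger.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.compare(items[i], items[parent]) >= 0) break;
      [items[i], items[parent]] = [items[parent], items[i]];
      i = parent;
    }
  }
  pop() {
    const items = this.items;
    if (items.length === 0) return undefined;
    const top = items[0];
    const last = items.pop();
    if (items.length > 0) {
      items[0] = last;
      let i = 0;
      // Sift the moved item down until both children are not smaller.
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < items.length && this.compare(items[left], items[smallest]) < 0) smallest = left;
        if (right < items.length && this.compare(items[right], items[smallest]) < 0) smallest = right;
        if (smallest === i) break;
        [items[i], items[smallest]] = [items[smallest], items[i]];
        i = smallest;
      }
    }
    return top;
  }
}

// Order by size, then by key so that the order is deterministic.
const bySizeThenKey = (a, b) =>
  a.size - b.size || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0);

const heap = new MinHeap(bySizeThenKey);
const unqualified = [
  { key: 'A,B', size: 9000 },
  { key: 'A', size: 3000 },
  { key: 'B,C', size: 9000 },
  { key: 'C', size: 500 },
];
for (const group of unqualified) heap.push(group);

const order = [];
while (heap.size > 0) order.push(heap.pop().key);
console.log(order);
```

Processing order in this sketch is `C`, `A`, `A,B`, `B,C`: the two 9000-byte subgroups are ordered by key.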

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `crates/rolldown_testing/_config.schema.json` | Added JSON schema definition for the new entriesAwareMergeThreshold field |
| `crates/rolldown_common/src/inner_bundler_options/types/manual_code_splitting_options.rs` | Added entries_aware_merge_threshold field to MatchGroup struct |
| `crates/rolldown_binding/src/options/binding_output_options/binding_manual_code_splitting_options.rs` | Added binding field for TypeScript/JavaScript interop |
| `crates/rolldown_binding/src/utils/normalize_binding_options.rs` | Updated normalization logic to pass through the new field |
| `crates/rolldown/src/stages/generate_stage/manual_code_splitting.rs` | Core implementation: tracking, merging logic, OrderedSize wrapper, and helper functions |
| `crates/rolldown/tests/rolldown/function/advanced_chunks/entries_aware_merge_threshold_basic/*` | Test case with three shared modules and expected output snapshots |

@github-actions
Contributor

github-actions bot commented Feb 13, 2026

Benchmarks Rust

  • target: main(7f28c3f)
  • pr: 02-13-feat_rust_mcs_support_entriesawaremergethreahold_(ef9d902)
group                                                        pr                                     target
-----                                                        --                                     ------
bundle/bundle@multi-duplicated-top-level-symbol              1.00     69.7±1.89ms        ? ?/sec    1.02     71.4±1.85ms        ? ?/sec
bundle/bundle@multi-duplicated-top-level-symbol-sourcemap    1.00     76.1±1.85ms        ? ?/sec    1.02     77.6±1.81ms        ? ?/sec
bundle/bundle@rome_ts                                        1.00    101.5±2.59ms        ? ?/sec    1.00    101.2±1.76ms        ? ?/sec
bundle/bundle@rome_ts-sourcemap                              1.00    112.8±3.58ms        ? ?/sec    1.00    113.0±2.31ms        ? ?/sec
bundle/bundle@threejs                                        1.00     36.2±1.85ms        ? ?/sec    1.00     36.2±0.84ms        ? ?/sec
bundle/bundle@threejs-sourcemap                              1.00     40.7±0.72ms        ? ?/sec    1.02     41.4±0.91ms        ? ?/sec
bundle/bundle@threejs10x                                     1.00    362.3±4.76ms        ? ?/sec    1.02    371.1±8.03ms        ? ?/sec
bundle/bundle@threejs10x-sourcemap                           1.00    417.3±3.56ms        ? ?/sec    1.02    424.5±4.76ms        ? ?/sec
scan/scan@rome_ts                                            1.00     80.6±1.73ms        ? ?/sec    1.00     80.7±1.74ms        ? ?/sec
scan/scan@threejs                                            1.02     28.3±1.66ms        ? ?/sec    1.00     27.7±0.43ms        ? ?/sec
scan/scan@threejs10x                                         1.00    275.3±4.61ms        ? ?/sec    1.00    274.5±3.99ms        ? ?/sec

@hyf0 hyf0 force-pushed the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch from 0dd6e74 to da81b99 Compare February 13, 2026 05:07
@hyf0 hyf0 force-pushed the 02-12-refactor_mcs_introduce_newtype_modulegrouporigin_and_modulegroupid_ branch from fc81605 to 1e117f0 Compare February 13, 2026 05:07
@hyf0 hyf0 changed the title feat(rust/mcs): support entriesAwareMergeThreahold feat(rust/mcs): support entriesAwareMergeThreshold Feb 13, 2026
@hyf0 hyf0 changed the title feat(rust/mcs): support entriesAwareMergeThreshold feat(rust/mcs): support entriesAwareMergeThreshold Feb 13, 2026
@hyf0 hyf0 force-pushed the 02-12-refactor_mcs_introduce_newtype_modulegrouporigin_and_modulegroupid_ branch from 1e117f0 to c7384ad Compare February 13, 2026 05:33
@hyf0 hyf0 force-pushed the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch 3 times, most recently from 1da7ac7 to f56d7cf Compare February 13, 2026 06:33
@graphite-app graphite-app bot changed the base branch from 02-12-refactor_mcs_introduce_newtype_modulegrouporigin_and_modulegroupid_ to graphite-base/8312 February 13, 2026 06:34
@graphite-app graphite-app bot force-pushed the graphite-base/8312 branch from c7384ad to 7f28c3f Compare February 13, 2026 06:47
@graphite-app graphite-app bot force-pushed the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch from f56d7cf to 02e2f94 Compare February 13, 2026 06:47
@graphite-app graphite-app bot changed the base branch from graphite-base/8312 to main February 13, 2026 06:47
@graphite-app graphite-app bot force-pushed the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch from 02e2f94 to ef9d902 Compare February 13, 2026 06:48
@netlify

netlify bot commented Feb 13, 2026

Deploy Preview for rolldown-rs canceled.

Name Link
🔨 Latest commit a998e81
🔍 Latest deploy log https://app.netlify.com/projects/rolldown-rs/deploys/698ed474e3db6500089a0c5a

@graphite-app
Contributor

graphite-app bot commented Feb 13, 2026

Merge activity

@graphite-app graphite-app bot force-pushed the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch from ef9d902 to a998e81 Compare February 13, 2026 07:36
@graphite-app graphite-app bot merged commit a998e81 into main Feb 13, 2026
34 checks passed
@graphite-app graphite-app bot deleted the 02-13-feat_rust_mcs_support_entriesawaremergethreahold_ branch February 13, 2026 07:46
shulaoda pushed a commit that referenced this pull request Feb 18, 2026
## [1.0.0-rc.5] - 2026-02-18

💡 Smarter `entriesAware` Manual Code Splitting

New `entriesAware` and `entriesAwareMergeThreshold` options for `manualCodeSplitting.groups[]` enable entry-reachability-based chunk splitting with automatic small chunk merging.

- `entriesAware: true` splits matched modules by entry reachability — modules reached by the same set of entries are grouped together, providing the most precise loading behavior with less over-fetching
- Chunks now get more readable names reflecting their entry associations (e.g. `vendor-entry-a-entry-b.js`) instead of opaque hashes
- `entriesAwareMergeThreshold` sets a byte-size threshold to merge tiny subgroups into the closest sibling with the fewest extra entries, reducing micro-chunk fragmentation while preserving precision
- Recommended to use together with `maxSize`: merge tiny chunks first to reduce request overhead, then cap large chunks to control payload size

```js
manualCodeSplitting: {
  groups: [{
    name: 'vendor',
    test: /node_modules/,
    entriesAware: true,
    entriesAwareMergeThreshold: 28000, // bytes
  }]
}
```

### 🚀 Features

- add `Visitor` to `rolldown/utils` (#8373) by @sapphi-red
- module-info: add `inputFormat` property to `ModuleInfo` (#8329) by @shulaoda
- default `treeshake.invalid_import_side_effects` to `false` (#8357) by @sapphi-red
- rolldown_utils: add `IndexBitSet` (#8343) by @sapphi-red
- rolldown_utils: add more methods and trait impls to BitSet (#8342) by @sapphi-red
- rolldown_plugin_vite_build_import_analysis: add support for `await import().then((m) => m.prop)` (#8328) by @sapphi-red
- rolldown_plugin_vite_reporter: support custom logger for build infos (#7652) by @shulaoda
- rust/mcs: support `entriesAwareMergeThreshold` (#8312) by @hyf0
- mcs: `maxSize` will split the oversized chunk with taking file relevance into account (#8277) by @hyf0
- rolldown_plugin_vite_import_glob: support template literal in glob import patterns (#8298) by @shulaoda
- rolldown_plugin_chunk_import_map: output importmap without spaces (#8297) by @sapphi-red
- add INEFFECTIVE_DYNAMIC_IMPORT warning in core (#8284) by @shulaoda
- mcs: generate more readable name for `entriesAware` chunks (#8275) by @hyf0
- mcs: support `entriesAware` (#8274) by @hyf0

### 🐛 Bug Fixes

- improve circular dependency detection in chunk optimizer (#8371) by @IWANABETHATGUY
- align `minify.compress: true` and `minify.mangle: true` with `minify: true` (#8367) by @sapphi-red
- rolldown_plugin_esm_external_require: apply conversion to UMD and IIFE outputs (#8359) by @sapphi-red
- cjs: bailout treeshaking on cjs modules that have multiple re-exports (#8348) by @hyf0
- handle member expression and this expression in JSX element name rewriting (#8323) by @IWANABETHATGUY
- pad `encode_hash_with_base` output to fixed length to prevent slice panics (#8320) by @shulaoda
- `xxhash_with_base` skips hashing when input is exactly 16 bytes (#8319) by @shulaoda
- complete `ImportKind::try_from` with missing variants and correct `url-import` to `url-token` (#8310) by @shulaoda
- mark Node.js builtin modules as side-effect-free when resolved via `external` config (#8304) by @IWANABETHATGUY
- mcs: `maxSize` should split chunks correctly based on sizes (#8289) by @hyf0

### 🚜 Refactor

- introduce `RawMangleOptions` and `RawCompressOptions` (#8366) by @sapphi-red
- mcs: refactor `apply_manual_code_splitting` into `ManualSplitter` (#8346) by @hyf0
- rolldown_plugin_vite_reporter: simplify hook registration and remove redundant state (#8322) by @shulaoda
- use set to store user defined entry modules (#8315) by @IWANABETHATGUY
- rust/mcs: collect groups into map at first for having clean and performant operations (#8313) by @hyf0
- mcs: introduce newtype `ModuleGroupOrigin` and `ModuleGroupId` (#8311) by @hyf0
- remove unnecessary `FinalizerMutableState` struct (#8303) by @shulaoda
- move module finalization into `finalize_modules` (#8302) by @shulaoda
- extract `apply_transfer_parts_mutation` into its own module (#8301) by @shulaoda
- move ESM format check into `determine_export_mode` (#8294) by @shulaoda
- remove `warnings` field from `GenerateContext` (#8293) by @shulaoda
- extract util function and remove clippy suppression (#8290) by @IWANABETHATGUY
- move `is_in_node_modules` to `PathExt` trait in `rolldown_std_utils` (#8286) by @shulaoda
- rolldown_plugin_vite_reporter: remove unnecessary ineffective dynamic import detection logic (#8285) by @shulaoda
- dev: inject hmr runtime to `\0rolldown/runtime.js` (#8234) by @hyf0
- improve naming in chunk_optimizer (#8287) by @IWANABETHATGUY
- simplify PostChunkOptimizationOperation from bitflags to enum (#8283) by @IWANABETHATGUY
- optimize BitSet.index_of_one to return iterator instead of Vec (#8282) by @IWANABETHATGUY

### 📚 Documentation

- change default value in `format` JSDoc from `'esm'` to `'es'` (#8372) by @shulaoda
- in-depth: remove `invalidImportSideEffects` option mention from lazy barrel optimization doc (#8355) by @sapphi-red
- mcs: clarify `minSize` constraints (#8279) by @ShroXd

### ⚡ Performance

- use IndexVec for chunk TLA detection (#8341) by @sapphi-red
- only invoke single resolve call for the same specifier and import kind (#8332) by @sapphi-red
- rolldown_plugin_vite_reporter: skip gzip computation when `report_compressed_size` is disabled (#8321) by @shulaoda

### 🧪 Testing

- use `vi.waitFor` and `expect.poll` instead of custom `waitUtil` function (#8369) by @sapphi-red
- rolldown_plugin_esm_external_require_plugin: add tests (#8358) by @sapphi-red
- add watch file tests (#8330) by @sapphi-red
- rolldown_plugin_vite_build_import_analysis: add test for dynamic import treeshaking (#8327) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- prepare-release: skip workflow on forked repositories (#8368) by @shulaoda
- format more files (#8360) by @sapphi-red
- deps: update oxc to v0.114.0 (#8347) by @camc314
- deps: update test262 submodule for tests (#8354) by @sapphi-red
- deps: update crate-ci/typos action to v1.43.5 (#8350) by @renovate[bot]
- deps: update oxc apps (#8351) by @renovate[bot]
- rolldown_plugin_vite_reporter: remove unnecessary README.md (#8334) by @shulaoda
- deps: update npm packages (#8338) by @renovate[bot]
- deps: update rust crates (#8339) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.13.0 (#8337) by @renovate[bot]
- deps: update github-actions (#8336) by @renovate[bot]
- deps: update napi to v3.8.3 (#8331) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.12.2 (#8325) by @renovate[bot]
- remove unnecessary transform.decorator (#8314) by @IWANABETHATGUY
- deps: update dependency rust to v1.93.1 (#8305) by @renovate[bot]
- deps: update dependency oxlint-tsgolint to v0.12.1 (#8300) by @renovate[bot]
- deps: update oxc apps (#8296) by @renovate[bot]
- docs: don't skip for build runs without cache (#8281) by @sapphi-red