perf(rolldown): re-implement renamer to avoid renaming non-root symbols#4014
perf(rolldown): re-implement renamer to avoid renaming non-root symbols#4014
renamer to avoid renaming non-root symbols#4014Conversation
Benchmarks Rust |
63d6ab3 to
e0c4471
Compare
f667596 to
0fcfb22
Compare
cda9a39 to
2bb9e72
Compare
✅ Deploy Preview for rolldown-rs canceled.
|
d7ac6f5 to
6ae4e9e
Compare
a0ca66a to
63c2e95
Compare
…wn_simplify_renamer_logic
There was a problem hiding this comment.
Copilot reviewed 111 out of 125 changed files in this pull request and generated no comments.
Files not reviewed (14)
- crates/rolldown/tests/esbuild/dce/cross_module_constant_folding_number/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/dce/cross_module_constant_folding_string/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/argument_default_value_scope_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/arguments_special_case_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/arrow_fn_scope/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/comment_preservation_preserve_jsx/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/direct_eval_tainting_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/exports_and_module_format_common_js/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/forbid_string_import_names_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/import_forms_with_minify_identifiers_and_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/import_forms_with_no_bundle/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/import_re_export_es6_issue149/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/inject/artifacts.snap: Language not supported
- crates/rolldown/tests/esbuild/default/inject_jsx_dot_names/artifacts.snap: Language not supported
renamer to avoid renaming non-root symbols
…wn_simplify_renamer_logic
crates/rolldown/src/utils/renamer.rs
Outdated
| .module_used_names | ||
| .entry(canonical_ref.owner) | ||
| .or_insert_with(|| Self::get_module_used_names(self.symbol_db, canonical_ref)) | ||
| .contains(candidate_name.as_str()) |
There was a problem hiding this comment.
How about merge the manual_reserved /used_names/entry_module_used_names/module_used_names into one used_names. It could reduce many hash.
There was a problem hiding this comment.
manual_reserved and entry_module_used_names has removed.
Unfortunately, used_names and module_used_names can't be merged into one due to their different uses.
used_names is a FxHashSet<Rstr> that is used to store names that are used for renaming.
module_used_names is a FxHashSet<&name Rstr> that stores the module used names, and is exactly used in what I said in #4014 (comment).
Can you revert this change? for a bundler responsibility, we should preserve the original definition as much as possible, before |
Good point! Yes, we can. The main reason I did this was to fix https://github.com/rollup/rollup/blob/64870c31062983a741832d8cd2d4bcdf55e4de55/test/function/samples/class-name-conflict-2 test fails. I found this case fails due to the |
Fixing one bug should not introduce another bug, or use a workaround, Let's revert it first, and see if we could solve it in other way. |
Yes, fixed in fc6683f. When |
ccd1bde to
3f59a09
Compare
It no longer happens in the latest commit. |
|
I am hesitant to merge because, according to the snapshot, some of them are improved, others regressed, also lots of the internal code has been changed. I will try to fix the bug based on main branch, If I can't find a better way then I will merge it. |
That's ok! But aside from fixing the bug, there is a 1%-4% performance improvement in this PR. I think this is also worth considering! For the output regression, I know the reason why it's happening, but to fix it, I need to dig deep into Rolldown, which will take forever. If you want to fix it, I’d be happy to pair with you! |
|
It seems like we are in the right direction, but need to be cautious about the changes. Are we able to build this with rolldown-vite and test in ecosystem CI? |
This is the first step towards removing the expensive `rename_non_root_symbol` method. Changes include: 1. Add new fields to Renamer struct: - `entry_module_idx`: Track the entry module index - `entry_module`: Reference to entry module's symbol database - `module_used_names`: Cache of non-root symbol names per module - `used_names`: Track names used during renaming 2. Update `Renamer::new()` to accept `base_module_index` parameter and pre-cache entry module symbol names 3. Add helper methods for the new algorithm: - `is_entry_root_binding()`: Check if name is entry module's root binding - `is_name_available()`: Check if name doesn't conflict with existing names - `get_or_create_module_used_names()`: Lazily compute non-root symbol names These changes prepare for eliminating the expensive nested scope iteration in `rename_non_root_symbol` by making `add_symbol_in_root_scope` smarter about which names to avoid. Part of reworking PR rolldown#4014 for ~4% performance improvement. Co-Authored-By: Claude Opus 4.5 <[email protected]>
This is the first step towards removing the expensive `rename_non_root_symbol` method. Changes include: 1. Add new fields to Renamer struct: - `entry_module_idx`: Track the entry module index - `entry_module`: Reference to entry module's symbol database - `module_used_names`: Cache of non-root symbol names per module - `used_names`: Track names used during renaming 2. Update `Renamer::new()` to accept `base_module_index` parameter and pre-cache entry module symbol names 3. Add helper methods for the new algorithm: - `is_entry_root_binding()`: Check if name is entry module's root binding - `is_name_available()`: Check if name doesn't conflict with existing names - `get_or_create_module_used_names()`: Lazily compute non-root symbol names These changes prepare for eliminating the expensive nested scope iteration in `rename_non_root_symbol` by making `add_symbol_in_root_scope` smarter about which names to avoid. Part of reworking PR rolldown#4014 for ~4% performance improvement. Co-Authored-By: Claude Opus 4.5 <[email protected]>
This is the first step towards removing the expensive `rename_non_root_symbol` method. Changes include: 1. Add new fields to Renamer struct: - `entry_module_idx`: Track the entry module index - `entry_module`: Reference to entry module's symbol database - `module_used_names`: Cache of non-root symbol names per module - `used_names`: Track names used during renaming 2. Update `Renamer::new()` to accept `base_module_index` parameter and pre-cache entry module symbol names 3. Add helper methods for the new algorithm: - `is_entry_root_binding()`: Check if name is entry module's root binding - `is_name_available()`: Check if name doesn't conflict with existing names - `get_or_create_module_used_names()`: Lazily compute non-root symbol names These changes prepare for eliminating the expensive nested scope iteration in `rename_non_root_symbol` by making `add_symbol_in_root_scope` smarter about which names to avoid. Part of reworking PR rolldown#4014 for ~4% performance improvement. Co-Authored-By: Claude Opus 4.5 <[email protected]>
…ot_symbol` pass (#7867) ## Summary This PR optimizes the symbol renamer by eliminating the expensive `rename_non_root_symbol` pass and fixes a bug that could generate duplicate declarations. ### Background This is a rework of #4014, split into atomic commits for easier review. The original PR was rejected due to too many changes at once. Issue #6586 has since been fixed by #7859, but the performance optimization approach from #4014 still provides significant gains. ### Key Changes **1. Eliminate `rename_non_root_symbol` pass** - During root scope renaming, check against nested scopes upfront via `is_name_available()` - Most nested scope symbols keep their original names - Only handle edge cases (cross-module shadowing, CJS params) in lightweight `register_nested_scope_symbols()` **2. Skip storing original-named nested symbols** - Add `canonical_name_for_or_original()` method that falls back to original name - Only store symbols in `canonical_names` when actually renamed - Reduces HashMap insertions for ~80-90% of nested symbols **3. Fix duplicate declaration bug** - When renaming nested symbols (e.g., `a` → `a$1`), check if the candidate name already exists in the same module - Prevents generating invalid code like `function test(a$1, a$1)` when both `a` and `a$1` exist as parameters ### How It Works **Old algorithm:** 1. Rename all root scope symbols 2. Iterate through ALL nested scope symbols to check for conflicts 3. Rename nested symbols that conflict with renamed root symbols **New algorithm:** 1. During root scope renaming, pick names that don't conflict with nested scopes upfront 2. Most nested scope symbols keep original names (no storage needed) 3. Only rename nested symbols that shadow cross-module or renamed top-level symbols 4. Check for duplicate declarations when generating renamed names ```text ┌─────────────────────────────────────────────────────────────────┐ │ Phase 1: Root Scope Renaming │ ├─────────────────────────────────────────────────────────────────┤ │ for each top-level symbol: │ │ → Check: used_canonical_names (top-level conflicts) │ │ → Check: entry module nested scopes (avoid capture) │ │ → Check: own module nested scopes (avoid capture) │ │ → Pick name that avoids ALL conflicts upfront │ └─────────────────────────────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────────────────────────────┐ │ Phase 2: Nested Scope Renaming (lightweight) │ ├─────────────────────────────────────────────────────────────────┤ │ for each nested symbol: │ │ ├─ shadows cross-module top-level? ──→ rename │ │ ├─ shadows renamed top-level? ──→ rename │ │ ├─ is CJS `exports`/`module`? ──→ rename │ │ └─ otherwise ──→ keep original (no storage needed) │ │ │ │ When renaming, check candidate doesn't exist in same module │ │ to prevent duplicate declarations: │ │ │ │ Input: function test(a, a$1) { ... } │ │ where `a` shadows top-level from another module │ │ │ │ Rename `a`: │ │ try `a$1` → exists in module scope → skip │ │ try `a$2` → available → use it ✓ │ │ │ │ Result: function test(a$2, a$1) { ... } │ └─────────────────────────────────────────────────────────────────┘ ``` ### Test Case Added `nested_scope_rename_counter` test to verify the duplicate declaration fix: ```js // other.js export const a = 'from-other'; // module_with_nested.js - has both `a` and `a$1` as parameters export function test(a, a$1) { return [a, a$1]; } // main.js import { a } from './other.js'; import { test } from './module_with_nested.js'; ``` **Before fix**: `function test(a$1, a$1)` - duplicate declarations! **After fix**: `function test(a$2, a$1)` - correctly skips `a$1` since it already exists ### Snapshot Changes The optimization produces slightly different (but equally correct) variable names: - Avoids ugly `$1$1` double-suffix patterns - Names are chosen to avoid nested scope conflicts upfront --------- Co-authored-by: Claude Opus 4.5 <[email protected]> Co-authored-by: Yunfei He <[email protected]>
Description
close: #4045
This PR is to re-implement the renamer logic to get rid of the expensive
rename_non_root_symbolmethod, and can also address the #4045 problem.Whether from a new output or performance perspective, I think this change is the correct direction and worth considering.
How does it work?
This implementation follows a principle that does NOT rename to a name that has been declared, even declared in child scopes. Plus, we regard the entry module as a base module because bundling will bundle every referenced code into the entry module. In other words, we ensure that all entry module names are maintained as-is.
Let's take an example to see how and why:
Before output:
After output:
Let's take a look at the before output first, as you can see, after bundling, the
aina.jshas been renamed toa$1, which conflicts withfunc's parama$1, that lead parama$1has to rename toa$1$1to avoid conflict. And let's look at the after output, theaina.jshas been renamed toa$2becausea$1has been declared in child scopes, as I mentioned principle earlier, so that parama$1doesn't need to be renamed because no conflict here. In this way, we no longer need to call an expensiverename_non_root_symbol.Another key change is that if the
NamedImporthas an alias name, then use it as its new nameBefore output:
After output:
Let's compare the before and the after output. Unlike the before implementation, we follow the principle we've mentioned earlier, we won't rename entry module names. So there is no change in the entry file. Let's see the changed part, the
yiny.jshas been renamed toxbecausexis the alias name ofythey use, so that we don't need to calculate the name for it, just use the alias name!Pefromance
https://codspeed.io/rolldown/rolldown/branches/04-03-refactor_rolldown_simplify_renamer_logic

I guess the main performance improvement comes in which we removed
rename_non_root_symbols.Further improvement
After this PR, the assumption that
canonical_nameswill store the names of all symbols no longer exists. Ifcanonical_namesdoesn't contain that symbol, that means this symbol doesn't need to be renamed. We should return early to avoid unnecessary steps. In this PR, I avoided introducing too many changes, so I changed canonical_name_for method to return the symbol name if it doesn't exist incanonical_names. I guess if we refactor this, that would be a notable performance improvement.