fix(minifier): re-anchor leading comments when stripping a ParenthesizedExpression alternate#22710
Closed
Dunqing wants to merge 2 commits into
Closed
fix(minifier): re-anchor leading comments when stripping a ParenthesizedExpression alternate#22710Dunqing wants to merge 2 commits into
Dunqing wants to merge 2 commits into
Conversation
…zedExpression alternate
A single DCE pass produced output where a second pass *dropped* the
explanatory comment on a conditional-expression alternate. Verified
reproduction (one of 8 in monitor-oxc's npm corpus):
export const x = foo ? bar : (
// explanatory comment
(a, b) => a + b
);
// First DCE pass — comment preserved between `:` and the arrow:
export const x = foo ? bar :
// explanatory comment
((a, b) => a + b);
// Second DCE pass on the first pass's output — comment dropped:
export const x = foo ? bar : ((a, b) => a + b);
## Root cause
The parser anchors a leading comment at the *next token's start*
(`crates/oxc_parser/src/lexer/trivia_builder.rs`). Between the two
parses the next token after the comment shifts:
Pass 1: `: ( /*c*/ (a, b) => …` → `attached_to` = inner `(` of arrow params
Pass 2: `: \n/*c*/\n ((a, b) => …` → `attached_to` = outer `(` of parens
`Normalize::exit_expression` unconditionally strips the
`ParenthesizedExpression` (codegen re-emits the wrapping parens via the
inner arrow's `pife: true` flag, so the strip is normally invisible).
After the pass-2 strip the comment is keyed at a position that no AST
node starts at; codegen's `ConditionalExpression` alternate uses an
exact-match `get_comments(alternate.span.start)` lookup and finds
nothing.
## Fix
When `Normalize` strips a `ParenthesizedExpression` whose `(` anchors a
*normal* (non-jsdoc / non-annotation / non-legal) comment *and* the paren
sits in a `ConditionalExpressionAlternate` slot, record a (paren_start →
inner_start) re-anchoring in the visitor and apply it to
`program.comments` in `build`. The remap resolves transitively so
nested `((x))` strips both parens.
The scope is intentionally narrow:
- **Normal comments only.** JSDoc type casts (`/** @type {*} */ (x)`)
and annotation comments (`/* @__PURE__ */`) are tied to the paren
position by tooling that processes them; legal comments
(`@license` / `@preserve`) have their own statement-level emission
pipeline. Re-anchoring those silently shifts verbatim text from a
statement boundary into a call's callee, breaking other idempotency
tests.
- **`ConditionalExpressionAlternate` parent only.** When the paren is
a direct child of `ExpressionStatement` / `VariableDeclarator` /
similar, the statement-level `print_comments_at(span.start)` already
pulls the comment via the parent's own start (which coincides with
the paren start), so re-anchoring there would *lose* the comment.
Verified failure mode on top-of-file legal-style line comments
before IIFEs in d3-min.js. The conditional-expression alternate is
the one expression-context site with its own `get_comments` lookup.
## Impact
End-to-end via monitor-oxc DCE runner on the 116,429-file npm-top-3000
corpus: 9 → 1 idempotency failures. The remaining 1 is an unrelated
unused-variable fixed-point bug (`var __webpack_exports__ = {};` in
`@jest/globals/build/index.js`).
`cargo test -p oxc_minifier` → 505 passed (was 504; +1 new test), 0 failed.
`just minsize` → snapshot unchanged across all 12 fixtures.
`just allocs` → small bump from one `FxHashSet` allocation per
`Normalize` run plus an `O(comments)` scan.
Tests:
- `dead_code_elimination::comment_preserved_when_paren_around_arrow_alternate_is_stripped`
Merging this PR will not alter performance
Comparing Footnotes
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A single DCE pass produced output where a second pass dropped the explanatory comment on a conditional-expression alternate. Verified reproduction (one of 8 in monitor-oxc's npm corpus):
Root cause
The parser anchors a leading comment at the next token's start (
crates/oxc_parser/src/lexer/trivia_builder.rs). Between the two parses the next token after the comment shifts:comment.attached_to: ( /*c*/ (a, b) => …(of arrow paramsArrowFn.span.start: \n/*c*/\n ((a, b) => …(of parensParenthesizedExpression.span.startNormalize::exit_expressionunconditionally strips theParenthesizedExpression(codegen re-emits the wrapping parens via the inner arrow'spife: trueflag, so the strip is normally invisible). After the pass-2 strip the comment is keyed at a position that no AST node starts at; codegen'sConditionalExpressionalternate uses an exact-matchget_comments(alternate.span.start)lookup and finds nothing.Fix
When
Normalizestrips aParenthesizedExpressionwhose(anchors a normal (non-jsdoc / non-annotation / non-legal) comment and the paren sits in aConditionalExpressionAlternateslot, record a (paren_start → inner_start) re-anchoring in the visitor and apply it toprogram.commentsinbuild. The remap resolves transitively so nested((x))strips both parens.The scope is intentionally narrow:
Normal comments only. JSDoc type casts (
/** @type {*} */ (x)) and annotation comments (/* @__PURE__ */) are tied to the paren position by tooling that processes them; legal comments (@license/@preserve) have their own statement-level emission pipeline. Re-anchoring those silently shifts verbatim text from a statement boundary into a call's callee, breaking other idempotency tests.ConditionalExpressionAlternateparent only. When the paren is a direct child ofExpressionStatement/VariableDeclarator/ similar, the statement-levelprint_comments_at(span.start)already pulls the comment via the parent's own start (which coincides with the paren start), so re-anchoring there would lose the comment. Verified failure mode on top-of-file legal-style line comments before IIFEs in d3-min.js. The conditional-expression alternate is the one expression-context site with its ownget_commentslookup.Impact
End-to-end via monitor-oxc DCE runner on the 116,429-file npm-top-3000 corpus: 9 → 1 idempotency failures. The remaining 1 is an unrelated unused-variable fixed-point bug (
var __webpack_exports__ = {};in@jest/globals/build/index.js).cargo test -p oxc_minifier→ 505 passed (was 504; +1 new test), 0 failedjust minsize→ snapshot unchanged across all 12 fixturesjust allocs→ small bump from oneFxHashSetallocation perNormalizerun plus anO(comments)scanTests:
dead_code_elimination::comment_preserved_when_paren_around_arrow_alternate_is_stripped