Skip to content

Conversation

@Rich-Harris
Copy link
Contributor

Over the weekend I got a chance to scratch an itch I've had for a while. There are some surprising bugs with the current version of Rollup, such as #901, that are rather difficult to fix with the existing codebase. On top of that, our tree-shaking is too coarse (it only operates on top-level statements/declarations) and too conservative (it sees potential effects where there are none).

This PR intends to lay the foundations for much more sophisticated static analysis, which will allow us to fix those bugs (I just checked – it fixes #901) and eventually do a much better job of tree-shaking libraries like Lodash, which have a lot of statements that look like they have effects but don't.

It's basically a complete rewrite – sorry – so I'll try to give a high-level overview. Previously, each Bundle was composed of several Module objects, each of which had an array of Statement objects – top-level statements and declarations. Now, there's no Statement – instead, we analyse the AST nodes themselves.

In order to do that, the AST nodes are 'enhanced' with various properties and methods. Each node's initialise method is called, causing declarations (variable, function or class) to register themselves with the current scope. (Each node is responsible for initialising its children, which means that for example a 'x' !== 'x' ? a() : b() expression can stop a() being initialised.) Later, we bind each node, such that each reference is linked to the declaration it references.

The final step before rendering is to run each node that has effects. (An 'effect' means that it impacts the outside world in some way – e.g. a = 1 only has an effect if a is exported or used elsewhere, but window.a = 1 always has an effect.) This means walking over the AST and marking statements so that they get included in the final bundle. If we walk over an Identifier that references a declaration, we 'activate' the declaration – so if we encounter foo() and Rollup judges that the foo function has effects (if it's unsure, it errs on the side of caution), both foo() and function foo () {...} will be included.

Key to this is the idea that we can track the values of nodes. Because we know that foo references a function declaration, it's easy to know what code gets run when foo() happens. If instead we had var foo; foo = condition ? a : b, we know that there are two possible values for foo – if either a() or b() would have effects, so does foo(). If foo was a global or something imported from an external module, Rollup would assume it has effects.

This all turns out to be less hairy than you might imagine, though I'll readily confess that, as with most large rewrites, the purity of the original idea isn't quite reflected in the implementation. There are some rough edges, some unnecessarily verbose code, and doubtless some bugs. It isn't quite ready for merge just yet – need to test the performance, try it out with some more real-world code, etc (plus there are two failing tests related to detecting when cyclical dependencies are in the wrong order), but if @rollup/collaborators are on board then I'd like to merge it before too long otherwise it will block all other work.


Next steps

Assuming there are no showstoppers here – and I'm reasonably confident with this general approach – phase 2 of the operation will be to add more value-tracking logic. For example, this...

var string = 'x';
string = string.replace( 'x', 'y' );

...doesn't have an effect, but because Rollup sees the call expression and doesn't know that String.prototype.replace is effect-free, it includes both lines in the output regardless of whether string is used. And in this case...

class Foo {
  a () { ... }
  b () { ... }
}

const x = new Foo();
x.a();

...it should be possible to remove the b method if Rollup understands a) properties and b) classes/prototypes.

"plugin:import/warnings"
],
"parserOptions": {
"ecmaVersion": 6,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be seven?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, will change that

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

What about caching? With buble this way is ok because we have incremental build. But rollup will process all hundreds files. I can try home to compare branches.

@Rich-Harris
Copy link
Contributor Author

What about caching?

Right now, ASTs aren't reused between runs, meaning Acorn has to re-parse each module. That should be fixable, and once it is I'd expect the caching to have the same impact on incremental builds as before.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

We need to update acorn

Message:
    TypeError: Class constructor DestructuringErrors cannot be invoked without 'new' in project\src\app\dataServices\faults.srv.js
    at Parser.pp$3.parseMaybeAssign (project\node_modules\rollup\dist\rollup.js:3157:54)

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

Hm.. acorn need to release... So long.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

Let's enable classes transpilling for now. Do same **** for angular2 because they like decorators which requires reflection polyfill which requires transpilling because of the same "do not call class".

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

And now I got

---
5162 :     Node.prototype.bind.call( this, scope );
5163 :   };
5164 :
5165 :   AssignmentExpression.prototype.hasEffects = function hasEffects ( scope ) {
5166 :     const hasEffects = this.isUsedByBundle() || this.right.hasEffects( scope );
                 ^^^^^^^^^^
hasEffects is already declared (5166:8)`

Is that a bug in node or somthing else?

@Rich-Harris
Copy link
Contributor Author

I don't quite follow the class thing, why do we need do transpile them? They're supported natively in Node 4 upwards.

Is that a bug in node or somthing else?

Hmm, possibly, what Node version are you on? A const that shares a name with the ID of the function expression it's inside shouldn't be an error.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

v6.3.1

We can't invoke classes without new keyword. acorn contains DestructuringError.call() which was removed but not released yet.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

Enabled transpilling only for acorn. It works! Finally. Now two times slower.

@Rich-Harris
Copy link
Contributor Author

Rich-Harris commented Sep 6, 2016

Weird, was working fine for me.

Now two times slower

what is two times slower?

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

As you said ast is not cached yet. This is the result. I'm checking on that big project.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

I think we should fix it before merge to not make rollup worse.

@TrySound
Copy link
Member

TrySound commented Sep 6, 2016

Ah, and I mean incremental build. Not regular.

@Rich-Harris
Copy link
Contributor Author

I think we should fix it before merge to not make rollup worse.

Agree. Done 83ccb97

let shouldSeparate = false;
let separator;

if ( this.scope.isModuleScope && !/forStatement/.test( this.parent.type ) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be ForStatement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, actually it was supposed to be

const forStatement = /^For(?:Of|In)?Statement/;

// ...
`!forStatement.test(...)

good catch! will fix

@eventualbuddha
Copy link
Contributor

This branch seems to work with decaffeinate and its various projects. However, it does fail to treeshake a helper added by babel, the set helper. The simplest case I can come up with is this:

function set() {
  set();
}

That function is preserved despite its only reference being inside itself.

@Rich-Harris
Copy link
Contributor Author

That function is preserved despite its only reference being inside itself.

ah, of course – if a function calls itself Rollup assumes the worse, and it doesn't actually need to. Fix inbound

@eventualbuddha
Copy link
Contributor

I just tried to verify that your fix caused set not to appear, but it still does. I copy-pasted it into its own file and ran this branch on it and got an exception:

// index.js
var set = function set(object, property, value, receiver) {
  var desc = Object.getOwnPropertyDescriptor(object, property);

  if (desc === undefined) {
    var parent = Object.getPrototypeOf(object);

    if (parent !== null) {
      set(parent, property, value, receiver);
    }
  } else if ("value" in desc && desc.writable) {
    desc.value = value;
  } else {
    var setter = desc.set;

    if (setter !== undefined) {
      setter.call(receiver, value);
    }
  }

  return value;
};
$ node ~/lib/rollup/bin/rollup index.js
Cannot use 'in' operator to search for 'length' in value
TypeError: Cannot use 'in' operator to search for 'length' in value
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:34:19)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:36:4)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
    at enhanceNode (/Users/donovan/lib/rollup/src/ast/enhance.js:59:3)
Type rollup --help for help, or visit https://github.com/rollup/rollup/wiki

It turns out it's trying to enhance the value property of this node:

Node {
  type: 'Literal',
  start: 294,
  end: 301,
  value: 'value',
  raw: '"value"' }

@Rich-Harris
Copy link
Contributor Author

It turns out it's trying to enhance the value property of this node:

Ah, I know why that is. It doesn't have a lookup table of child nodes for each node type, instead it builds that table as it encounters each node – so the first time it encounters a VariableDeclaration, it adds VariableDeclaration: [ 'declarations' ] to keys, allowing it to skip property enumeration the next time it sees one.

If the first Literal it encounters had a non-primitive value, it would mistakenly add value to the list of child keys for Literal nodes. Should be easily fixed by adding Literal: [] to https://github.com/rollup/rollup/blob/rewrite/src/ast/keys.js. Not able to do that right now though, am under the cosh.

As to why set is continuing to appear, that's a deeper mystery – will have to look into that later.

@TrySound
Copy link
Member

TrySound commented Sep 8, 2016

Benchmark

[00:34:14] Starting 'app'...
1628ms: --BUILD--
757ms: transform
151ms: ast
218ms: analyse
97ms: phases234
88ms: treeshake
546ms: --GENERATE--
173ms: render modules
368ms: sourceMap
[00:34:16] Finished 'app' after 2.27 s
[00:34:16] Starting 'app:dev'...
[00:34:16] Finished 'app:dev' after 7.53 ms
[00:34:19] Starting 'app'...
637ms: --BUILD--
0ms: transform
26ms: ast
275ms: analyse
84ms: phases234
80ms: treeshake
623ms: --GENERATE--
165ms: render modules
455ms: sourceMap
[00:34:21] Finished 'app' after 1.39 s
[00:34:30] Starting 'app'...
634ms: --BUILD--
0ms: transform
18ms: ast
264ms: analyse
89ms: phases234
104ms: treeshake
695ms: --GENERATE--
300ms: render modules
391ms: sourceMap
[00:34:31] Finished 'app' after 1.43 s
[00:34:43] Starting 'app'...
537ms: --BUILD--
0ms: transform
8ms: ast
191ms: analyse
88ms: phases234
68ms: treeshake
556ms: --GENERATE--
161ms: render modules
392ms: sourceMap
[00:34:44] Finished 'app' after 1.18 s
[00:34:46] Starting 'app'...
529ms: --BUILD--
0ms: transform
24ms: ast
199ms: analyse
81ms: phases234
65ms: treeshake
552ms: --GENERATE--
159ms: render modules
390ms: sourceMap
[00:34:47] Finished 'app' after 1.17 s
[00:34:48] Starting 'app'...
521ms: --BUILD--
0ms: transform
7ms: ast
184ms: analyse
88ms: phases234
54ms: treeshake
555ms: --GENERATE--
156ms: render modules
396ms: sourceMap
[00:34:49] Finished 'app' after 1.17 s
[00:34:50] Starting 'app'...
597ms: --BUILD--
0ms: transform
13ms: ast
228ms: analyse
104ms: phases234
91ms: treeshake
826ms: --GENERATE--
364ms: render modules
400ms: sourceMap
[00:34:52] Finished 'app' after 1.51 s

src/Module.js Outdated
this.statements = this.parse();

this.declarations = blank();
this.type === 'Module'; // TODO only necessary so that Scope knows this should be treated as a function scope... messy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be ===?

Copy link

@snuggs snuggs Sep 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're strong type checking then 👍 @Victorystick. Or maybe i'm missing the context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, yeah – should be this.type = 'Module'. Fixed

@Rich-Harris Rich-Harris merged commit 0c73791 into master Sep 10, 2016
@Rich-Harris
Copy link
Contributor Author

Have released this as 0.35 so that we have a solid base from which to start optimising and addressing any lingering bugs.

@eventualbuddha

I just tried to verify that your fix caused set not to appear, but it still does

I think I have an understanding of why this is happening now. It's because the function expression references both itself and a function with an unknown value (MemberExpression nodes don't currently know what their possible values are – that's something for the near future), so when in the final stage of tree-shaking Rollup asks all the expressions that could potentially have relevant effects (AssignmentExpression, UpdateExpression and CallExpression nodes) whether they do, it gets confused.

The fix I envisage is something I was considering separately as a performance optimisation – only enhancing and initialising nodes as necessary. Ideally, as far as Rollup is concerned, in the final tree-shaking stage that declaration should look like this:

var set = function set ( /* who cares */ ) {
  /* who cares */
};

Unless set is used by something else there's really no need to traverse that deep.

@TrySound

Thanks, that's useful. How are you running those numbers? Wonder if it's something we should have in the library all the time (perhaps behind if ( DEBUG ) ..., if it has a noticeable impact on the thing its measuring...). Seems like sourcemap generation is low hanging fruit, perhaps we can cache some of that stuff more effectively.

@TrySound
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrectly removing code

7 participants