
Code Splitting #1841

Merged
lukastaegert merged 13 commits into release-0.55.0 from code-splitting-rebased on Jan 21, 2018

Conversation

@guybedford
Contributor

@guybedford guybedford commented Jan 5, 2018

Finally got the first draft of this feature working (#372). This is feature complete, passing tests, and ready for review. It deals with the edge cases comprehensively.

This is written as a PR against the dynamic import inlining PR, so I'll update this as that gets reviewed in as well (let's get that one going first).

The API for this is to take input as an array of entry points:

const bundle = await rollup.rollup({
  input: ['main1.js', 'main2.js', 'feature.js'],
  experimentalCodeSplitting: true,
  experimentalDynamicImport: true
});

Then the bundle returned can be written with a dir argument instead of a file argument:

bundle.write({
  dir: 'dist',
  format: 'cjs'
});

Code splitting supports the output formats cjs, es and amd, but not iife or umd, since those formats don't provide a good way to express inter-chunk dependencies.

The above would output the files dist/main1.js, dist/main2.js and dist/feature.js, with no modules duplicated between them. Instead, where modules are shared, a separate chunk is created with a hash suffix and both outputs load from that. The algorithm is basically a graph colouring: every entry point is given a unique colour (a hash buffer), and we then iterate all modules reachable from that entry point, colouring in the graph. As we iterate the other entry points the colours mix, and at the end, the chunks are the groups of modules that ended up with the same colour.
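A minimal sketch of that colouring idea (hypothetical data shapes; the actual implementation mixes hash buffers via XOR rather than tracking sets of entry names):

```javascript
// Sketch of chunking-by-colouring. Each entry point is its own colour;
// every module reachable from an entry mixes that colour in. Modules
// that end up with the same colour combination form one chunk.
function colourGraph(modules, entryIds) {
  const colours = new Map(); // moduleId -> Set of entry ids ("colours")
  for (const entryId of entryIds) {
    const stack = [entryId];
    const seen = new Set();
    while (stack.length) {
      const id = stack.pop();
      if (seen.has(id)) continue;
      seen.add(id);
      if (!colours.has(id)) colours.set(id, new Set());
      colours.get(id).add(entryId);
      for (const dep of modules[id] || []) stack.push(dep);
    }
  }
  // Group modules by their colour combination -> chunks
  const chunks = new Map();
  for (const [id, colourSet] of colours) {
    const key = [...colourSet].sort().join('+');
    if (!chunks.has(key)) chunks.set(key, []);
    chunks.get(key).push(id);
  }
  return chunks;
}
```

With main1.js and main2.js both importing shared.js, this yields three chunks: one per entry plus a shared chunk containing only shared.js.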

Detected dynamic imports are treated as "discovered" entry points in this process, thereby sharing the chunking.

There is no registry used in any of this - we simply wire up the import and export bindings, carefully managing execution order consistency, and then adapt the finalisers and deshadowing process to deal with the new deshadowing problems between these boundaries.

The wiring gets complex quickly on these cases, so I had to make some architectural changes to the deshadowing and finaliser process, basically:

  • getName () is now simply return this.safeName || this.name on all Variable classes.
  • Bundle now handles working out the module boundaries, which it then passes to the finalisers through getModuleDeclarations(). This entirely abstracts the finalisers away from having to think about things like AST nodes, variables, externalModules etc - they just follow the bundle data returned (a separation-of-concerns win hopefully, although sorry for hacking up the finalisers so much!)
  • The deshadowing function now runs on render, as the same variable can have a different deshadowing depending on which bundle it is in. A nice result of this is that unnecessary deshadowing test cases from normal builds could be simplified.
  • The deshadowing function is now called setRenderIdentifiers and includes the previous getName logic, i.e. setting an identifier safeName to exports.exportName where necessary and so on. We could even consider changing safeName to bundleName, outputName or renderName to make this clearer.

@guybedford guybedford force-pushed the code-splitting-rebased branch from 807410d to 401f259 Compare January 6, 2018 12:50
@guybedford
Contributor Author

I've finished padding out the tests now, including full dynamic import chunking cases for CJS and AMD output.

@guybedford guybedford force-pushed the code-splitting-rebased branch from 7621017 to 92afe1b Compare January 6, 2018 21:18
@guybedford guybedford force-pushed the dynamic-import-inlining branch from ad69de0 to 5626db6 Compare January 8, 2018 11:35
@guybedford guybedford force-pushed the code-splitting-rebased branch 3 times, most recently from 4717559 to df6dec2 Compare January 8, 2018 20:27
@jthoms1

jthoms1 commented Jan 8, 2018

We would be very interested in using this within StencilJS. I have been working on a way to accomplish this using a custom tool to identify bundling use cases based on https://github.com/samthor/srcgraph. It appears that this PR will mitigate many of the issues I have had.

@guybedford guybedford force-pushed the dynamic-import-inlining branch from 1e03702 to afe6288 Compare January 9, 2018 09:24
@guybedford guybedford force-pushed the code-splitting-rebased branch 2 times, most recently from ed46590 to abad2be Compare January 9, 2018 10:56
@TheLarkInn

This is killer work @guybedford. If you need any help with approaching use cases that we've already covered let me know if we can lend a hand.

@Rich-Harris
Contributor

I won't pretend to have fully read and understood this, #1816 and #1840, but I've been giving it a little poke tonight and it really is something. Bloody amazing work!

@Munter

Munter commented Jan 10, 2018

Is it possible to add a sort of manifest file to the output? This would let consumers implement preloading in browsers, or have it converted to HTTP headers or H2 push.

@guybedford
Contributor Author

guybedford commented Jan 10, 2018

@Munter the chunks property provides this information:

bundle.chunks: {
  [chunkName: string]: {
    name: string, // same as chunkName above
    imports: string[],
    exports: string[],
    modules: string[]
  }
}

The imports correspond to either external modules or other chunkName values, which would allow extracting the dependency graph for preloading.
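For example, assuming the chunks shape above, a consumer could compute the transitive set of chunks to preload for a given entry along these lines (a sketch; preloadList is a hypothetical helper, not part of the API):

```javascript
// Walk a chunk's imports transitively to collect everything that should
// be preloaded alongside it. Imports that are not chunk names (i.e.
// external modules) are simply skipped.
function preloadList(chunks, chunkName, seen = new Set()) {
  if (seen.has(chunkName) || !chunks[chunkName]) return seen;
  seen.add(chunkName);
  for (const dep of chunks[chunkName].imports) {
    preloadList(chunks, dep, seen);
  }
  return seen;
}
```

The resulting set could then be turned into `<link rel="preload">` tags or Link headers for H2 push.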

return Promise.all(promises).then(() => {
  return mapSequence(
    graph.plugins.filter(plugin => plugin.onwrite),
    (plugin: Plugin) =>
      Promise.resolve(plugin.onwrite(assign({ bundle: chunk }, outputOptions), chunk))

Should onwrite also receive the chunkName as a parameter somewhere? Previously in "single bundle mode", onwrite could figure out what files were written using the outputOptions, but now with chunks, I think it needs the chunkName to be able to re-construct the written file paths.

Contributor Author

The chunk object above actually does include the name as well, so I believe that would be passed to the plugins ok?


Ah okay, must've misread it. Sounds good!

@guybedford guybedford force-pushed the dynamic-import-inlining branch from afe6288 to ba10f04 Compare January 11, 2018 12:32
@guybedford guybedford force-pushed the code-splitting-rebased branch from abad2be to e7c2c80 Compare January 11, 2018 12:39
});
}

if (!codeSplitting) return graph.buildSingle(inputOptions.input)

When the codeSplitting flag is true, the TS types returned by generate will be different. Currently TS believes that the same types are returned whether this is set or not. I am not exactly sure of the right approach to resolve this, other than making the OutputBundle's generate result a union type.

Contributor Author

There are basically two overloads of the generic rollup function from a typing perspective:

interface SingleInputOptions extends InputOptions {
  input: string;
}
interface MultipleInputOptions extends InputOptions {
  input: string[];
}
function rollup(options: SingleInputOptions): Promise<BundleSingle>;
function rollup(options: MultipleInputOptions): Promise<BundleChunks>;


Cool this would work as well.

@guybedford guybedford force-pushed the code-splitting-rebased branch from e7c2c80 to 605b12c Compare January 11, 2018 17:46
@guybedford guybedford force-pushed the code-splitting-rebased branch from 605b12c to f76db6b Compare January 12, 2018 11:14
@guybedford guybedford changed the base branch from dynamic-import-inlining to master January 12, 2018 11:14
@jthoms1

jthoms1 commented Jan 12, 2018

I am using this branch in a local version of StencilJS to build out Ionic. In my testing so far I have not run into any issues. The chunking logic works exactly as expected.

Member

@lukastaegert lukastaegert left a comment

This is just the first part of the review where I checked functionality and looked at all the tests. There are already some really impressive things you can do here!

I find the behaviour of the entrypoint-facade test a little confusing - why is the default export moved from main2 to the shared chunk? But the result is certainly correct.

I fear I will need to continue my review tomorrow with the main logic.

@@ -0,0 +1,6 @@
module.exports = {
description: 'simple chunking',
Member

Better describe purpose of this specific test?


return num + multiplier;
}

export { mult };
Member

So I assume the point of this test is to show that the unused imports from dep2.js are properly removed. But maybe we can do better by not emitting this chunk at all? Though this might also be a possible future enhancement.

For instance the logic could be that a chunk is not emitted when

  • There are no synchronous imports from it (otherwise we should probably include it for potential side-effects until we find a way to improve the side-effect detection logic to find out if a chunk has side-effects in its initialization code. Actually, this could probably be done by checking if any of the top-level statements are included or if one of the imports has side-effects. But again, maybe a future enhancement)
  • None of the dynamic imports are included

Another interesting fact I stumbled upon that we might improve in the future is that when

  • a chunk exports a variable and
  • this chunk is imported by two other chunks both of which import this variable and
  • only one of the chunks actually uses this variable,

then the imported variable is retained for both chunks. The reason is that there is only one variable entity, and as soon as it is included, all imports of this variable are included. If none of the chunks use the variable, it is properly removed. I think it is more of a cosmetic issue, but still.

Contributor Author

You've exactly understood the two limitations of the approach I put together - the chunking algorithm is run on the graph before treeshaking, and the export boundaries between chunks aren't properly pruned. I originally aimed to handle both of these cases (and my first few attempts were in this direction), but hit some road blocks.

When running chunking on the graph after treeshaking, execution order still has to be maintained as an invariant. In addition we need to work out which chunks can be considered pure in order to skip emission like you mention.

If I remember correctly, the main issue I hit when implementing this (after fixing up export tracing etc etc) was that the concept of "marked" or "used" must then be considered uniquely for each specific chunk, which wasn't a simple capability to add. I can't remember what the other issues were, but if you're interested I could try trawling through my notes for future reference.

Contributor Author

@guybedford guybedford Jan 15, 2018

Note that in many of my practical investigations of the algorithm, these assumptions ended up not seeming so bad at all. The majority of cases work really well without these features. I think as we use this feature we'll get a better idea of how bad "pure" chunk relations and unpruned boundaries might end up looking, and can then either consider specific fixes or take the more general route.

options: {
input: ['main1.js', 'main2.js']
}
};
Member

I guess indeed the proper term should be facade without the "s" in both the description and folder name.

@guybedford
Contributor Author

guybedford commented Jan 15, 2018

Thanks for the initial review. I've made the changes and fixed up the namespace test.

To explain the entrypoint facade concept: when an entry point module is contained in a chunk, and that chunk has other exports as well, those other exports would effectively pollute the entry point module's exports if we treated that chunk as the entry point itself.

So the way we handle this is to first imagine all entry points as being "facades" onto their actual exports in whatever chunk their actual code happens to be in. Those export names might then be changed and bundled with other exports, but the facade module allows us to just re-export the right interface.
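For illustration, with hypothetical file and export names, a facade for main2.js might be nothing more than a re-export from the shared chunk its code was bundled into:

```javascript
// dist/main2.js - hypothetical facade entry point. The actual code of
// main2.js lives in the shared chunk, possibly under renamed bindings,
// and the facade restores the original public interface without
// exposing the chunk's other exports.
export { main2Default as default, helper } from './chunk-a1b2c3.js';
```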

In the case where a chunk contains an entry point, no name changes are necessary, and no other exports are provided from the chunk, we then make the optimization that the entrypoint facade is no longer necessary, as the chunk matches the facade. So we can just treat the chunk as the facade (as happens in most of the tests).

Let me know if there is anything that is still unclear on that.

@thelgevold

I gave the code in this branch a try with a mid size Angular application.
It seems to work really well.

In my setup I used Rollup code splitting to generate lazy modules. Next I implemented lazy loading in Angular using the router, aided by systemjs url mappings.

Here is a summary of my findings: http://www.syntaxsuccess.com/viewarticle/lazy-loading-with-rollup.

Here is the config: https://github.com/thelgevold/angular-2-samples/blob/master/rollup/rollup-lazy-config.js

@lukastaegert
Member

Nice!

Member

@lukastaegert lukastaegert left a comment

@guybedford yes, I initially had the misconception that the case you described could never happen. But of course you are right: if A and B are entry points, A imports B, and both A and B import D, then there will be two chunks: the A chunk and the BD chunk. And of course as both are directly imported, BD needs to have the exports of both B and D.

I was especially curious about this as I was wondering if code splitting could not be used/extended to solve @kellyselden's ember-cli problem, cf. #1878. My thought was that maybe if the algorithm regarded each module as an entry point and was told to not bundle chunks, the only issue would be to provide proper output paths, especially for bundled external dependencies.

As nonsensical as this idea may sound at first (i.e. running rollup without bundling), I see it becoming more and more relevant as we further improve the tree-shaking and add advanced code transformations that others may want to profit from without the need for bundling. Just some food for thought.

if (command._.length === 1) {
command.input = command._;
} else if (command._.length > 1) {
if (command.input) {
Member

It seems this could be simplified to only handle the error once and merge conditions, i.e. handle the error if command._.length >= 1 and then do the rest in a nested if clause.
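One possible shape for the merged branch, as a sketch (applyPositionalInput and the error message are hypothetical; only the command._ and command.input names come from the snippet above):

```javascript
// Merged version of the duplicated branches: any positional arguments
// become the input, erroring once if --input was also given.
// handleError is a stand-in for the CLI's error reporting helper.
function applyPositionalInput(command, handleError) {
  if (command._.length >= 1) {
    if (command.input) {
      handleError(new Error('cannot combine --input with positional input files'));
      return;
    }
    command.input = command._;
  }
}
```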

src/Graph.ts Outdated
import path from 'path';
import GlobalScope from './ast/scopes/GlobalScope';
import xor from 'buffer-xor';
import * as crypto from 'crypto';
Member

When building rollup, these imports create a lot of noise and especially the path, buffer-xor and crypto imports break browser compatibility (which is really important for the REPL to work; note that we cannot have unresolved dependencies for the browser build)!

You will also notice there is another import of path related functionality from ./utils/path further up. In a browser build, this import is replaced via an ad-hoc plugin in /rollup.config.js with a polyfill in /browser/path.ts. I guess a similar route should be taken for the other path-related methods or otherwise we should consider switching to rollup-plugin-node-builtins which promises to also solve the buffer related problems that npm run build lists.

For crypto, maybe a browser-compatible drop-in can be found that we might swap in similar to the path approach? Otherwise, as you are basically only using the hashing and the xor, maybe a more specialised library can be found?

Member

On that note, maybe we need a test for the browser build that at least checks for externals or unresolved dependencies.

Contributor Author

Sure will see what I can do.

Contributor Author

Ok I've replaced it with native functions here and gone with numbered chunks. Hashing could be useful, but it should at least be proper hashing of the source that can cover all file names in the build as a separate feature.

});
}

buildChunks (entryModuleIds: string[]): Promise<{ [name: string]: Chunk }> {
Member

There is a lot of overlap between buildSingle and this function that should probably be extracted if we want to keep both functions; though of course it would be even better if buildChunks could just handle the buildSingle case as well if we slightly modify it.

Member

Also, this function is really long. I think brain-overload-wise, but also with regard to separation of concerns, it would benefit if the final chunk preparation could be extracted. Doing this first might also help us determine more easily whether the non-chunking case could become a special case of the chunking version. But maybe this does not make sense, as you have obviously put much more thought into this. Still, some splitting up of this >120-line method certainly would not hurt maintainability.

Contributor Author

At least it's an improvement on the buildSingle function from where it was before as the main build method.

I'm still wary of over-abstracting though; it can be just as much or even more of a maintenance issue having to jump around a file to work out exactly what a method is doing. Seeing the low-level details in one place is a powerful way to understand something, even if it does mean you have to stare at it for longer than 5 minutes :P

Contributor Author

Also, the idea is that the code comments provide the high-level view... reading through the comments gives an algorithmic overview, better than some clever method names might.

Member

First a promise, as this is going to be a little longer (but only this one time): this is the last time for this review that I will try to make the point that smaller functions are a godsend for maintainability, and I will happily merge this PR without any changes here!

I also want to stress that I have the deepest respect for you as well as the tremendous work you did here. Nevertheless, I try to hold myself to this tweet and hope others do towards me and my code as well: https://twitter.com/wolframkriesing/status/951375642876145664 In particular, I want to obey commandment 8 in the given link, as hard as it feels.

That being said, I really do hope to eventually be able to convince you about the value of smaller, well-named functions. Maybe not for this PR but for PRs to come.

Code is read many more times than it is written

When I develop new functionality, I too find it easier to write a monolithic function first because it enables me to see everything together, find parallels, etc. But this is the initial development, and I am the person who developed the code, so I really know my way around. Other people will need to read and understand my code, though. And for each of them, it will be, time and again, these 5 minutes of staring at the code, only to find out that most of it is irrelevant for their problem. And this is not just others but my future self as well after a few months. Estimates put the ratio of writing vs. reading at about 1-to-10, but it surely depends on the code. I would expect this rate to be even higher for such a high-profile OSS project.

This sums it up excellently. If you only want to read one thing, read this: http://sam-koblenski.blogspot.de/2014/01/functions-should-be-short-and-sweet-but.html

The important argument for readability here is the "drilling down" aspect that allows you to get to the relevant parts quickly. Modern IDEs let you jump to function and method definitions with a single click, and if you really need to see it in context, inlining a function or method is usually just a keyboard shortcut away. As well as extracting it.

Long functions make people wary of refactorings

It is much harder in a long function to switch out part of the functionality. Because variables have long scopes and there are interdependencies between different parts of the function that make people wary. So instead of just switching out the part that needs switching, people start piling up additional functionality. Because they always need to understand the whole function to be able to change the relevant part and this costs time and greatly increases cognitive overload. See here: https://www.sandimetz.com/blog/2017/6/1/the-half-life-of-code

Code that isn't easy to replace doesn't get replaced, instead it gets expanded

Comments are a bad way of documenting code

This is thoroughly described here: https://visualstudiomagazine.com/articles/2013/06/01/roc-rocks.aspx

But the real problem is that, even if that miracle of accuracy and completeness were achieved, there's no way for someone reading the comments to know that accuracy and completeness have been achieved

A much shorter to-the-point summary of how to write self-documenting code (which is what I strongly advocate!) can be found here: https://gist.github.com/tharumax/a78883d198217ec3cdaf

Maintainability is key for rollup to be ahead of the competition

I have spent several years coding in tightly integrated teams with highly talented and experienced colleagues, constant code reviews and high enforced standards of software craftsmanship. What I am proposing is not solely my personal feeling but also the result of countless years of experience of many software developers on how to keep development speed high and bugs short-lived when coding in a team as the complexity of an application grows. Even though not strictly about JavaScript, Clean Code is still one of the most comprehensive books about the issue, listing countless examples and counter-examples as well as more background to understand the value these concepts have for efficient software development. And I have myself seen over and over that even the most convinced proponents of longer functions at some point accepted the wisdom of better abstraction.
Admittedly, good function names are key when extracting functionality. But in contrast to comments, function names can easily be adjusted via your IDE (and they should be). And a reviewer will always be happy to point out where a name is misleading so that a better one can be found.

But now I want to be silent, finish the review and obey commandment 8.

src/Graph.ts Outdated
} = {};

const entryChunkNames: string[] = [];
function generateUniqueEntryPointChunkName (id: string): string {
Member

If you provide the entryChunkNames as a parameter, this function could be extracted out of this method (and even out of this class), which I feel could help readability.

src/Graph.ts Outdated
// one more xor for circular entry points to ensure distinction
// to ensure one entry point per chunk property
if (module.isEntryPoint)
module.entryPointsHash = xor(module.entryPointsHash, curEntryHash);
Member

Maybe there is an if-condition missing? Because if you have this situation:

  • A is the only entry point
  • A imports B
  • B imports A

then two separate chunks will be created for no reason at all since B could easily be included into the common AB chunk. Maybe you want this second xor to only happen if the module is different from the module with which you started the current iteration (e.g. by providing it via a second parameter)? Or am I missing something?
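A sketch of the suggested guard, using plain numbers in place of the hash buffers (curEntry as the module that started the current iteration is an assumption, as is the numeric XOR stand-in for buffer-xor):

```javascript
// Numeric stand-in for the Buffer xor in the snippet above: every
// visited module mixes in the current entry's colour; entry point
// modules mix in an extra distinguishing bit, but only when they are
// NOT the entry that started this iteration - the fix being suggested.
function mixEntryHash(module, curEntry, curEntryHash, extraBit) {
  module.entryPointsHash ^= curEntryHash;
  if (module.isEntryPoint && module !== curEntry) {
    module.entryPointsHash ^= extraBit;
  }
}
```

With A as the only entry point, importing B which imports A back, both modules now end up with the same hash and therefore land in one common AB chunk.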

Member

Also, I discovered an issue that may or may not be related to this, where rollup is not properly bundling an entry point; I will create a test PR to illustrate it.

Contributor Author

Excellent catch - yes we must avoid xoring with itself!

Contributor Author

Ok changed the algorithm slightly :)

return;
allSeen[module.id] = true;

module.execIndex = ordered.length;
Member

Nice idea with the execution order!
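The execution-order bookkeeping amounts to a depth-first traversal where dependencies are visited before their importers; a minimal sketch (hypothetical module shape { id, dependencies }; the PR's exact traversal may differ):

```javascript
// Assign each module an execution index via a depth-first post-order
// visit, mirroring the execIndex bookkeeping in the snippet above.
// allSeen is marked before recursing so circular imports terminate.
function assignExecIndices(entry) {
  const ordered = [];
  const allSeen = {};
  function visit(module) {
    if (allSeen[module.id]) return;
    allSeen[module.id] = true;
    for (const dep of module.dependencies) visit(dep); // deps execute first
    module.execIndex = ordered.length;
    ordered.push(module);
  }
  visit(entry);
  return ordered;
}
```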

@lukastaegert
Member

This was the main part of the review; all that is missing are the finalisers plus some small files here and there. I'm confident that I will conclude this tomorrow and am looking forward to releasing this! Great stuff! Note that we definitely need to find a solution for the browser bundle first, though; see my comments.

@guybedford guybedford force-pushed the code-splitting-rebased branch from 5d5b4ed to 4dca6bd Compare January 18, 2018 23:29
@guybedford guybedford force-pushed the code-splitting-rebased branch from c7f0d3d to e148229 Compare January 19, 2018 00:33
Member

@lukastaegert lukastaegert left a comment

Done, all reviewed! One last point (see one of the comments) is whether we shouldn't make the transition from non-chunking to chunking smoother. That might also enable us to remove some duplicate code. But it could also be done once we move this out of "experimental".

legacy?: boolean;
watch?: WatcherOptions;
experimentalDynamicImport?: boolean;
experimentalCodeSplitting?: boolean;
Member

As I understand it, in its current implementation code splitting is a different mode of operation for rollup, as with code splitting:

  • the dir property is required (even if you only specify one entry point)
  • the file property is ignored in favour of the original file name (+ possible deconflicting)

If we want to keep it like that, I would suggest to remove the "experimental" from the flag. However I would actually prefer to not have a second mode of operation (and keep the "experimental" in the first iteration). What if instead it worked like this:

  • If you only provide one entry point (either like "input": "./main.js" or "input": ["./main.js"]), then
    • If you only provide "output.file" but not "output.dir", this file is used as the bundle file as in this situation, there will always be a single output file (or am I wrong?)
    • If you only provide "output.dir" but not "output.file", the same naming scheme as in the chunking version of the algorithm will be used
    • If you provide both, a warning is displayed that you cannot provide both "output.file" and "output.dir" and the latter will be ignored (or the other way around, I am not partial here)
  • If you provide several entry points and "output.dir" is provided, rollup will run the chunking version of the algorithm and use the original file names as a basis for the generated ones
    • If you additionally provide "output.file", a warning is displayed that "output.file" will be ignored for multiple entry points
    • If you do not provide "output.dir", an error is thrown (as is currently the case, but only if you actually have multiple entry points)

What do you think? Or do you think this is too complicated for the first iteration?

Contributor Author

The difference here is that dynamic imports are inlined into one file for the single-file output, while dynamic imports can generate chunks for the directory output.

So I think it's perhaps clearer to consider input[] + dir and input + file as completely distinct cases. We could possibly allow input as a string for the dir case as well.
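To illustrate the two distinct cases, here is a sketch based on the API shown in the PR description; the file names are placeholders:

```javascript
// Single-entry build: `input` is a string, output goes to one `file`,
// and any dynamic imports are inlined into that file.
const singleFileConfig = {
  input: 'main.js',
  output: { file: 'dist/bundle.js', format: 'cjs' }
};

// Code-splitting build: `input` is an array, output goes to a `dir`,
// and dynamic imports may be split into separate chunks.
const codeSplittingConfig = {
  input: ['main1.js', 'main2.js'],
  experimentalCodeSplitting: true,
  experimentalDynamicImport: true,
  output: { dir: 'dist', format: 'cjs' }
};
```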

expectedCode = 'missing file';
}

/*try {
Member

It would be nice to have a test with actual sourcemaps, but of course it is kind of hard to verify they are correct. Not sure about this; commented-out code tends to become obsolete quickly. Maybe we check sourcemaps if they are present in the _expected folder and just add them to the basic test?

Contributor Author

Got this going :)

src/Graph.ts Outdated

export type ResolveDynamicImportHandler = (specifier: string | Node, parentId: string) => Promise<string | void>;

function generateChunkName (id: string, curEntryChunkNames: string[], startAtTwo = false): string {
Member

As the code below does not seem to depend on curEntryChunkNames being an array, maybe we should replace it with a Set (or a blank() object) to slightly improve lookup performance when there are many chunks (i.e. Set.has instead of Array.indexOf)?
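A minimal sketch of what the Set-based variant might look like (the function shape is assumed from the signature quoted above; this is illustrative, not the actual patch):

```javascript
// Deduplicates chunk names using Set.has (constant-time lookup)
// instead of Array.indexOf (linear scan over existing names).
function generateChunkName(id, usedNames, startAtTwo = false) {
  const ext = '.js';
  const entryName = id.replace(/\.js$/, '').split('/').pop();
  let uniqueIndex = startAtTwo ? 2 : 1;
  let uniqueName = entryName;
  while (usedNames.has(uniqueName)) {
    uniqueName = entryName + uniqueIndex++;
  }
  usedNames.add(uniqueName);
  return uniqueName + ext;
}

const used = new Set();
generateChunkName('src/main.js', used); // 'main.js'
generateChunkName('lib/main.js', used); // 'main1.js' (deconflicted)
```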

src/Graph.ts Outdated
while (curEntryChunkNames.indexOf(uniqueEntryName) !== -1)
uniqueEntryName = entryName + uniqueIndex++;
curEntryChunkNames.push(uniqueEntryName);
return uniqueEntryName + ext;
Member

Btw. I love the new chunk names! Looks much better!

}
});
private warnCycle (_entryModule: Module, _ordered: Module[]) {
// TODO: reinstate
Member

👍

return result;
});

return graph.buildChunks(inputOptions.input)
Member

My hope is that if we can manage to make code-splitting always on (as outlined above), we could basically remove the first version of generate.

Contributor Author

Perhaps a middle ground for always-on could be that one input and an output file would output chunks in the same folder as the output file, possibly separating the dynamic import inlining option as inlineDynamicImports: true or similar?

Member

Ah I see, I was forgetting about this. I guess a flag would definitely make sense as otherwise the behaviour would be inconsistent.

Member

@lukastaegert lukastaegert Jan 19, 2018

So maybe:

  • If we have a single entry point
    • if output.file is provided, it is the name of the chunk generated for the entry point
    • if not but we have output.dir, we use the new naming algorithm
    • if neither is provided, we show an error that either is missing
    • if a dynamic import is encountered + inlineDynamicImports: true, it is of course inlined
    • if a dynamic import is encountered + inlineDynamicImports: false
      • if output.dir is provided, additional generated chunks are placed there
      • otherwise, an error is shown
  • If we have multiple entry points
    • if output.dir is missing, we show an error
    • if output.file is provided, we show a warning that it is ignored
      ?
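The statically checkable part of the decision tree above could be sketched roughly like this (a hypothetical helper, not code from the PR; the `warn` callback and return shape are assumptions, and the dynamic-import cases are omitted since they only surface during the build):

```javascript
// Rough sketch of the proposed output-target resolution.
// Returns { mode, target } or throws; warnings go through `warn`.
function resolveOutputTarget({ input, file, dir }, warn) {
  const entries = Array.isArray(input) ? input : [input];
  if (entries.length === 1) {
    if (file && dir) {
      warn('Both "output.file" and "output.dir" provided; ignoring "output.dir"');
      return { mode: 'single', target: file };
    }
    if (file) return { mode: 'single', target: file };
    if (dir) return { mode: 'chunked', target: dir };
    throw new Error('Either "output.file" or "output.dir" must be provided');
  }
  // Multiple entry points: chunking is required.
  if (file) warn('"output.file" is ignored for multiple entry points');
  if (!dir) throw new Error('"output.dir" is required for multiple entry points');
  return { mode: 'chunked', target: dir };
}
```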

Contributor Author

I noticed the experimentalCodeSplitting flag isn't really behaving like an "experimental" flag here since it's guiding the whole process. I've made the following change for now:

  • If input is a string, do a single file build, assuming file and warning otherwise
  • If input is an array and the flag is enabled, do a code splitting build, assuming dir and warning otherwise

We can potentially look at different arrangements of this before flag removal, but at least the flag itself is not part of the final decision now.

@guybedford guybedford force-pushed the code-splitting-rebased branch from b841276 to 6ab004d Compare January 19, 2018 17:31
Member

@lukastaegert lukastaegert left a comment

Cool! Gonna release this Sunday or Monday.

while (chunkNames[uniqueName])
uniqueName = name + uniqueIndex++;
chunkNames[uniqueName] = true;
return uniqueName + ext;
Member

Very good, I also like the streamlined names!

Member

@lukastaegert lukastaegert left a comment

Good catch, must have totally overlooked this 👍
Also like the new way the feature flag works now.

@lukastaegert lukastaegert added this to the 0.55.0 milestone Jan 21, 2018
@lukastaegert lukastaegert changed the base branch from master to release-0.55.0 January 21, 2018 19:55
@lukastaegert
Member

I will put this into a release branch first which I will merge into master once we are sure everything is there that should be there :)

@lukastaegert lukastaegert merged commit 6213e9a into release-0.55.0 Jan 21, 2018
@guybedford guybedford deleted the code-splitting-rebased branch January 22, 2018 12:38
@nnmrts

nnmrts commented Feb 7, 2018

This was a great story to read.



9 participants