Skip to content

Conversation

@MacMcIrish
Copy link
Contributor

@MacMcIrish MacMcIrish commented Dec 6, 2018

What did you implement:

Closes #5568

How did you implement it:

Instead of passing the complete pattern list into Globby, a complete project file list is loaded into memory and then resolved from the patterns using micromatch. The use of micromatch here is because glob itself uses minimatch, which is why we're seeing such a significant performance difference.

How can we verify it:

Run sls package on a project with a large file count with many dev dependencies. The difference in timing is ~30s compared to ~15 minutes.

Todos:

  • Write tests
  • Write documentation
  • Fix linting errors
  • Make sure code coverage hasn't dropped
  • Provide verification config / commands / resources
  • Enable "Allow edits from maintainers" for this PR
  • Update the messages below

Is this ready for review?: YES
Is it a breaking change?: NO

@dschep
Copy link
Contributor

dschep commented Dec 6, 2018

Awesome @MacMcIrish! Thanks for the PR. Do you have a good package.json to test with or does it happen with pretty much any set of dependencies with yarn?

@MacMcIrish
Copy link
Contributor Author

@dschep of course :)
I'll come up with a small project that will recreate this, and I'll get back to you. As far as I know, this will happen for any project that has a not insignificant amount of both dependencies and devDependencies, so hopefully I can get to it soon.

I've just made a couple more changes to the PR after investigating globby's syntax to make sure there isn't any unexpected changes here (outside of what tests currently exist)

@MacMcIrish MacMcIrish changed the title feat: drastically improved glob matching performance when resolving files from patterns feat: drastically improved dev dependency exclusion performance Dec 6, 2018
@MacMcIrish
Copy link
Contributor Author

MacMcIrish commented Dec 6, 2018

@dschep I've got a snippet here to reproduce, using [email protected] using a group of our public dependencies:

#!/bin/sh
rm -rf testProject
sls create --template aws-nodejs --path testProject 
(cd testProject \
	&& echo 'yarn-offline-mirror "./offline/package"' > .yarnrc \
	&& yarn add [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
    && yarn add --dev @babel/[email protected] \
	    @babel/[email protected] \
	    @babel/[email protected] \
	    @babel/[email protected] \
	    @babel/[email protected] \
	    @babel/[email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    micromatch@^3.1.10 \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	    [email protected] \
	&& echo -e 'package:\n  exclude:\n    - offline/**' >> serverless.yml \
	&& time sls package)

Outputs:
sls package 582.59s user 3.70s system 102% cpu 9:34.26 total

Whereas using my branch outputs:
~/local/path/bin/serverless package 23.99s user 2.21s system 141% cpu 18.511 total

@simlu
Copy link

simlu commented Dec 6, 2018

We would love to see this merged in and released quickly as it really hinders our ability to deploy quickly and clogging up ci. Cheers!

@dschep
Copy link
Contributor

dschep commented Dec 7, 2018

Thanks for the snippet @MacMcIrish. I'll test it tomorrow. We'll release this soon @simlu :shipit:

@dschep dschep self-assigned this Dec 7, 2018
Copy link
Contributor

@dschep dschep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks & works great @MacMcIrish! It doesn't seem we have much test coverage of resolveFilePathsFromPatterns. Would you mind adding some? It's probably best to use os.tmpdir() and set this.serverless.config.servicePath to the created temp dir then fill it with some empty files&dirs. I'm happy to do it, but it might take longer 😉

const path = require('path');
const globby = require('globby');
const _ = require('lodash');
const nanomatch = require('nanomatch');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. @micromatch publishes micromatch, nanomatch picomatch, and anymatch. That's not confusing at all 🙄

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had our fun with glob and *match yesterday :) Globby uses fast glob. Fast glob only uses default wildcard pattern according to their docs. However internally they use micromatch, not nanomatch. So they might actually expose the extended syntax that it provides over nanomatch.

We could start using micromatch instead of nanomatch here. I have to check what fast glob does internally. Maybe that's a test we could be adding. This whole thing is so confusing haha... Would have loved to fix it downstream, but honestly don't have the time right now and this is pressing.

Somewhat unrelated but interesting:

  • Globby seems so calls fast glob for each pattern which results in many file scans. Who wrote that and thought it was performant?! All we did is replicate that functionality in memory.
  • Still not entirely sure what the difference between picomatch and nanomatch really is. No dependencies for picomatch?
  • We also have npm glob which provides its own custom syntax and uses minimatch (which is apparently slow).

@simlu
Copy link

simlu commented Dec 7, 2018

@dschep There is honestly a lot of test coverage already testing this (test were testing globby functionality, now testing our code), which is why we didn't add more tests. I'm confused what else you want tested here? Are you just asking because we didn't add tests in this pr?

@dschep
Copy link
Contributor

dschep commented Dec 7, 2018

Oh, I failed to find them then @simlu, I expected them in lib/plugins/package/lib/packageService.test.js. Mind pointing me to them since you seem to know where they are? :)

@MacMcIrish
Copy link
Contributor Author

@dschep There are a few integration tests that check this in zipService.test.js, namely
'should exclude with globs'
'should re-include files using ! glob pattern'
'should re-include files using include config'
'should throw error if no files are matched'
'#zipFiles()'

Please let me know if that is sufficient :)

Copy link
Contributor

@dschep dschep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup. That's more than enough. I should've looked harder.

probably gonna :shipit: sometime next week

@dschep dschep merged commit 2606bb2 into serverless:master Dec 7, 2018
@MacMcIrish
Copy link
Contributor Author

Awesome, cheers!

@john-aws
Copy link

john-aws commented Mar 20, 2019

I'm running an up to date sls version 1.39.1 with node v8.10.0 and Python 3.7.2 and am finding that trivially small Python package builds (e.g. 50KB) are taking 5+ minutes. A debug run of sls suggests that it's the 'Excluding development dependencies...' step that's taking most of the time.

$ sls --version
1.39.1

$ sls deploy function --function users | gnomon
   2.0007s   Serverless: Packaging function: users...
 321.8112s   Serverless: Excluding development dependencies...
   0.2782s   Serverless: Uploading function: users (64.57 KB)...
   0.1624s   Serverless: Successfully deployed function: users
   0.0425s   Serverless: Successfully updated function: users
   0.0004s

     Total   324.2973s

How can I verify that I have this fix? If 1.39.1 has the fix, what else could cause this delay? Thanks.

@rodrigogs
Copy link

Same here

@fagiani
Copy link

fagiani commented Dec 26, 2020

Even using the latest version (by running npm update serverless) as below:

$ sls -v
Framework Core: 2.16.1 (local)
Plugin: 4.3.0
SDK: 2.3.2
Components: 3.4.3

it still takes well over a minute to go through the Excluding development dependencies... even for a simple function that has only a couple of tiny dependencies.

This has worked for me and reduced the packaging time to less than a second:

package:
  excludeDevDependencies: false
  exclude:
    - ./**
    - '!package.json'
    - '!src/**'
    - '!node_modules/<npm-mod-1>/**'
    - '!node_modules/<npm-mod-n>/**'

Hopefully it will help others as a workaround!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: npm -> yarn resulted in very slow "Excluding dev deps..."

6 participants