
Extract yarn install into a separate step#1673

Merged
gerhard merged 2 commits into main from extract-yarn-install-into-a-separate-step
Mar 4, 2022

Conversation

@gerhard
Contributor

@gerhard gerhard commented Mar 1, 2022

If multiple yarn.#Run commands run in parallel, they will corrupt each other's yarn cache mount. Because we extract yarn install into a separate step, LLB will dedup the yarn install step and only run it once, regardless of how many yarn.#Run commands run in parallel.

Fixes #1670
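For readers unfamiliar with the pattern, here is a minimal sketch of the before/after shape. It assumes the dagger 0.2 universe API (docker.#Run); the exact field names and wiring are illustrative, not the shipped yarn package:

```cue
package yarn

import (
	"universe.dagger.io/docker"
)

// Before: install and script ran in one exec, so two parallel
// actions each ran their own `yarn install` against the shared
// cache mount:
//
//   docker.#Run & {
//       command: {name: "sh", args: ["-c", "yarn install && yarn run build"]}
//   }

// After: install is a distinct action. Identical actions produce
// identical LLB vertices, which BuildKit deduplicates, so
// `yarn install` executes once no matter how many yarn.#Run
// actions depend on it.
#Install: docker.#Run & {
	command: {name: "yarn", args: ["install"]}
}

#Run: {
	// name of the package.json script to run (illustrative field)
	script:  string
	install: #Install
	run: docker.#Run & {
		input: install.output
		command: {name: "yarn", args: ["run", script]}
	}
}
```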

@gerhard gerhard requested a review from aluzzardi March 1, 2022 19:26
@gerhard

This comment was marked as outdated.

@gerhard gerhard force-pushed the extract-yarn-install-into-a-separate-step branch from 30e2c36 to aba4284 Compare March 1, 2022 19:33
@aluzzardi
Contributor

This is proper "Dockerfile" hygiene -- this makes yarn install its own "line" so it gets cached/deduped independently (e.g. RUN yarn install + RUN yarn build).

Before this, both were bundled into the same step (e.g. RUN yarn install && yarn build), which was suboptimal for caching, especially since yarn install is very expensive.

However, by switching to docker.#Build, the container pattern isn't working anymore.

/cc @shykes

@netlify

netlify bot commented Mar 3, 2022

@gerhard
Contributor Author

gerhard commented Mar 3, 2022

I have shared all context with @grouville, he is helping out with this.

In the meantime, in the interest of making progress, I have switched focus to doc tasks which do not have external dependencies.

@shykes
Contributor

shykes commented Mar 3, 2022

Isn't the cache corruption caused by incorrect concurrency settings in the cache mount? And if so, wouldn't setting concurrency: "locked" solve it?

Note: I'm pretty sure I asked this question before and @aluzzardi answered it, but I don't remember the answer 😭 so might as well have it here in writing for the record.
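For context, a sketch of what that alternative would look like, assuming the engine's cache-mount API (the concurrency field and its values are written from memory here and should be treated as illustrative):

```cue
// Inside a docker.#Run, mount the yarn cache in "locked" mode:
// BuildKit then serializes every exec that mounts this cache,
// which prevents corruption but also removes all parallelism
// between yarn invocations sharing the cache.
mounts: "yarn-cache": {
	dest: "/cache/yarn"
	contents: core.#CacheDir & {
		id:          "universe.dagger.io/yarn"
		concurrency: "locked" // illustrative; default is shared
	}
}
```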

@aluzzardi
Contributor

Isn't the cache corruption caused by incorrect concurrency settings in the cache mount? And if so, wouldn't setting concurrency: "locked" solve it?

Yes, and ... it would also make sense to have install and run as their own steps.

Locking the cache would make yarn instructions sequential:

  • first yarn install && yarn run test
  • then yarn install && yarn run build

By being their own step:

  • [yarn install, yarn run test]
  • [yarn install, yarn run build]

test and build will wait for install to complete, then run in parallel.

This is very much the same as we'd do in a Dockerfile for layer/caching optimization. In addition to parallelism, we gain on cache hits (e.g. changing writeEnvFile, buildDir, etc. will not re-execute yarn install; install only runs when src changes).

@shykes
Contributor

shykes commented Mar 3, 2022

Isn't the cache corruption caused by incorrect concurrency settings in the cache mount? And if so, wouldn't setting concurrency: "locked" solve it?

Yes, and ... it would also make sense to have install and run as their own steps.

Locking the cache would make yarn instructions sequential:

  • first yarn install && yarn run test
  • then yarn install && yarn run build

By being their own step:

  • [yarn install, yarn run test]
  • [yarn install, yarn run build]

test and build will wait for install to complete, then run in parallel.

This is very much the same as we'd do in a Dockerfile for layer/caching optimization. In addition to parallelism, we gain on cache hits (e.g. changing writeEnvFile, buildDir, etc. will not re-execute yarn install; install only runs when src changes).

OK, so if I understand correctly, we have multiple constraints to deal with:

  1. Don't corrupt the cache. So don't run multiple yarn install concurrently because that corrupts the cache (original problem)
  2. Maximize parallelization, for speed. So don't run yarn install && yarn run within a single exec call, because that removes the opportunity to parallelize run steps, which makes execution slower in some cases.
  3. Keep ability to customize the container. (currently not solved).
  4. Maximize cache hits, for speed.

@aluzzardi
Contributor

OK, so if I understand correctly, we have multiple constraints to deal with

Correct.

To give an example of what optimizations we're talking about, using our own docs website as a reference, yarn install takes about 20 seconds and yarn build another 15 seconds.

Doing yarn install && yarn build would require 35 seconds to complete before yarn test begins. With parallelism, yarn test and yarn build would start at the same time.

to parallelize run steps, which makes execution slower in some cases.

To illustrate the "some cases", the cache is shared among ALL yarn instances (by default, unless passing a custom name).

For instance, taking our own repo as an example, when building website and docs we'd have to wait for website to complete before starting to build docs if using a locked cache.

In other words, there could only be a single yarn command running at once for the entire plan.

@shykes
Contributor

shykes commented Mar 3, 2022

OK, so if I understand correctly, we have multiple constraints to deal with:

  1. Don't corrupt the cache. So don't run multiple yarn install concurrently because that corrupts the cache (original problem)
  2. Maximize parallelization, for speed. So don't run yarn install && yarn run within a single exec call, because that removes the opportunity to parallelize run steps, which makes execution slower in some cases.
  3. Keep ability to customize the container. (currently not solved).
  4. Maximize cache hits, for speed.

With these constraints in mind: currently we meet 1, 2, and 4... but not 3 (customize containers).

Perhaps we should move the docker.#Build part to the caller, and expose yarn.#Run and yarn.#Install as distinct low-level actions, for the caller to compose any way they want.

  • Downside: more configuration in the top-level config: you have to invoke yarn.#Install, you have to either invoke docker.#Build or wire multiple yarn calls together manually.
  • Upside: we meet all 4 constraints
  • Upside: less hidden docker.#Build means less chances of triggering the dreaded Nested docker.#Build fails #1466 .

wdyt?
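To make the trade-off concrete, here is a sketch of what the caller-side wiring could look like under this proposal (yarn.#Install and yarn.#Run as separate low-level actions are the proposal being discussed, not the shipped API; field names are illustrative):

```cue
actions: {
	// Runs once; both test and build depend on its output.
	deps: yarn.#Install & {
		source: client.filesystem.".".read.contents
	}
	// test and build wait for deps, then run in parallel.
	test: yarn.#Run & {
		input:  deps.output
		script: "test"
	}
	build: yarn.#Run & {
		input:  deps.output
		script: "build"
	}
}
```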

@grouville
Member

grouville commented Mar 3, 2022

OK, so if I understand correctly, we have multiple constraints to deal with:

  1. Don't corrupt the cache. So don't run multiple yarn install concurrently because that corrupts the cache (original problem)
  2. Maximize parallelization, for speed. So don't run yarn install && yarn run within a single exec call, because that removes the opportunity to parallelize run steps, which makes execution slower in some cases.
  3. Keep ability to customize the container. (currently not solved).
  4. Maximize cache hits, for speed.

With these constraints in mind: currently we meet 1, 2, and 4... but not 3 (customize containers).

Perhaps we should move the docker.#Build part to the caller, and expose yarn.#Run and yarn.#Install as distinct low-level actions, for the caller to compose any way they want.

  • Downside: more configuration in the top-level config: you have to invoke yarn.#Install, you have to either invoke docker.#Build or wire multiple yarn calls together manually.
  • Upside: we meet all 4 constraints
  • Upside: less hidden docker.#Build means less chances of triggering the dreaded Nested docker.#Build fails #1466 .

wdyt?

	
	container: #input: docker.#Image | *{
		alpine.#Build & {
			packages: {
				bash: {}
				yarn: {}
			}
		}
	}

Would that solve 3? I have it locally, fully working. I just cannot push to this fork at the moment. Here is the full implementation: https://gist.github.com/grouville/5017f830dc4d0306b64a91ee7aef4189
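For illustration, this is roughly how a caller would override the default base image under that scheme (the node image and the wiring through docker.#Pull are hypothetical, not taken from the Gist):

```cue
actions: {
	// Custom base image instead of the alpine.#Build default.
	_base: docker.#Pull & {
		source: "index.docker.io/library/node:16-alpine"
	}
	build: yarn.#Build & {
		container: #input: _base.output
		// Omitting this falls back to the *default:
		// alpine.#Build with bash and yarn installed.
	}
}
```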

@grouville
Member

grouville commented Mar 3, 2022

working

From what I read, the full implementation in the Gist above (based on this PR) respects the 4 constraints.

@shykes
Contributor

shykes commented Mar 3, 2022

To illustrate the "some cases", the cache is shared among ALL yarn instances (by default, unless passing a custom name).

For instance, taking our own repo as an example, when building website and docs we'd have to wait for website to complete before starting to build docs if using a locked cache.

I agree with your overall point, but in this example it's imperative to use a custom name to differentiate unrelated builds, otherwise entirely new problems start appearing. So with a proper yarn configuration the performance hit is not quite as drastic. But yes, it's still too slow in an entirely avoidable way.
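A sketch of what "a proper yarn configuration" could mean here, assuming the package exposes a way to name the cache per project (cacheName is a hypothetical knob, not a confirmed field of the yarn package):

```cue
actions: {
	// Distinct cache names keep unrelated builds from sharing
	// (and therefore serializing on) one locked cache.
	website: yarn.#Run & {
		cacheName: "website" // hypothetical field
		script:    "build"
	}
	docs: yarn.#Run & {
		cacheName: "docs" // hypothetical field
		script:    "build"
	}
}
```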

@shykes
Contributor

shykes commented Mar 3, 2022

	
	container: #input: docker.#Image | *{
		alpine.#Build & {
			packages: {
				bash: {}
				yarn: {}
			}
		}
	}

Would that solve 3 ? I have it locally, fully working. I just cannot push to this fork atm. Here is the full implementation: https://gist.github.com/grouville/5017f830dc4d0306b64a91ee7aef4189

This breaks the current pattern of customization ("look for the container key"). It also introduces a new pattern of using definitions as input fields, sometimes, but not always - so when do I use regular fields vs. definitions? ... Better not to go down this path if we can avoid it.

gerhard added 2 commits March 4, 2022 18:38
If multiple yarn.#Run commands run in parallel, they will corrupt each
other's yarn cache mount. Because we extract yarn install into a
separate step, LLB will dedup the yarn install step and only run it
once, regardless of how many yarn.#Run commands run in parallel.

Fixes #1670

Signed-off-by: Gerhard Lazu <[email protected]>
The goal is to preserve the container.input interface so that custom
images can be specified, as per yarn/test/test.cue while keeping the
initial fix for #1670

Signed-off-by: Gerhard Lazu <[email protected]>
@gerhard gerhard force-pushed the extract-yarn-install-into-a-separate-step branch from f3b2119 to 26d24e6 Compare March 4, 2022 18:41
@gerhard
Contributor Author

gerhard commented Mar 4, 2022

This breaks the current pattern of customization ("look for the container key"). It also introduces a new pattern of using definitions as input fields, sometimes, but not always - so when do I use regular fields vs. definitions? ... Better not to go down this path if we can avoid it.

I appreciate the freedom to refine this fix, and I would love to explore alternatives, but I don't feel that the current timeline allows this. If we can make it work, I am all for making this right over the next few days / weeks. Small steps towards a working, simple use-case that we can improve later is what I am thinking 😉

@gerhard gerhard merged commit 5267d72 into main Mar 4, 2022
@gerhard gerhard deleted the extract-yarn-install-into-a-separate-step branch March 4, 2022 18:48
@shykes
Contributor

shykes commented Mar 4, 2022

FYI I am going to revert this merge. We need to stop the current free-for-all approach on universe.

@gerhard
Contributor Author

gerhard commented Mar 5, 2022

FYI I am going to revert this merge.

That's OK. A better solution that works for #1586 would be great.

@shykes
Contributor

shykes commented Mar 5, 2022

FYI I am going to revert this merge.

That's OK. A better solution that works for #1586 would be great.

More constructive than reverting: I will prioritize finding a better solution instead. I wanted to do it yesterday but got interrupted before finishing.

@gerhard I want to make sure my tentative improvement doesn’t break your code. Where do you currently use this new interface so I can include it in my PR?

@gerhard
Contributor Author

gerhard commented Mar 7, 2022

Thank you for making the time today, it was great to catch up on this! #1693 is a follow-up to it all. While it is by no means finished, it is a step in the direction that we have discussed - I will continue with it first thing tomorrow.

gerhard added a commit that referenced this pull request Mar 8, 2022
The todoapp example contains a Netlify plan which uses the latest dagger
additions: do & Client API. We are thinking of merging the examples
repository into this one to make working with this easier. This is a
step in that direction.

We are not using the yarn package so that we can revert
#1673 without breaking this
implementation.

The GitHub Action is WIP, we will continue with that tomorrow:
dagger/dagger-for-github#24

Signed-off-by: Gerhard Lazu <[email protected]>

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] universe.dagger.io/yarn cache mount config results in yarn package corruption

4 participants