actions/checkout: parallelize checkout of multiple commits on tmpfs#435526

Merged
wolfgangwalther merged 1 commit into NixOS:master from wolfgangwalther:ci-faster-fetch on Aug 21, 2025

@wolfgangwalther wolfgangwalther commented Aug 21, 2025

Instead of fetching up to 3 times on each new checkout, we now fetch all the commits we're going to need at once. Afterwards, we check out the different worktrees in parallel. On its own, that doesn't gain us much yet, because the checkout would still be IO-bound. Inconsistent disk IO performance is also the biggest limitation for checkout right now: checkout times range anywhere from 20s to 40s.

By checking out the worktrees on a tmpfs, the actual checkout only takes 1s and benefits from parallelization. The overall checkout time is now 8-11s, depending on the number of commits.

That's a reduction of 10-30s and we get this speedup for almost every job in the PR workflow, which is huge.

This potentially has a nice side effect for Eval, too: because the repo is in RAM, Eval seems to run slightly faster, by up to 10 seconds.
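
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `planCheckout`, `checkoutAll`, the `origin` remote, and the `--depth=1` and `--detach` flags are assumptions made for the example, and `run` is an injected helper modeled on the `run(...)` calls visible in the reviewed snippet.

```javascript
// Hypothetical sketch: helper names, remote name and git flags are assumptions.

// Build the git commands for a set of commits: one fetch, one worktree each.
function planCheckout(commits) {
  return {
    // A single fetch for every commit, instead of one fetch per checkout.
    fetch: ['git', 'fetch', '--depth=1', 'origin', ...commits.map((c) => c.sha)],
    // One detached worktree per commit; `path` is expected to live on a tmpfs.
    worktrees: commits.map((c) => ['git', 'worktree', 'add', '--detach', c.path, c.sha]),
  }
}

// `run` executes one command and returns a promise (e.g. a child_process wrapper).
async function checkoutAll(commits, run) {
  const plan = planCheckout(commits)
  await run(...plan.fetch)
  // The worktrees are independent of each other, so they can be added concurrently.
  await Promise.all(plan.worktrees.map((cmd) => run(...cmd)))
}
```

Passing a `run` that only records its arguments shows the ordering without touching a real repository: the fetch is issued first, followed by one `git worktree add` per commit.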

@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions 6.topic: policy discussion Discuss policies to work in and around Nixpkgs backport release-25.05 labels Aug 21, 2025
@nixpkgs-ci nixpkgs-ci bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Aug 21, 2025
@wolfgangwalther
Contributor Author

Just removed a left-over comment.

@wolfgangwalther wolfgangwalther mentioned this pull request Aug 21, 2025
2 tasks
@MattSturgeon
Contributor

This seems like a good change (I'll review shortly). Just to have the answers on record, I'll ask these skeptical questions:

How much memory remains available when we have 3 worktrees checked out on a tmpfs volume? I assume this varies per runner, as they each have different memory capacities?

In the past, some jobs were running out of memory and using swap. Is this a non-issue now? Would these changes make that more likely to happen again?

@wolfgangwalther
Contributor Author

> How much memory remains available when we have 3 worktrees checked out on a tmpfs volume? I assume this varies per runner, as they each have different memory capacities?
>
> In the past, some jobs were running out of memory and using swap. Is this a non-issue now? Would these changes make that more likely to happen again?

Yeah, this is a valid concern. When considering it, we also need to take #435535 into account, which raises this to 4 checkouts of Nixpkgs in RAM for the Eval workflow. One checkout of Nixpkgs is 300 MB for me locally, so we'd be at 1.2 GB. Most runners have 16 GB, so that's not a problem for almost all workflows. The macOS runners have only 7 GB, but we only have a single job there. See https://docs.github.com/en/actions/reference/runners/github-hosted-runners#standard-github-hosted-runners-for-public-repositories.
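
As a quick check of that arithmetic, here is a small sketch using only the figures quoted above (300 MB per checkout, 4 checkouts, 16 GB and 7 GB runners). The function name is made up for the example, 1 GB is treated as 1000 MB to match the rounding above, and the 7 GB case is an upper bound that assumes all 4 checkouts would land on that runner.

```javascript
// Hypothetical helper; the per-checkout size is the local measurement quoted above.
const CHECKOUT_MB = 300

// Returns the tmpfs footprint and its share of the runner's total memory.
function tmpfsShare(checkouts, runnerGb) {
  const usedMb = checkouts * CHECKOUT_MB
  // Treat 1 GB as 1000 MB, matching the rounding used in the comment.
  const share = usedMb / (runnerGb * 1000)
  // Percentage rounded to one decimal place.
  return { usedMb, percent: Math.round(share * 1000) / 10 }
}

console.log(tmpfsShare(4, 16)) // 1200 MB, 7.5 % of a 16 GB runner
console.log(tmpfsShare(4, 7))  // 1200 MB, 17.1 % of a 7 GB runner
```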

The Eval workflow is generally the only workflow where this could be a problem. In that later PR, I didn't see any problem with Nix 2.30; in fact, it even became faster. No swapping was observed in https://github.com/NixOS/nixpkgs/actions/runs/17124280377/job/48572236799?pr=435535. Even with the 4 checkouts, available memory never dropped below 2374 MiB.

Now, that might be different when we update ci/pinned.json and run all the other Nix/Lix versions. But that happens so infrequently that even if we were swapping there and taking a minute or so longer, it wouldn't be a big concern. I'll still run a test to see; maybe we need to lower the chunkSize a tad for these. Otherwise, the performance comparison we do there wouldn't make much sense anymore.

wolfgangwalther commented Aug 21, 2025

> Now, that might be different when we update ci/pinned.json and run all the other Nix/Lix versions. But that happens so infrequently that even if we were swapping there and taking a minute or so longer, it wouldn't be a big concern. I'll still run a test to see; maybe we need to lower the chunkSize a tad for these. Otherwise, the performance comparison we do there wouldn't make much sense anymore.

Here's a run of that:

Lix/Nix version comparison

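| Version | aarch64-linux | aarch64-darwin | x86_64-linux | x86_64-darwin |
| --- | --- | --- | --- | --- |
| lixPackageSets.git.lix | 107 | 76 | 129 | 81 |
| lixPackageSets.lix_2_91.lix | 108 | 77 | 120 | 80 |
| lixPackageSets.lix_2_92.lix | 111 | 73 | 114 | 77 |
| lixPackageSets.lix_2_93.lix | 110 | 82 | 134 | 82 |
| nixVersions.git | 103 | 71 | 109 | 71 |
| nixVersions.nix_2_24 | 135 | 112 | 155 | 114 |
| nixVersions.nix_2_28 | 139 | 113 | 150 | 109 |
| nixVersions.nix_2_29 | 133 | 111 | 141 | 109 |
| nixVersions.nix_2_30 | 99 | 70 | 109 | 72 |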

Evaluation time in seconds without downloading dependencies.

⚠️ Job did not report a result.

❌ Job produced different outpaths than the target branch.


No problem at all, Eval is not slower. It's not faster either. Comparison here: #427724 (comment)

Since this measures the time the eval takes inside the sandbox, I assume the speedup I observed for Eval on tmpfs mostly happens outside the sandbox, i.e. when Nix goes through the whole repo to calculate the outpath for the eval result, before it checks cachix for it.


I did observe swapping for this run, but mostly around 1.3 GB (in the worst case for memory, i.e. the older Lix versions). That matches the 4 checkouts nicely. I think what happens is that the tmpfs is pushed to swap first, which is entirely fine, so Nix still has all the memory it needs for Eval.

@nixpkgs-ci nixpkgs-ci bot added 12.approvals: 2 This PR was reviewed and approved by two persons. and removed 12.approvals: 1 This PR was reviewed and approved by one person. labels Aug 21, 2025
@wolfgangwalther wolfgangwalther merged commit aea00c8 into NixOS:master Aug 21, 2025
40 checks passed
@wolfgangwalther wolfgangwalther deleted the ci-faster-fetch branch August 21, 2025 19:57
nixpkgs-ci bot commented Aug 21, 2025

Successfully created backport PR for release-25.05:

@github-actions github-actions bot added the 8.has: port to stable This PR already has a backport to the stable release. label Aug 21, 2025
Comment on lines +87 to +92
    case 'macOS':
      await run('sudo', 'mount_tmpfs', path)
      // macOS creates this hidden folder to log file system activity.
      // This trips up git when adding a worktree below, because the target folder is not empty.
      await run('sudo', 'rm', '-rf', join(path, '.fseventsd'))
      break
Contributor Author

This part doesn't work too well on macOS yet.

https://github.com/NixOS/nixpkgs/actions/runs/17147656253/job/48646939225
https://github.com/NixOS/nixpkgs/actions/runs/17146824086/job/48644655746
https://github.com/NixOS/nixpkgs/actions/runs/17146214373/job/48643006331

All the same error:

fatal: 'untrusted' already exists

(alternatively "pinned")

The problem here is that darwin either recreates the .fseventsd folder or creates other files/folders in there, which makes git unhappy.

A better fix might be to mount a single tmpfs and create subdirectories on it for the checkouts.
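
A rough sketch of that idea, under stated assumptions: `tmpfsPlan`, the root path, and the Linux `mount -t tmpfs` invocation are illustrative and not taken from this PR or from #435806; only `mount_tmpfs` and the `untrusted`/`pinned` names appear above. Permissions on the mount point are not handled here.

```javascript
// Hypothetical sketch: function name, root path and the Linux command are assumptions.
function tmpfsPlan(os, root, names) {
  // Mount exactly one tmpfs at `root`, whatever the number of checkouts.
  const mount = os === 'macOS'
    ? ['sudo', 'mount_tmpfs', root]
    : ['sudo', 'mount', '-t', 'tmpfs', 'tmpfs', root]
  // Each worktree targets its own not-yet-existing subdirectory, so entries
  // that appear at the mount root (such as .fseventsd) are not in git's way.
  const targets = names.map((name) => `${root}/${name}`)
  return { mount, targets }
}

const plan = tmpfsPlan('macOS', '/mnt/ci', ['untrusted', 'pinned'])
console.log(plan.mount.join(' ')) // sudo mount_tmpfs /mnt/ci
console.log(plan.targets)         // [ '/mnt/ci/untrusted', '/mnt/ci/pinned' ]
```

Because `git worktree add` creates its target directory itself, the subdirectories do not need to exist beforehand; they only need to be absent or empty.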

Contributor

> A better fix might be to mount a single tmpfs and create subdirectories on it for the checkouts.

Agreed

Contributor Author

Working on a fix in #435806.

@philiptaron
Contributor

Here's a selection of things people have done in GitHub actions to unload various daemons on macOS:
