Skip to content

ci/eval: add darwin support#406307

Draft
winterqt wants to merge 2 commits intoNixOS:masterfrom
winterqt:push-lyyvlttunyvk
Draft

ci/eval: add darwin support#406307
winterqt wants to merge 2 commits intoNixOS:masterfrom
winterqt:push-lyyvlttunyvk

Conversation

@winterqt
Copy link
Member

@winterqt winterqt commented May 12, 2025

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ofborg ofborg bot added the 6.topic: darwin Running or building packages on Darwin label May 12, 2025
@github-actions github-actions bot added 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions backport release-24.11 labels May 12, 2025
@winterqt winterqt force-pushed the push-lyyvlttunyvk branch from f5ecea6 to 0cc2922 Compare May 12, 2025 00:39
@winterqt
Copy link
Member Author

winterqt commented May 12, 2025

No idea why the code owner action can't request y'all's reviews, maybe because it's a draft? GH's API doesn't document such a limitation, though... Ah, dry mode.

Happy to split the last commit into a different PR -- in fact, it may make sense to off the bat. Let me know. Last commit has been split off.

Copy link
Member

@lilyball lilyball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor comments on the bash scripting

@winterqt winterqt force-pushed the push-lyyvlttunyvk branch from 0cc2922 to 383d569 Compare May 12, 2025 02:57
@github-actions github-actions bot added 10.rebuild-darwin: 11-100 This PR causes between 11 and 100 packages to rebuild on Darwin. 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. labels May 12, 2025
@winterqt winterqt force-pushed the push-lyyvlttunyvk branch from 383d569 to 6678847 Compare May 12, 2025 03:07
@wolfgangwalther
Copy link
Contributor

I made some changes in my fork to run a big matrix of 4x4 jobs, each eval-system on each runner-type.

Made two runs:

I guess, I'll need to turn the sandbox off for this test. Will continue later.

@winterqt
Copy link
Member Author

Read the error in the later run, it tells you to set sandbox configuration option to relaxed — that’s what you have to do to get this to work.

@wolfgangwalther
Copy link
Contributor

Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the latest run above, this works on both x86_64-darwin and aarch64-darwin.

I'm thinking that we might want to have some way to confirm this in CI every now and then - eval should keep working on all platforms, so that all contributors can run it locally, too. I have a few ideas, but we don't need to do this in here.

@winterqt winterqt force-pushed the push-lyyvlttunyvk branch from 6678847 to 3a6da35 Compare May 12, 2025 20:31
@winterqt
Copy link
Member Author

winterqt commented May 12, 2025

Rebased. Not sure if we should wait till eval.full is fixed Darwin (without allowing bad platforms) via bumping the pinned Nixpkgs, or if we should just do it.

@wolfgangwalther
Copy link
Contributor

via bumping the pinned Nixpkgs

I'd like to do that soon anyway, as I'd like to get #405853 into the pin.

@winterqt
Copy link
Member Author

winterqt commented May 15, 2025

Dropped the x86_64-linux eval system change in favor of waiting for #406825, but everything else should be good to go now that I've updated the pin.

Made a few other changes I'd love some eyes on, including addressing @lilyball's suggestions. (Thanks!)

winterqt added 2 commits May 14, 2025 22:38
Before this change, the eval derivations would never actually complete
on any platform other than sandboxed Linux because the mem stats job
would never be terminated. For reasons I don't know, the Linux sandbox
somehow results in Bash terminating the job on its own. On every other
platform (and when the Linux sandbox is disabled), we have to do it
correctly and kill it manually.
@winterqt winterqt force-pushed the push-lyyvlttunyvk branch from a4c79d1 to 8d49a2f Compare May 15, 2025 02:38
@github-actions github-actions bot removed the 10.rebuild-darwin: 11-100 This PR causes between 11 and 100 packages to rebuild on Darwin. label May 15, 2025
@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. and removed 10.rebuild-linux: 101-500 This PR causes between 101 and 500 packages to rebuild on Linux. labels May 15, 2025
Comment on lines +88 to +99
# For some reason, `kill` (or `kill -9`) won't
# actually kill an entire process tree on macOS
# as it does on Linux. So instead, we'll just
# use xargs' native termination feature, which
# will at least guarantee that the iteration
# will be stopped.
#
# (Yes, technically we can just keep the `kill`
# command so that xargs dies and then the pending
# chunks will finish, but I'd rather just do this
# if it's going to have the same effect anyways.)
exit 255
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is a way to get this to work if there's a way to get the parent's process group ID? Not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on Discord, I believe this should actually just be kill 0 for all platforms. AFAICT kill $PPID only works on linux because of the sandbox (that causes the command to fail with an error which set -e causes the builder to exit, and presumably that causes the pid namespace to get torn down), xargs does not proactively kill its spawned processes when it dies.

If you do use kill 0, that signals the entire process group, which is going to include the shell itself since it's a non-interactive shell (which means job control defaults to off). If you want just the xargs process tree to shut down you can enable job control first with set -m, that way the xargs pipeline will be its own process group and the kill 0 will only affect it, at which point bash's set -e behavior will take over. So doing this really just depends on whether there's any benefits to log output (or whether Nix cares at all about whether the builder exits with a code or a signal).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT kill $PPID only works on linux because of the sandbox

This would match the issue with the memory stats job that I observed and also bisected down to the Linux sandbox. Fun.

Thanks, I’ll give that a shot. I don’t think we want to set -m at least without somehow explicitly detecting whether the xargs pipeline was killed or not (which I’m not sure is possible?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set -m just enables job control monitor mode, which makes every pipeline get its own process group (and bash will print a line when background jobs exit). It shouldn't actually hurt anything, it just makes bash behave a bit more like interactive mode. and you can presumably turn it off again after the xargs finishes if you like.

I'm not sure what you mean about explicitly detecting whether the xargs pipeline was killed? If job control is enabled, the kill 0 will just kill xargs (and seq if it hasn't already finished) and its child processes, and bash will treat that as an error and exit (due to set -e). If you want to intercept that, you can wrap the xargs pipeline in an if or add a || after and test the status code (and a code over 128 indicates a signal, where the signal number is the exit status minus 128, e.g. SIGTERM is 143). But normal errexit behavior should be fine here? It's what you get with kill $PPID already.

done
) &

trap "kill %%" EXIT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to work if anyone adds another background job later on, since the jobspec here is evaluated at EXIT time instead of right now.

Suggested change
trap "kill %%" EXIT
trap "kill $(jobs -p %%)" EXIT

Also I just want to point out that stdenv installs a default exit handler that this overwrites. That may not matter at all, if nothing installs a failureHook or exitHook, but if you want to preserve that you could switch this over to adding both a failureHook and exitHook (or just a failureHook and adding an explicit kill after the xargs finishes).

@philiptaron philiptaron removed their request for review June 5, 2025 23:40
@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jun 16, 2025
@github-actions github-actions bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Jun 16, 2025
Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the x86_64-linux eval system change in favor of waiting for #406825, but everything else should be good to go now that I've updated the pin.

Do we still need to wait for #406825 or has the required change maybe already been merged (since I have been splitting off smaller parts repeatedly)?

Is there anything else, especially from the latest comments, that you'd want to address / change in here?

We just had #418153, too, so making eval work on darwin would be valuable to others, too.

@nixpkgs-ci nixpkgs-ci bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Dec 16, 2025
@mdaniels5757 mdaniels5757 added backport release-25.11 Backport PR automatically and removed backport release-25.05 labels Jan 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 6.topic: continuous integration Affects continuous integration (CI) in Nixpkgs, including Ofborg and GitHub Actions 6.topic: darwin Running or building packages on Darwin 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. 12.approvals: 1 This PR was reviewed and approved by one person. backport release-25.11 Backport PR automatically

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants