Clean up daemon handling in the tests#5753
Conversation
57227e1 to
26bd675
Compare
|
What was the issue here? I re-run |
|
@edolstra At one point everything was passing but |
26bd675 to
9131e1a
Compare
|
OK updated with lots more stuff. Curious if CI will agree with me, but seems good locally. |
|
macOS build timed it out? |
|
Probably the flaky build issue in #3605 |
|
Thanks |
9131e1a to
551675d
Compare
|
Rebased, hoping won't get that spurious error this time. |
|
@edolstra Does this look good to you? I would like to start pushing on this and other testing improvements again prior to getting back to fun feature work like RFC 92, so we have a firmer foundation before any more (potentially destabilizing) new feature. |
tomberek
left a comment
There was a problem hiding this comment.
Test failures (repair.sh.test) are unrelated. Re-based onto master and functions as expected.
thufschmitt
left a comment
There was a problem hiding this comment.
I very much like the idea behind this.
As an alternative (maybe simpler, but also maybe stupid because I didn’t give it a lot of thoughs) scheme, how about
- Making
common.shpure again (remove thestartDaemoncall from it) - Start the daemon in
init.sh(and source it rather than execute it inmk/run_tests.shso that the needed variables are properly propagated)
?
|
@thufschmitt I had thought about that, but I decided you (probably) had an intent of |
bec7803 to
7a6a87e
Compare
|
@thufschmitt rebased! |
|
🎉 All dependencies have been resolved ! |
| set -eu -o pipefail | ||
|
|
||
| # Don't start the daemon | ||
| source common/vars-and-functions.sh |
There was a problem hiding this comment.
@thufschmitt to answer the question you asked on IRC a bit better, I didn't put the starting the daemon in init for these reasons:
- Running tests without recalling init should still work. if one wants to examine state: the test itself starts and stops the daemon, rather than something else starting and it stopping.
init.shmerely does cleanup and setup of files. Besides removing stuff, it just creates new blank filesystem state. Stopping the daemon before removing the files is just for any rogue daemon, before it becomes harder to stop because the socket file was deleted.
The file is split because init.sh still needs the env vars.
There was a problem hiding this comment.
Oh yes of course, it’s nice to be able to manually run init.sh once and then the test (I actually do that all the time)
7a6a87e to
273984c
Compare
|
@edolstra @thufschmitt approved. Is this OK with you? |
|
@edolstra pinging again. It really would be good to merge this. It doesn't commit us to doing any more testing, it just makes the existing testing more robust. |
273984c to
f20e6b7
Compare
|
I was inspired to return to this because #7600 has a devilishly hard to debug issue because it only cropped up in the daemon tests, and I was reminded how flaky that stuff is today. In the process I found that the history had become quite messy, and so I am cleaning it up. |
f20e6b7 to
3f6b136
Compare
3f6b136 to
f52667b
Compare
|
@thufschmitt Small reminder on reviewing this :). @roberth and I last night discussed some good ideas to clean stuff up further, and avoid the profusion of support files (to which this PR sadly adds one more). But I think it is good to land this in the meantime; having the daemon start and stop logic be more robust will make further cleanups easier. |
thufschmitt
left a comment
There was a problem hiding this comment.
This has been stalled for too long, indeed. Let's merge!
|
Well, now there are some conflicts 🙃 |
`init.sh` is tested on its own. We used to do that. I deleted it in 4720853 but I am not sure why. Better to just restore it; at one point working on this every other test passed, so seems good to check whether `init.sh` can be run twice. We don't *need* to run `init.sh` twice, but I want to try to make our tests as robust as possible so that manual debugging (where tests for better or worse might be run ways that we didn't expect) is less fragile.
Split `common.sh` into the vars and functions definitions vs starting the daemon (and possibly other initialization logic). This way, `init.sh` can just `source` the former. Trying to start the daemon before `nix.conf` is written will fail because `nix daemon` requires `--experimental-features 'nix-command'`. `killDaemon` is idempotent, so it's safe to call when no daemon is running. `startDaemon` and `killDaemon` use the PID (which is now exported to subshells) to decide whether there is work to be done, rather than `NIX_REMOTE`, which might conceivably be set differently even if a daemon is running. `startDaemon` and `killDaemon` can save/restore the old `NIX_REMOTE` as `NIX_REMOTE_OLD`. `init.sh` kills daemon before deleting everything (including the daemon socket).
f52667b to
87da941
Compare
|
Thanks @thufschmitt. For some reason when I rebased --- no conflicts! |
Motivation
init.shis tested on its own. We used to do that. I deleted it in 4720853 but I am not sure why. Better to just restore it; at one point working on this every other test passed, so seems good to check whetherinit.shcan be run twice.We don't need to run
init.shtwice, but I want to try to make our tests as robust as possible so that manual debugging (where tests for better or worse might be run ways that we didn't expect) is less fragile.Split
common.shinto the vars and functions definitions vs startingthe daemon (and possibly other initialization logic). This way,
init.shcan justsourcethe former. Trying to start the daemonbefore
nix.confis written will fail becausenix daemonrequires--experimental-features 'nix-command'.killDaemonis idempotent, so it's safe to call when no daemon isrunning.
startDaemonandkillDaemonuse the PID (which is now exported tosubshells) to decide whether there is work to be done, rather than
NIX_REMOTE, which might conceivably be set differently even if adaemon is running.
startDaemonandkillDaemoncan save/restore the oldNIX_REMOTEasNIX_REMOTE_OLD.init.shkills daemon before deleting everything (including the daemonsocket).
Context
Checklist for maintainers
Maintainers: tick if completed or explain if not relevant
tests/**.shsrc/*/tests