Skip to content

tests: add smoke test for systemd-nspawn#4378

Merged
poettering merged 2 commits intosystemd:masterfrom
evverx:nspawn-smoke-test
Oct 19, 2016
Merged

tests: add smoke test for systemd-nspawn#4378
poettering merged 2 commits intosystemd:masterfrom
evverx:nspawn-smoke-test

Conversation

@evverx
Copy link
Contributor

@evverx evverx commented Oct 15, 2016

Basically, this test runs:

    systemd-nspawn --register=no -D "$_root" -b
    systemd-nspawn --register=no -D "$_root" --private-network -b
    systemd-nspawn --register=no -D "$_root" -U -b
    systemd-nspawn --register=no -D "$_root" --private-network -U -b

and exports the UNIFIED_CGROUP_HIERARCHY=[yes|no], SYSTEMD_NSPAWN_USE_CGNS=[yes|no]

Inspired by

and so on :-)

@martinpitt , this test


# check cgroup namespaces
is_cgns_supported=no
if unshare -C /bin/sh -c ':'; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires util-linux >= 2.28
But:

$ uname -a
Linux ubuntu-xenial 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ dpkg -l 'util-linux'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                          Version                     Architecture                Description
+++-=============================================-===========================-===========================-===============================================================================================
ii  util-linux                                    2.27.1-6ubuntu3.1           amd64                       miscellaneous system utilities

$ sudo unshare -C
unshare: invalid option -- 'C'

I guess we can ignore this for now. @martinpitt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think I should check the /proc/1/ns/cgroup. Will fix

@evverx evverx added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Oct 15, 2016
@evverx evverx force-pushed the nspawn-smoke-test branch from 49cd41a to c31a33e Compare October 15, 2016 05:51
@evverx evverx removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Oct 15, 2016
@keszybz
Copy link
Member

keszybz commented Oct 16, 2016

One of the combinations (SYSTEMD_NSPAWN_USE_CGNS=yes -U --private-network) does not work for me here, so I added a patch to skip that test. After all outstanding pull requests go in, it should be fixed and we can enable it again.

I also reworked the code a bit to make it easier understand what failed, when something fails.

Copy link
Contributor Author

@evverx evverx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When anything fails, it is very hard to locate the failure

I agree. But I don't think sleep and systemd-run are the right things to do (see comments).

I guess set -x, UNIFIED= USE_NSPAWN= systemd-nspawn improve the situtation

run no yes
run yes yes
systemd-run -E UNIFIED_CGROUP_HIERARCHY=$1 -E SYSTEMD_NSPAWN_USE_CGNS=$2 -E SYSTEMD_LOG_LEVEL=debug \
--wait systemd-nspawn --quiet --register=no -M container -b --private-network
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert to use systemd-run
to have each case as a separate unit in the journal

This totally breaks journalctl -D ... -u testsuite.service

mkdir "$root/bin"
if [[ -d "$root" ]]; then
echo "$root already exists"
exit 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also sligthly simplify things by not creating separate chroot for each test
run, the same one can be reused over and over

Well, the behaviour of the -U-option depends on the $root's uid and gid. I think we should create a separate root to really test -U

set -x
# Call sleep before each container to make the logs easier to parse
systemd-run -E UNIFIED_CGROUP_HIERARCHY=$1 -E SYSTEMD_NSPAWN_USE_CGNS=$2 -E SYSTEMD_LOG_LEVEL=debug \
--wait systemd-nspawn --quiet --register=no -M container -b
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Actually, this -x helps to understand what is going on.
I guess we can

UNIFIED_CGROUP_HIERARCHY=$1 SYSTEMD_NSPAWN_USE_CGNS=$2 systemd-nspawn --register=no -D "$_root" -b

instead of the

export ..
export ..
systemd-nspawn
unset
unset
...

This fixes:

When anything fails, it is very hard to locate the failure, when everything
is run from the same script one after another.

No?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, systemd-run is a complex thing. systemd-run --wait is a complex thing. And I really want to test systemd-nspawn (not systemd-run, not machined (this is why --register=no)).

run yes yes
systemd-run -E UNIFIED_CGROUP_HIERARCHY=$1 -E SYSTEMD_NSPAWN_USE_CGNS=$2 -E SYSTEMD_LOG_LEVEL=debug \
--wait systemd-nspawn --quiet --register=no -M container -b --private-network
sleep 0.5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also insert short sleeps between the cases

Well, I think this sleep prevents (maybe, potentially) the testing of the cleanup phase of the nspawn

@keszybz
Copy link
Member

keszybz commented Oct 17, 2016

In my testing of your original patch, not all logs ended up in the journal. At some point the output from the script was cut off. Maybe it's a bug, but it's not central to the problem at hand.

OK, so you don't like my changes. Can you repush with just my documentation patch, and incorporate the missing return fix into your patch?

@keszybz
Copy link
Member

keszybz commented Oct 17, 2016

... and maybe also the last patch which disables the failing test. It'll probably look different, but I think we need something like this to have the CI pass.

@evverx evverx added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Oct 17, 2016
@evverx
Copy link
Contributor Author

evverx commented Oct 17, 2016

the last patch which disables the failing test.

I think we should disable those cases: #4352

At some point the output from the script was cut off

This looks like #2913
And, indeed, sleep at the end of the script fixes this (sometimes :-))

Can you repush with just my documentation patch, and incorporate the missing return fix into your patch?

OK. I guess we should add strace and nc to the documentation too

@keszybz
Copy link
Member

keszybz commented Oct 17, 2016

#2913 is different — I was not seeing the messages at all. But it's possible I messed something up. I'll retest when you push a new version...

@evverx
Copy link
Contributor Author

evverx commented Oct 17, 2016

I was not seeing the messages at all

Oh, looks like #1812 (comment)

I don't know why but it doesn't work for me. I always lose the output of the testsuite.service.

Maybe, again

evverx and others added 2 commits October 17, 2016 16:50
Basically, this test runs:
```
    systemd-nspawn --register=no -D "$_root" -b
    systemd-nspawn --register=no -D "$_root" --private-network -b
    systemd-nspawn --register=no -D "$_root" -U -b
    systemd-nspawn --register=no -D "$_root" --private-network -U -b
```
and exports the `UNIFIED_CGROUP_HIERARCHY=[yes|no]`, `SYSTEMD_NSPAWN_USE_CGNS=[yes|no]`

Inspired by
* systemd#3589 (comment)
* systemd#4372 (comment)
* systemd#4223 (comment)
* systemd#1555

and so on :-)
@evverx evverx force-pushed the nspawn-smoke-test branch from 1612edb to a2fc3d8 Compare October 18, 2016 03:40
@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

@keszybz , updated

I've tested on:

  • Fedora 24 (4.7.6-200.fc24.x86_64)
  • Fedora 24 (vanilla 4.8)
  • Ubuntu Xenial (4.4.0-43-generic)
  • Ubuntu Yaketty (4.8.0-22-generic)
F: Failed to install ./has-overflow
autopkgtest [04:58:13]: test upstream: -----------------------]
autopkgtest [04:58:14]: test upstream:  - - - - - - - - - - results - - - - - - - - - -
upstream             FAIL non-zero exit status 1

@martinpitt ,
https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/tree/debian/tests/upstream#n23

        if ! (cd $t; TEST_BASE_DIR=../ INITRD=/initrd.img ./test.sh --$op); then

Why not

       if ! (cd $t; make $op INITRD=/initrd.img); then

?

@evverx evverx removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Oct 18, 2016
@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

@martinpitt
Copy link
Contributor

Why not make

Because this should work on an unbuilt tree as well (for testing distro packages), and then there simply is no Makefile. Does this get in the way somehow?

we don't need INITRD=/initrd.img

Thanks! Pushed: https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/commit/?id=068efe4aac

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

Because this should work on an unbuilt tree as well (for testing distro packages)

Oh, indeed, thanks! So, I can

  • install gcc on the testbed via tests/control
  • run gcc inside the test.sh

Is it OK?

@martinpitt
Copy link
Contributor

F: Failed to install ./has-overflow

I don't see a rule in test/TEST-13-NSPAWN-SMOKE/Makefile (or anywhere else) to actually build that? Or does that magically happen through some implicit rules?

I guess that's why you asked about running this through make. If the tree is not built, we could run a minimal ./configure and install the extra build deps. The approach until now was to actually build these helper binaries as part of the normal package build, and ship them in the development package; but if we can restrict the build to just these helper binaries without having to build systemd completely all over again (which takes too much time), I'm open to the above as well.

@martinpitt
Copy link
Contributor

If the tree is not built, we could run a minimal ./configure

Actually not -- test/*/Makefile runs make -C ../.. all, i. e. it first builds the entire systemd tree. This will add an extra 20 mins or so to every downstream test which is too expensive. However, with the seddery in debian/tests/upstream these make all rules are not actually required at all -- so we could just drop them. This is a good idea anyway IMHO -- you really don't want to run the entire build as root if you don't have a built tree already. WDYT?

I committed https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/commit/?id=f142ef34c for now, as this is conceptually cleaner than replicating the logic from the Makefile, and if you agree, we can remove the make all in upstream master and drop that "FIXME" again.

I can run gcc inside the test.sh

Yes, that's fine too.

I'll restart the failed tests, as with running this through make this actually ought to work now?

Also, please let me know if/where you want/need tests on Ubuntu 16.10 (I can only start them manually for now, if we want to do this permanently we need to ask @zonque to reconfigure the web hooks).

@martinpitt
Copy link
Contributor

I also started an amd64 test on 16.10, as you requested in the description.

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

we can remove the make all in upstream master and drop that "FIXME" again.

I think we don't need @make -s --no-print-directory -C ../.. all. But I don't really know how do people run our tests :-)

I'll restart the failed tests, as with running this through make this actually ought to work now?

Yes (if make and gcc installed on the testbed)

I also started an amd64 test on 16.10

Thanks!
This test doesn't fail on Yaketty. I've added an expected failure check: https://github.com/systemd/systemd/pull/4378/files#diff-16c7d2b78656b2bcbee324db72cd948fR97

#4378 (comment)

... and maybe also the last patch which disables the failing test. It'll probably look different, but I think we need something like this to have the CI pass.

Updated the PR description.

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

Also, please let me know if/where you want/need tests on Ubuntu 16.10

I believe we need tests on Ubuntu 16.10 (Ubuntu 16.04 and CentOS 7 don't support the unified cgroup hierarchy for example, so, we don't really touch the "unified" hierarchy code paths)

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

@martinpitt ,

TEST CLEANUP: systemd-nspawn smoke test
cc -O0    has-overflow.c   -o has-overflow
make: cc: Command not found

Can we add gcc to the https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/tree/debian/tests/control#n86 ?

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

Actually, the has-overflow is the workaround for #4352 (I can remove gcc and has-overflow later)

But I think we should merge this test "as is". This helps to detect other regressions (for example, see #4395, #4372)

@martinpitt
Copy link
Contributor

Test deps added (https://anonscm.debian.org/cgit/pkg-systemd/systemd.git/commit/?id=5fc1668) and tests restarted.

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

  • xenial-i386
========== TEST-13-NSPAWN-SMOKE ==========
make: Entering directory '/tmp/autopkgtest.lfYvPR/build.SnX/systemd-debian/test/TEST-13-NSPAWN-SMOKE'
TEST CLEANUP: systemd-nspawn smoke test
cc -O0    has-overflow.c   -o has-overflow
<builtin>: recipe for target 'has-overflow' failed
make: Leaving directory '/tmp/autopkgtest.lfYvPR/build.SnX/systemd-debian/test/TEST-13-NSPAWN-SMOKE'
make: cc: Command not found
make: *** [has-overflow] Error 127
autopkgtest [09:32:10]: test upstream: -----------------------]
autopkgtest [09:32:10]: test upstream:  - - - - - - - - - - results - - - - - - - - - -
upstream             FAIL non-zero exit status 1

I'm not sure why "make: cc: Command not found"

  • yakkety-amd64
yakkety-amd64
upstream             PASS # expected
boot-and-services    FAIL # unrelated

-- Logs begin at Tue 2016-10-18 09:42:55 UTC, end at Tue 2016-10-18 09:43:13 UTC. --
Oct 18 09:43:12 autopkgtest systemd[1]: Starting Container c1...
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Selected user namespace base 239403008 and range 65536.
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Failed to mount n/a on /var/lib/machines/c1/sys/fs/selinux (MS_BIND ""): No such file or directory
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Failed to mount n/a on /var/lib/machines/c1/sys/fs/selinux (MS_RDONLY|MS_NOSUID|MS_NODEV|MS_NOEXEC|MS_REMOUNT|MS_BIND ""): Invalid argument
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Timezone Etc/UTC does not exist in container, not updating container timezone.
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Failed to mount cgroup on /sys/fs/cgroup/net_cls,net_prio (MS_NOSUID|MS_NODEV|MS_NOEXEC "net_cls,net_prio"): No such file or directory
Oct 18 09:43:12 autopkgtest systemd[1]: [email protected]: Main process exited, code=exited, status=1/FAILURE
Oct 18 09:43:12 autopkgtest systemd-nspawn[1293]: Child died too early.
Oct 18 09:43:12 autopkgtest systemd[1]: Failed to start Container c1.
Oct 18 09:43:12 autopkgtest systemd[1]: [email protected]: Unit entered failed state.
Oct 18 09:43:12 autopkgtest systemd[1]: [email protected]: Failed with result 'exit-code'.

Well, #4310 isn't working now (we should bump to 232). So, the systemd-nspawn tries to mount the legacy hierarchy and boom #4352 :-)

  • xenial-amd64 - PASSED :-)

@martinpitt
Copy link
Contributor

xenial-i386> this is still the previous run before I added the gcc/make test deps, retried.

yakkety-amd64 is expected (nspawn failure) as you said. As long as we don't regularly run tests on 16.10, this PR is fine then -- but I figure we should soon clean that up as we actually want to run tests on 16.10 soon to test cgroups v2 on a current kernel?

@evverx
Copy link
Contributor Author

evverx commented Oct 18, 2016

yakkety-amd64 is expected (nspawn failure) as you said.

Oh, I was wrong. There is no systemd in the container (and bump to 232 will not fix the test).
@keszybz , do we really want to mount the legacy hierarchy if we can't find lib/systemd/libsystemd-shared-*.so and usr/lib/systemd/libsystemd-shared-*.so?

should soon clean that up as we actually want to run tests on 16.10 soon to test cgroups v2 on a current kernel?

Yes, we should fix #4352. (or we should use SYSTEMD_NSPAWN_USE_CGNS=no as a workaround)
Or we can disable -U. I'm not sure that -U by default is the right thing to do

@martinpitt
Copy link
Contributor

There is no systemd in the container

Right, it's just a teensy busybox one; I'd like to keep this as we shouldn't make too many assumptions on what is in a container. We can also test a full OS container if we can get an image easily (like downloading the standard LXD ones).

@keszybz
Copy link
Member

keszybz commented Oct 18, 2016

@keszybz , do we really want to mount the legacy hierarchy if we can't find lib/systemd/libsystemd-shared-.so and usr/lib/systemd/libsystemd-shared-.so?

IIUC, the hybrid hierarchy will break older systemd releases and probably other code which only supports legacy. So I think we have no choice to do hybrid only when we can detect that the container system supports it.

@keszybz
Copy link
Member

keszybz commented Oct 18, 2016

FWIW, I think we should merge this, even if some hacks are required to tests pass, and do incremental improvements later. We have too many unmerged patches which touch the same area and it's just hard to follow.

@martinpitt
Copy link
Contributor

No objection from me either -- the packaging has been fixed, so tests succeed on xenial, i. e. this won't break other PRs.

@evverx
Copy link
Contributor Author

evverx commented Oct 19, 2016

@keszybz

IIUC

Sorry, I asked the wrong question. I mean, why do we mount the systemd hierarchy at all if there is no systemd inside the container?

the hybrid hierarchy will break older systemd releases

This is true, of course.

Can I press the merge button? :-)

@poettering poettering merged commit 0afc6d4 into systemd:master Oct 19, 2016
@evverx evverx deleted the nspawn-smoke-test branch October 20, 2016 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants