Skip to content

pid1: make find_executable() work with MountAPIVFS=no#20515

Merged
yuwata merged 4 commits intosystemd:mainfrom
yuwata:pid1-mount-apivfs-no
Aug 25, 2021
Merged

pid1: make find_executable() work with MountAPIVFS=no#20515
yuwata merged 4 commits intosystemd:mainfrom
yuwata:pid1-mount-apivfs-no

Conversation

@yuwata
Copy link
Member

@yuwata yuwata commented Aug 22, 2021

Fixes #20514.

@bluca

This comment has been minimized.

@bluca bluca added the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Aug 22, 2021
@yuwata yuwata force-pushed the pid1-mount-apivfs-no branch from f52a06a to e49f0e5 Compare August 22, 2021 22:32
@yuwata yuwata removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Aug 22, 2021
@yuwata yuwata force-pushed the pid1-mount-apivfs-no branch 2 times, most recently from fda1e45 to f3b8186 Compare August 23, 2021 07:43
return;
}
}
if (find_executable("/lib64/libc.so.6", NULL) < 0) {
Copy link
Member

@bluca bluca Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit too fragile - while unlikely to change, the number of linked dependencies of a binary is not a stable api. We are building minimal images (chasing dependencies), could that be used to reproduce the same test instead? The /usr/share/minimal.raw stuff used in some other tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... But, could you suggest more in detail? I am not familiar with RootImage/RootDirectory=... In which test should I add the test? You mean the test should be added in TEST-50-DISSECT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could still be part of the same test maybe? You wouldn't be able to use RootImage= because that implies MountAPIVFS, and if the point of the test is to disable that, it couldn't work.
So it would depend instead of minimal.raw, but rather than using it directly, it would unsquashfs and bind mount its /usr instead of the individual libraries. I think in principle this could work?

@yuwata yuwata force-pushed the pid1-mount-apivfs-no branch 2 times, most recently from a1b6df1 to cfad1f9 Compare August 23, 2021 13:45
@yuwata
Copy link
Member Author

yuwata commented Aug 23, 2021

@bluca I tried to implement the test in another way: resolve linked libraries, and create drop-in. PTAL.

@yuwata yuwata force-pushed the pid1-mount-apivfs-no branch from cfad1f9 to 417d8c0 Compare August 23, 2021 17:08
@yuwata
Copy link
Member Author

yuwata commented Aug 24, 2021

Now all green! PTAL.

@yuwata yuwata force-pushed the pid1-mount-apivfs-no branch from 417d8c0 to 42867df Compare August 25, 2021 17:56
Copy link
Member

@bluca bluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bluca bluca added good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed and removed needs-review labels Aug 25, 2021
@yuwata yuwata merged commit d11ff2a into systemd:main Aug 25, 2021
@yuwata yuwata deleted the pid1-mount-apivfs-no branch August 25, 2021 21:05
andir added a commit to andir/nixpkgs that referenced this pull request Aug 30, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <[email protected]>
@flokli flokli mentioned this pull request Aug 31, 2021
2 tasks
andir added a commit to andir/nixpkgs that referenced this pull request Sep 12, 2021
This updates systemd to version v249.4 from version v247.6.

Besides the many new features that can be found in the upstream
repository they also introduced a bunch of cleanup which ended up
requiring a few more patches on our side.

a) 0022-core-Handle-lookup-paths-being-symlinks.patch:
  The way symlinked units were handled was changed in such that the last
  name of a unit file within one of the unit directories
  (/run/systemd/system, /etc/systemd/system, ...) is used as the name
  for the unit. Unfortunately that code didn't take into account that
  the unit directories themselves could already be symlinks and thus
  caused all our units to be recognized slightly different.

  There is an upstream PR for this new patch:
    systemd/systemd#20479

b) The way the APIVFS is setup has been changed in such a way that we
   now always have /run. This required a few changes to the
   confinement tests which did assert that they didn't exist. Instead of
   adding another patch we can just adopt the upstream behavior. An
   empty /run doesn't seem harmful.

   As part of this work I refactored the confinement test just a little
   bit to allow better debugging of test failures. Previously it would
   just fail at some point and it wasn't obvious which of the many
   commands failed or what the unexpected string was. This should now be
   more obvious.

c) Again related to the confinement tests the way a file was tested for
   being accessible was optimized. Previously systemd would in some
   situations open a file twice during that check. This was reduced to
   one operation but required the procfs to be mounted in a units
   namespace.

   An upstream bug was filed and fixed. We are now carrying the
   essential patch to fix that issue until it is backported to a new
   release (likely only version 250). The good part about this story is
   that upstream systemd now has a test case that looks very similar to
   one of our confinement tests. Hopefully that will lead to less
   friction in the long run.

   systemd/systemd#20514
   systemd/systemd#20515

d) Previously we could grep for dlopen( somewhat reliably but now
   upstream started using a wrapper around dlopen that is most of the
   time used with linebreaks. This makes using grep not ergonomic
   anymore.

   With this bump we are grepping for anything that looks like a
   dynamic library name (in contrast to a dlopen(3) call) and replace
   those instead. That seems more robust. Time will tell if this holds.

   I tried using coccinelle to patch all those call sites using its
   tooling but unfornately it does stumble upon the _cleanup_
   annotations that are very common in the systemd code.

e) We now have some machinery for libbpf support in our systemd build.
   That being said it doesn't actually work as generating some skeletons
   doesn't work just yet. It fails with the below error message and is
   disabled by default (in both minimal and the regular build).

   > FAILED: src/core/bpf/socket_bind/socket-bind.skel.h
   > /build/source/tools/build-bpf-skel.py --clang_exec /nix/store/x1bi2mkapk1m0zq2g02nr018qyjkdn7a-clang-wrapper-12.0.1/bin/clang --llvm_strip_exec /nix/store/zm0kqan9qc77x219yihmmisi9g3sg8ns-llvm-12.0.1/bin/llvm-strip --bpftool_exec /nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool --arch x86_64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
   > libbpf: elf: socket_bind_bpf is not a valid eBPF object file
   > Error: failed to open BPF object file: BPF object format invalid
   > Traceback (most recent call last):
   >   File "/build/source/tools/build-bpf-skel.py", line 128, in <module>
   >     bpf_build(args)
   >   File "/build/source/tools/build-bpf-skel.py", line 92, in bpf_build
   >     gen_bpf_skeleton(bpftool_exec=args.bpftool_exec,
   >   File "/build/source/tools/build-bpf-skel.py", line 63, in gen_bpf_skeleton
   >     skel = subprocess.check_output(bpftool_args, universal_newlines=True)
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 424, in check_output
   >     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
   >   File "/nix/store/81lwy2hfqj4c1943b1x8a0qsivjhdhw9-python3-3.9.6/lib/python3.9/subprocess.py", line 528, in run
   >     raise CalledProcessError(retcode, process.args,
   > subprocess.CalledProcessError: Command '['/nix/store/l6dg8jlbh8qnqa58mshh3d8r6999dk0p-bpftools-5.13.11/bin/bpftool', 'g', 's', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 255.
   > [102/1457] Compiling C object src/journal/libjournal-core.a.p/journald-server.c.oapture output)put)ut)
   > ninja: build stopped: subcommand failed.

  f) We do now have support for TPM2 based disk encryption in our
     systemd build. The actual bits and pieces to make use of that are
     missing but there are various ongoing efforts in that direction.
     There is also the story about systemd in our initrd to enable this
     being used for root volumes. None of this will yet work out of the
     box but we can start improving on that front.

  g) FIDO2 support was added systemd and consequently we can now use
     that. Just with TPM2 there hasn't been any integration work with
     NixOS and instead this just adds that capability to work on that.

Co-Authored-By: Jörg Thalheim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed pid1

Development

Successfully merging this pull request may close these issues.

Unit fails with EXIT_EXEC / NAMESPACE after upgrade von v247 to v249 / v248

2 participants