Skip to content

Conversation

@jkartseva
Copy link

@jkartseva jkartseva commented Nov 18, 2020

This PR addresses the feature request in #13496 (comment) by adding AllowBindPorts= unit property.
If the property specified a service is allowed to bind(2) only to a certain set of ports. This prevents from assigning socket address with disallowed port to a socket.
Stretch goal of this PR is to improve BPF ecosystem in systemd by making it more developer friendly and to clear path for more features implemented with BPF. Which is why BPF programs behind AllowBindPorts= feature are compiled from sources in restricted C in contrast with the present bpf instructions 'assembly' approach.
The content of PR:

  • BPF source code for AllowBind 87c2b5e
  • build-bpf.py build script generating a C header file with BPF object file represented as C array dce011d
  • custom meson build rule 5d90c14 invoking build-bpf.py
  • Lean set of helpers in shared and core libraries to support loading, attaching, populating BPF maps ed92be7 15ef74f
  • allow_bind feature interface for systemd unit: 'supported' and 'install'. 589bcdb
  • cgroup mask, fragment parser, dbus interface 56c4ae1 72215ec ca36410

The property requires compile time clang llvm (for llc) and libbpf shared lib. If any of the requirements are not satisfied, the property is not supported and the dependent code is not included into systemd build.

I'd appreciate feedback on

  • how the approach looks in general
  • possibility to make libbpf dependency static or to support git submodule approach though this is not standard way for systemd
  • CPU architecture and distribution coverage since compiler options may vary from platform to platform: which ones should be supported in the first place
  • other BPF powered features which is good to have in systemd
  • moving the present features (bpf firewall, devices) to the new rail: introducing bpf source code for them, tests for pairwise comparison, deprecation stategy
  • desired testing infra

Closes #13744

SD_BUS_PROPERTY("ManagedOOMSwap", "s", property_get_managed_oom_mode, offsetof(CGroupContext, moom_swap), 0),
SD_BUS_PROPERTY("ManagedOOMMemoryPressure", "s", property_get_managed_oom_mode, offsetof(CGroupContext, moom_mem_pressure), 0),
SD_BUS_PROPERTY("ManagedOOMMemoryPressureLimitPercent", "s", bus_property_get_percent, offsetof(CGroupContext, moom_mem_pressure_limit), 0),
SD_BUS_PROPERTY("AllowBindPorts", "as", NULL, offsetof(CGroupContext, allow_bind_ports), 0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general there's a preference for passing around structured data, rather than strings to parse - maybe this could be an int, with -1 to indicate "none"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluca IMO -1 semantics is confusing while passing "none" would be consistent with service unit config file.
Also there are IPAddress{Allow|Deny} properties with special values such as "any" and "localhost", I followed that pattern.

Copy link
Member

@bluca bluca Dec 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, with the DBUS interface we don't really aim for 1:1 mapping with unit config, but for well-structured data passing. And IPAddress is passed around as structured data (array of bytes + unsigned for mask), not as strings:

SD_BUS_PROPERTY("IPAddressAllow", "a(iayu)", property_get_ip_address_access, offsetof(CGroupContext, ip_address_allow), 0),
SD_BUS_PROPERTY("IPAddressDeny", "a(iayu)", property_get_ip_address_access, offsetof(CGroupContext, ip_address_deny), 0),

any/localhost/etc are encoded as special values - eg, 0.0.0.0/0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluca

Is integer (i.e.t dbus type specifier) forward compatible?
It keep the initial program simple and to reduce the complexity of this PR, AllowBindPorts= accepts a single integer or none.
Still I'd like to add range support as follow-up, e.g. AllowBindPorts=5555:7777 which will require a string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be two integers. You can immediately add a second one, but ignore it for now - that way when it's ready you can use it for ranges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, please use the correct type here, not strings. dbus is all about type-safety, and strings are about "everything goes", hence break tht design goal of dbus.

now, dbus isn't good at having nice ways to extend things in a compatible way. it usually pays of to think in advance how thing are going to be extended, and then use the appropriate types for that. and if that doesn't work out and one has to change type the usual approach is to introduce a new property that has the new type, and both the old and the new continue to be exposed, showing the same data as much as that's possible

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw what about ipv4 vs. ipv6? here? eventually people probably want to list ports per protocol...

maybe encode "none" as single entry of value "0" (which is after all not a valid port, i.e. if the only allowed port is an invalid one i think the meaning is clear)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poettering

Does <protocol>:<port> format, e.g. IPv6:5555 look good? This may be extended to <protocol>:<port_min>-<port_max>, e.g.IPv6:5555-7777 as port range support is added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use an array of structures of three integers: the address family to use (AF_INET or AF_INET6), and the port start nr, plus number of ports. the dbus type system is easily flexible enough to express that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poettering @bluca

Hey, dbus is switched to (address_family, nr_ports, port_min) three tuple.

@bluca
Copy link
Member

bluca commented Nov 18, 2020

possibility to make libbpf dependency static or to support git submodule approach though this is not standard way for systemd.

Please, don't. Either libbpf is ready for production and thus provides a stable interface, and then it can be depended upon like any other system library that's already in use, or it isn't and hence it should not be used in core projects like systemd at all.

In fact, looking at how/where it's used I'd say it's a good candidate for dynamic loading via dlopen at runtime - please see how other libraries like cryptsetup are handled, eg: https://github.com/systemd/systemd/blob/master/src/shared/cryptsetup-util.c

@jkartseva jkartseva changed the title RFC: introduce AllowBindPorts= property provided by compiled from sources RFC: introduce AllowBindPorts= property provided by bpf compiled from sources Nov 18, 2020
@jkartseva jkartseva force-pushed the bpf-build-rule branch 2 times, most recently from a7b9208 to 156cb94 Compare November 20, 2020 02:19
@jkartseva
Copy link
Author

jkartseva commented Nov 20, 2020

@bluca

Please, don't. Either libbpf is ready for production and thus provides a stable interface, and then it can be depended upon like any other system library that's already in use, or it isn't and hence it should not be used in core projects like systemd at all.

I think API/ABI compatibility is a lesser concern since it's fully in control of kernel community.
Relying on distribution packaging though means that there will be a time lag between top-notch features available and features which can be used in systemd. A user space developer has to keep in mind which BPF feature available on which kernel, dynamic linking adds another feature-to-distribution mapping on top of that.

That means we can't introduce features which are both mandatory and recent.

In fact, looking at how/where it's used I'd say it's a good candidate for dynamic loading via dlopen at runtime - please see how other libraries like cryptsetup are handled, eg: https://github.com/systemd/systemd/blob/master/src/shared/cryptsetup-util.c

Thanks for the pointer, taking a look!

@jkartseva
Copy link
Author

jkartseva commented Nov 21, 2020

Converting to draft to add feature gate in meson_options.txt, attach type probing and to document better.
Also investigating compilation failure in Arch Linux

@jkartseva jkartseva marked this pull request as draft November 21, 2020 03:24
@bluca
Copy link
Member

bluca commented Nov 23, 2020

@bluca

Please, don't. Either libbpf is ready for production and thus provides a stable interface, and then it can be depended upon like any other system library that's already in use, or it isn't and hence it should not be used in core projects like systemd at all.

I think API/ABI compatibility is a lesser concern since it's fully in control of kernel community.
Relying on distribution packaging though means that there will be a time lag between top-notch features available and features which can be used in systemd. A user space developer has to keep in mind which BPF feature available on which kernel, dynamic linking adds another feature-to-distribution mapping on top of that.

That means we can't introduce features which are both mandatory and recent.

Speaking as a downstream once-removed distro maintainer at $work, in my opinion this is not a problem at all - in fact, it's entirely expected. It's how every other existing dependency works - I'm not seeing anything that indicates this is special in any way to warrant a different, custom treatment for this one option. As long as every requirement is clearly indicated, it's fully expected that optional features do not work if the minimum build and runtime requirements are not met.
For example, to make use of dm-verity signatures (RootHashSignature, etc), you need at least kernel >= 5.4 with the relevant kconfig enabled (disabled by default), and a more recent version of libcryptsetup than the minimum requirement. That is fine, because it's clearly documented: https://github.com/systemd/systemd/blob/master/README and so are plenty of other kernel-related or libs-related features (user namespaces, network features, etc etc).

In fact, looking at how/where it's used I'd say it's a good candidate for dynamic loading via dlopen at runtime - please see how other libraries like cryptsetup are handled, eg: https://github.com/systemd/systemd/blob/master/src/shared/cryptsetup-util.c

Thanks for the pointer, taking a look!

@jkartseva jkartseva marked this pull request as ready for review November 28, 2020 01:29
@jkartseva jkartseva marked this pull request as draft November 28, 2020 02:14
@jkartseva
Copy link
Author

@bluca

Speaking as a downstream once-removed distro maintainer at $work, in my opinion this is not a problem at all - in fact, it's entirely expected. It's how every other existing dependency works - I'm not seeing anything that indicates this is special in any way to warrant a different, custom treatment for this one option. As long as every requirement is clearly indicated, it's fully expected that optional features do not work if the minimum build and runtime requirements are not met.

Listing trade-offs I can think about.

Static

(+) Ability to introduce new libbpf features is not limited by presence of recent versions in distributions.
(+) End users don't need to worry about installing and updating libbpf on every host where systemd runs, only on those where it builds.
(+) Features can be declared mandatory on 4.9+ kernels.
(-) Increases binary size and compile time.

Shared

Basically, swap pros and cons of static:
(+) No binary size and compile time overhead.
(+) Standard for systemd development.
(-) Package availability in distros is a requirement.
(-) End users have to worry about one more package
(-) Only optional features.

In the patch set I also included libbpf to libsystemd-shared bundle:

$ ldd ./src/shared/libsystemd-shared-247.so
	linux-vdso.so.1 (0x00007ffcb9902000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f6c5ec60000)
	libbpf.so.0 => /lib64/libbpf.so.0 (0x00007f6c5ec31000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f6c5ec2a000)
	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f6c5ebef000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f6c5eb8f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f6c5eb84000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f6c5ea3c000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f6c5ea35000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6c5ea13000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f6c5e849000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f6c5f00e000)
	libelf.so.1 => /lib64/libelf.so.1 (0x00007f6c5e82e000)
	libz.so.1 => /lib64/libz.so.1 (0x00007f6c5e814000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f6c5e7e5000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f6c5e74c000)

I think this straightens dynamic linking approach by simplifying libbpf delivery to end users. Wdyt?

@bluca
Copy link
Member

bluca commented Nov 29, 2020

@bluca

Speaking as a downstream once-removed distro maintainer at $work, in my opinion this is not a problem at all - in fact, it's entirely expected. It's how every other existing dependency works - I'm not seeing anything that indicates this is special in any way to warrant a different, custom treatment for this one option. As long as every requirement is clearly indicated, it's fully expected that optional features do not work if the minimum build and runtime requirements are not met.

Listing trade-offs I can think about.

Static

(+) Ability to introduce new libbpf features is not limited by presence of recent versions in distributions.
(+) End users don't need to worry about installing and updating libbpf on every host where systemd runs, only on those where it builds.
(+) Features can be declared mandatory on 4.9+ kernels.
(-) Increases binary size and compile time.

The problem is not so much binary size, although a few folks care about that for certain important and popular use cases, so it's certainly worth mentioning. In my view, the most important factor by some distance is security maintenance. Distributions rely on being able to update a library with a security bug without going around guessing whatever might have embedded copies all over the place. There are over fifty thousand binary packages in Debian, if every single one of them had embedded random libraries it would pretty much be game over.

Moreover, the maintainers here can correct me if I'm wrong, but my distinct impression is that the systemd distribution model is not random users installing random things - but you get and use what your distribution gives you. I do not see random end users building their own systemd. Good integration and synergy with the rest of the distro is paramaunt to a functional system. So in my opinion it's not really worth it to worry about what end users might or might not have to do. Developers testing things are qualified enough to look at the documentation, and personally I do not see a reason to believe this is particular one is any different from the other dozen or so optional dependencies.

For example, iproute2 just got support for libbpf, as a normal shared library like many others. I maintain it in Debian, and once it's released I'll enable it and ensure the proper requirements are set, and that will be it - no reason to do anything special, as libbpf is just another dependency like libmnl or libelf or libselinux etc etc

Shared

Basically, swap pros and cons of static:
(+) No binary size and compile time overhead.
(+) Standard for systemd development.
(-) Package availability in distros is a requirement.
(-) End users have to worry about one more package
(-) Only optional features.

In the patch set I also included libbpf to libsystemd-shared bundle:

$ ldd ./src/shared/libsystemd-shared-247.so
	linux-vdso.so.1 (0x00007ffcb9902000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f6c5ec60000)
	libbpf.so.0 => /lib64/libbpf.so.0 (0x00007f6c5ec31000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f6c5ec2a000)
	libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f6c5ebef000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f6c5eb8f000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f6c5eb84000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f6c5ea3c000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f6c5ea35000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6c5ea13000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f6c5e849000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f6c5f00e000)
	libelf.so.1 => /lib64/libelf.so.1 (0x00007f6c5e82e000)
	libz.so.1 => /lib64/libz.so.1 (0x00007f6c5e814000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f6c5e7e5000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f6c5e74c000)

I think this straightens dynamic linking approach by simplifying libbpf delivery to end users. Wdyt?

There's been a trend to move optional dependencies to dlopen(), to minimize hard requirements for container images and so on. It's the maintainers call how to handle this one of course, but in my view it's a good candidate to be handled that way too. Eg, see: https://github.com/systemd/systemd/blob/master/src/shared/cryptsetup-util.c

However, it can of course happen later as a follow-up if the maintainers want to handle it that way, I don't think we need to worry about it now. Current approach looks fine.

@jkartseva jkartseva force-pushed the bpf-build-rule branch 2 times, most recently from a2fe8cf to 49a734a Compare November 29, 2020 21:01
@jkartseva jkartseva marked this pull request as ready for review November 29, 2020 21:02
@jkartseva
Copy link
Author

CentOS CI logs indicate that libbpf package in ArchLinux was bumped up 0.3, whereas the latest release is 0.2

01:26:51 archlinux_systemd_ci: Run-time dependency libbpf found: YES 0.3.0

cc @mrc0mmand: Is 0.3 the latest libbpf but versioned differently?

That also adds a point about how libbpf dependent features will work with customly versioned libbpf packages residing in private and corporate repos.

@mrc0mmand
Copy link
Member

CentOS CI logs indicate that libbpf package in ArchLinux was bumped up 0.3, whereas the latest release is 0.2

01:26:51 archlinux_systemd_ci: Run-time dependency libbpf found: YES 0.3.0

cc @mrc0mmand: Is 0.3 the latest libbpf but versioned differently?

CentOS CI uses the master branch of the libpf repo, so it's usually a newer version than the latest stable release.

@jkartseva
Copy link
Author

@bluca

Looking at dlopen, major version of shared lib is always hardcoded in search path:

find src/shared -name '*.[ch]' | xargs grep dlopen\(
src/shared/cryptsetup-util.c:        dl = dlopen("libcryptsetup.so.12", RTLD_LAZY);
src/shared/idn-util.c:        dl = dlopen("libidn2.so.0", RTLD_LAZY);
src/shared/idn-util.c:        dl = dlopen("libidn.so.12", RTLD_LAZY);
src/shared/idn-util.c:                dl = dlopen("libidn.so.11", RTLD_LAZY);
src/shared/pwquality-util.c:        dl = dlopen("libpwquality.so.1", RTLD_LAZY);
src/shared/qrcode-util.c:        dl = dlopen("libqrencode.so.4", RTLD_LAZY);
src/shared/userdb.c:        dl = dlopen(ROOTLIBDIR "/libnss_systemd.so.2", RTLD_LAZY|RTLD_NODELETE);
src/shared/userdb.c:                log_debug("Failed to dlopen(libnss_systemd.so.2), ignoring: %s", dlerror());

Bumping major in a distro will cause systemd to out-out from codepath depending on a lib.
Have it ever been an issue for any of the listed libs? Or the preferred way for systemd is to bump the major in search path early so distributions should catch up?

@bluca
Copy link
Member

bluca commented Dec 3, 2020

@bluca

Looking at dlopen, major version of shared lib is always hardcoded in search path:

find src/shared -name '*.[ch]' | xargs grep dlopen\(
src/shared/cryptsetup-util.c:        dl = dlopen("libcryptsetup.so.12", RTLD_LAZY);
src/shared/idn-util.c:        dl = dlopen("libidn2.so.0", RTLD_LAZY);
src/shared/idn-util.c:        dl = dlopen("libidn.so.12", RTLD_LAZY);
src/shared/idn-util.c:                dl = dlopen("libidn.so.11", RTLD_LAZY);
src/shared/pwquality-util.c:        dl = dlopen("libpwquality.so.1", RTLD_LAZY);
src/shared/qrcode-util.c:        dl = dlopen("libqrencode.so.4", RTLD_LAZY);
src/shared/userdb.c:        dl = dlopen(ROOTLIBDIR "/libnss_systemd.so.2", RTLD_LAZY|RTLD_NODELETE);
src/shared/userdb.c:                log_debug("Failed to dlopen(libnss_systemd.so.2), ignoring: %s", dlerror());

Bumping major in a distro will cause systemd to out-out from codepath depending on a lib.
Have it ever been an issue for any of the listed libs? Or the preferred way for systemd is to bump the major in search path early so distributions should catch up?

Not so far, but this usage is very new. The issue you raised is a valid one, and getting the soname at build time rather than hard-coding, and other issues, are currently being discussed here: #17416

This would be a problem only in case of ABI-but-not-API break (if the soname changes because of an API break, there are code changes requires anyway), which is less common, but of course very much possible, and it very much does happen.

@jkartseva
Copy link
Author

jkartseva commented Dec 4, 2020

@bluca

Thanks for the pointer once again!

Is there a rule of thumb to separate weak dependencies from critical? The present BPF-powered features in systemd are responsible for security (firewall, device control), same as AllowBind introduced with this PR. IMO graceful degradation shouldn't be the case for security.

Present features are implemented with 'assembly' bpf_insn and depend only on kernel. Favoring dlopen approach means that both firewall and devices can not be migrated to human friendly C code since this is security relaxation: not having libbpf.so on target host would be enough to opt out.

I would love to replace BPF firewall with C code and add ports support, but what's the point if it can be bypassed.

@bluca
Copy link
Member

bluca commented Dec 4, 2020

@bluca

Thanks for the pointer once again!

Is there a rule of thumb to separate weak dependencies from critical? The present BPF-powered features in systemd are responsible for security (firewall, device control), same as AllowBind introduced with this PR. IMO graceful degradation shouldn't be the case for security.

Present features are implemented with 'assembly' bpf_insn and depend only on kernel. Favoring dlopen approach means that both firewall and devices can not be migrated to human friendly C code since this is security relaxation: not having libbpf.so on target host would be enough to opt out.

I would love to replace BPF firewall with C code and add ports support, but what's the point if it can be bypassed.

Ultimately which way to go is a question for the core maintainers. Your PR is fine at the moment and it can be changed later with regards to this aspect.

My take is that folks who want to build minimal images, really mean minimal, no matter what the cost - and to them, all of these features are optional. As long as everything is nicely documented, distro builders are able to make their choices. I expect the vast majority of distros to ship with all of these, exceptions being things like Alpine and minimal builds of Yocto/Buildroot.

And as long as the failure mode is appropriate, I do not see a security issue. ie: if the options are set, but cannot be satisfied, there should be an error. For example, if you don't have libcrypsetup available and your unit sets some verity options, it will fail to start when dlopen fails.

@anitazha anitazha added the bpf label Dec 7, 2020
Copy link
Member

@anitazha anitazha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial surface level comments. Also I think BPFProgramV2 could be more descriptive; maybe LibBPFProgram to distinguish from the raw BPFProgram?

Julia Kartseva added 7 commits April 26, 2021 16:26
core: serialize socket_bind bpf links
Verify that service exited correctly if valid ports are passed to
SocketBind{Allow|Deny}=
Use `ncat` program starting a listening service binding to a specified
port, e.g.
"timeout --preserve-status -sSIGTERM 1s /bin/nc -l -p ${port} -vv"
@keszybz keszybz removed reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks needs-rebase labels Apr 27, 2021
@keszybz
Copy link
Member

keszybz commented Apr 27, 2021

Looks nice.

@keszybz keszybz merged commit 862e01d into systemd:main Apr 27, 2021
@jkartseva
Copy link
Author

cc @iaguis @mauriciovasquezbernal
#18145 and #18385 should be unblocked.

@jkartseva jkartseva deleted the bpf-build-rule branch April 28, 2021 05:23
poettering added a commit to poettering/systemd that referenced this pull request May 10, 2021
In most of our codebase when we referenced "ipv4" and "ipv6" on the
right-hand-side of an assignment, we lowercases it (on the
left-hand-side we used CamelCase, and thus "IPv4" and "IPv6"). In
particular all across the networkd codebase the various "per-protocol
booleans" use the lower-case spelling. Hence, let's use lower-case for
SocketBindAllow=/SocketBindDeny= too, just make sure things feel like
they belong together better.

(This work is not included in any released version, hence let's fix this
now, before any fixes in this area would be API breakage)

Follow-up for systemd#17655
Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for being late to the party. lgtm, but i found some nits.

assert(map_fd >= 0);

LIST_FOREACH(socket_bind_items, item, head) {
const uint32_t key = i++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this is unused?

}

void cgroup_context_remove_socket_bind(CGroupSocketBindItem **head) {
CGroupSocketBindItem *h;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to innermost scope?

return log_unit_error_errno(u, SYNTHETIC_ERRNO(EINVAL),
"Maximum number of socket bind rules=%u is exceeded", SOCKET_BIND_MAX_RULES);

obj = socket_bind_bpf__open();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i couldn#t find where this is actually defined? (i figure it's generated, but where is that? i am not smart enough...)

anyway, this doesn#t set any error?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's auto-generated skeleton, cc @wat-ze-hex for where exactly that logic lives. But yes, on error it returns NULL and doesn't set errno. We have plans to set errno for such cases soon, as part of libbpf 1.0 preparatory work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poettering @anakryiko

auto-generated *.skel.h is located in build:

find build/ -name "*.skel.h"
build/src/core/bpf/socket_bind/socket-bind.skel.h

The name of output file is set in meson.build:
https://github.com/systemd/systemd/blob/main/src/core/bpf/socket_bind/meson.build#L7

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry for resurrecting this thread but having fuzzed __bpf_object__open I'm not sure at this point I like the idea of loading binary blobs via libbpf anywhere in systemd: libbpf/libbpf#390. @anakryiko @wat-ze-hex it would be great if the library could be fuzzed continuously. To that end I opened google/oss-fuzz#6608. I'd appreciate it if you could take a look. Thanks!

if (socket_bind_bpf__load(obj) != 0)
return log_unit_error_errno(u, errno, "Failed to load BPF object");

allow_map_fd = bpf_map__fd(obj->maps.sd_bind_allow);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can't fail? this returns a pre-allocated fd and never allocates one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. looking at bpf codebase this can return EINVAL?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will never fail because obj->maps.sd_bind_allow is never NULL. And after successful socket_bind_bpf__load() return is always a valid FD (>=0).

bpf_map__name(obj->maps.sd_bind_allow));

deny_map_fd = bpf_map__fd(obj->maps.sd_bind_deny);
assert(deny_map_fd >= 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

r = update_rules_map(allow_map_fd, allow);
if (r < 0)
return log_unit_error_errno(
u, r, "Failed to put socket bind allow rules into BPF map '%s'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suffix error message with : %m please, so that the error is shown

if (r < 0)
return log_unit_error_errno(
u, r, "Failed to put socket bind deny rules into BPF map '%s'",
bpf_map__name(obj->maps.sd_bind_deny));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

"Failed to resize BPF map '%s': %m", bpf_map__name(obj->maps.sd_bind_deny));

if (socket_bind_bpf__load(obj) != 0)
return log_unit_error_errno(u, errno, "Failed to load BPF object");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so socket_bind_bpf__load() ets errno, but socket_bind_bpf__open() does not?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

socket_bind_bpf__load() doesn't set errno explicitly, so it shouldn't be checked. socket_bind_bpf__load() returns error directly.

if (!c)
return false;

return c->socket_bind_allow != NULL || c->socket_bind_deny != NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per coding style for ptrs we usually don't do explicit comparison with NULL. but doesn#t matter

char **candidates = STRV_MAKE("ncat", "nc", "netcat"), **c;
int r = 0;

STRV_FOREACH(c, candidates) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's FOREACH_STRING() btw

@poettering
Copy link
Member

hmm, i'll post a follow-up PR addressing a bunch of the items above, but see above, some things regarding errno are not clear to me...

poettering added a commit to poettering/systemd that referenced this pull request May 10, 2021
In most of our codebase when we referenced "ipv4" and "ipv6" on the
right-hand-side of an assignment, we lowercases it (on the
left-hand-side we used CamelCase, and thus "IPv4" and "IPv6"). In
particular all across the networkd codebase the various "per-protocol
booleans" use the lower-case spelling. Hence, let's use lower-case for
SocketBindAllow=/SocketBindDeny= too, just make sure things feel like
they belong together better.

(This work is not included in any released version, hence let's fix this
now, before any fixes in this area would be API breakage)

Follow-up for systemd#17655
poettering added a commit to poettering/systemd that referenced this pull request May 11, 2021
In most of our codebase when we referenced "ipv4" and "ipv6" on the
right-hand-side of an assignment, we lowercases it (on the
left-hand-side we used CamelCase, and thus "IPv4" and "IPv6"). In
particular all across the networkd codebase the various "per-protocol
booleans" use the lower-case spelling. Hence, let's use lower-case for
SocketBindAllow=/SocketBindDeny= too, just make sure things feel like
they belong together better.

(This work is not included in any released version, hence let's fix this
now, before any fixes in this area would be API breakage)

Follow-up for systemd#17655
@evverx
Copy link
Contributor

evverx commented Oct 20, 2021

CentOS CI uses the master branch of the libpf repo, so it's usually a newer version than the latest stable release.

@mrc0mmand could it be that the "dlopen" trick prevents the CI (where as far as I know ldd is used to figure out what should be installed on testbeds) from installing libbpf and the other dependencies loaded with dlopen?

@mrc0mmand
Copy link
Member

CentOS CI uses the master branch of the libpf repo, so it's usually a newer version than the latest stable release.

@mrc0mmand could it be that the "dlopen" trick prevents the CI (where as far as I know ldd is used to figure out what should be installed on testbeds) from installing libbpf and the other dependencies loaded with dlopen?

It does, but we have a "workaround" for this in test-functions:

systemd/test/test-functions

Lines 1131 to 1134 in 231c764

# A number of dependencies is now optional via dlopen, so the install
# script will not pick them up, since it looks at linkage.
for lib in libcryptsetup libidn libidn2 pwquality libqrencode tss2-esys tss2-rc tss2-mu libfido2 libbpf; do
ddebug "Searching for $lib via pkg-config"

@evverx
Copy link
Contributor

evverx commented Oct 20, 2021

@mrc0mmand thanks! Now I'm puzzled as to why the CI can't catch libbpf/libbpf#391

$ sudo ./build/test-bpf-lsm
Found cgroup2 on /sys/fs/cgroup/unified, unified hierarchy for systemd controller
libbpf: elf: skipping unrecognized data section(7) .eh_frame
...
libbpf: elf: skipping unrecognized data section(8) .eh_frame
libbpf: elf: skipping relo section(14) .rel.eh_frame for section(8) .eh_frame
btf.c:2754:23: runtime error: member access within misaligned address 0x0000009ea865 for type 'const struct btf_ext_header', which requires 4 byte alignment
0x0000009ea865: note: pointer points here
 6f 63 6b 00 9f eb 01  00 20 00 00 00 00 00 00  00 24 00 00 00 24 00 00  00 d4 05 00 00 f8 05 00  00
             ^
    #0 0x7fe0a6a55fae in btf_ext_parse_hdr /home/vagrant/libbpf/src/btf.c:2754
    #1 0x7fe0a6a5564d in btf_ext__new /home/vagrant/libbpf/src/btf.c:2798
    #2 0x7fe0a6af5301 in bpf_object__init_btf /home/vagrant/libbpf/src/libbpf.c:2636
    #3 0x7fe0a6ae6d53 in bpf_object__elf_collect /home/vagrant/libbpf/src/libbpf.c:3146
    #4 0x7fe0a6a98749 in __bpf_object__open /home/vagrant/libbpf/src/libbpf.c:6590
    #5 0x7fe0a6a989b1 in bpf_object__open_mem /home/vagrant/libbpf/src/libbpf.c:6655
    #6 0x7fe0a6adb86e in bpf_object__open_skeleton /home/vagrant/libbpf/src/libbpf.c:11052
    #7 0x81b87b in socket_bind_bpf__open_opts src/core/bpf/socket_bind/socket-bind.skel.h:54
    #8 0x81b937 in socket_bind_bpf__open src/core/bpf/socket_bind/socket-bind.skel.h:68
    #9 0x81dd9b in prepare_socket_bind_bpf ../src/core/bpf-socket-bind.c:79
    #10 0x8206a4 in bpf_socket_bind_supported ../src/core/bpf-socket-bind.c:138
    #11 0x67db63 in cg_bpf_mask_supported ../src/core/cgroup.c:3238
    #12 0x681142 in manager_setup_cgroup ../src/core/cgroup.c:3395
    #13 0x47c985 in manager_new ../src/core/manager.c:916
    #14 0x41f848 in main ../src/test/test-bpf-lsm.c:96
    #15 0x7fe0aaf9e1e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)
    #16 0x41d76d in _start (/home/vagrant/systemd/build/test-bpf-lsm+0x41d76d)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior btf.c:2754:23 in

@mrc0mmand
Copy link
Member

Hm, that's interesting, the output on my side looks a bit different, I wonder what's going on:

# /systemd-meson-build/test-bpf-lsm |& grep libbpf
libbpf: prog 'sd_bind4': failed to attach to cgroup: Bad file descriptor
libbpf: prog 'sd_restrictif_i': failed to attach to cgroup: Bad file descriptor
# grep libbpf strace.out 
openat(AT_FDCWD, "/systemd-meson-build/src/shared/libbpf.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/libbpf.so.0", O_RDONLY|O_CLOEXEC) = 3
write(2, "libbpf: prog 'sd_bind4': failed "..., 73libbpf: prog 'sd_bind4': failed to attach to cgroup: Bad file descriptor
write(2, "libbpf: prog 'sd_restrictif_i': "..., 80libbpf: prog 'sd_restrictif_i': failed to attach to cgroup: Bad file descriptor

@mrc0mmand
Copy link
Member

Ah, I see, it's because the libbpf is not compiled with ASan/UBSan in CIs. After rebuilding libbpf with CFLAGS=-fsanitize=address,undefined, the issue pops up:

# /systemd-meson-build/test-bpf-lsm  
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
btf.c:2754:21: runtime error: member access within misaligned address 0x55b6227be37f for type 'const struct btf_ext_header', which requires 4 byte alignment
0x55b6227be37f: note: pointer points here
 6e 73 65 00 9f  eb 01 00 20 00 00 00 00  00 00 00 14 00 00 00 14  00 00 00 ac 01 00 00 c0  01 00 00
             ^ 
    #0 0x7f93c54a2af2 in btf_ext_parse_hdr (/usr/lib/libbpf.so.0+0x1c2af2)
    #1 0x7f93c54a3059 in btf_ext__new (/usr/lib/libbpf.so.0+0x1c3059)
    #2 0x7f93c54d44f6 in bpf_object__init_btf (/usr/lib/libbpf.so.0+0x1f44f6)
    #3 0x7f93c54d9748 in bpf_object__elf_collect (/usr/lib/libbpf.so.0+0x1f9748)
    #4 0x7f93c550957d in __bpf_object__open (/usr/lib/libbpf.so.0+0x22957d)
    #5 0x7f93c5509c72 in bpf_object__open_mem (/usr/lib/libbpf.so.0+0x229c72)
    #6 0x7f93c5537d95 in bpf_object__open_skeleton (/usr/lib/libbpf.so.0+0x257d95)
    #7 0x55b62242256f in restrict_fs_bpf__open_opts /systemd-meson-build/src/core/bpf/restrict_fs/restrict-fs-skel.h:51:15
    #8 0x55b622422418 in restrict_fs_bpf__open /systemd-meson-build/src/core/bpf/restrict_fs/restrict-fs-skel.h:65:9
    #9 0x55b62241d686 in prepare_restrict_fs_bpf /systemd-meson-build/../build/src/core/bpf-lsm.c:61:15
    #10 0x55b62241ccd0 in lsm_bpf_supported /systemd-meson-build/../build/src/core/bpf-lsm.c:170:13
    #11 0x55b62241a8cd in main /systemd-meson-build/../build/src/test/test-bpf-lsm.c:84:13
    #12 0x7f93ca4b7b24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #13 0x55b62241a5cd in _start (/systemd-meson-build/test-bpf-lsm+0x34e5cd)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior btf.c:2754:21 in 

@evverx
Copy link
Contributor

evverx commented Oct 20, 2021

It does, but we have a "workaround" for this in test-functions:

Thinking about it a bit more, I'm not sure workarounds of this type have ever been applied to various tools building images using ldd (like dracut where the testsuite originally came from as far as I can remember), which probably means that sometimes all those features along with their dependencies just (unexpectedly) keep sitting on hosts where those images are built. Hopefully it doesn't happen very often :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.