Skip to content

Conversation

@Siosm
Copy link

@Siosm Siosm commented Jul 18, 2017

This is a follow up to #6241 using the newly added RuntimeDirectoryPreserve option to start resolved and networkd unprivileged using ambient capabilities.

systemd-networkd may be started before systemd-tmpfilest thus it already
implements the logic required to create those directories.
Start under an unprivileged user, remove uid/gid dropping logic and
initial capability drop. Refuse to run as root.

Use RuntimeDirectory, RuntimeDirectoryMode and RuntimeDirectoryPreserve
to create runtime directory.
Start under an unprivileged user, remove uid/gid dropping logic and
initial capability drop. Refuse to run as root.

Use RuntimeDirectory, RuntimeDirectoryMode and RuntimeDirectoryPreserve
to create runtime directories.
@poettering
Copy link
Member

I am not sure how I feel about requiring ambient caps... We used to say that we focus on kernels from the last two years for systemd, but today actually support substantially more... 4.3 however is 22 months old now, which isn't even 2 years...

I wonder if we can't find another way for this, one that permits us to maintain compat on this one.

Here's one such idea: currently, if ExecStart='s first character is "+" we will execute the process with full privileges. Maybe we should introduce a similar, new magic char, maybe "?": if set, then the service is run with full privileges if ambient caps are not available, and otherwise with reduced caps. Then, inside of the daemon code we'd check if uid==0, and if so, drop privs as we used to, otherwise make that a NOP. This way we get great behaviour on older kernels as well as on new kernels.

I think implementing such a scheme would be relatively straight-forward... Does that make sense?

d /run/systemd/netif 0755 systemd-network systemd-network -
d /run/systemd/netif/links 0755 systemd-network systemd-network -
d /run/systemd/netif/leases 0755 systemd-network systemd-network -
)m4_dnl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this should be left in, so that regular userspace clients can install inotify watches on these dirs unconditionally whether networkd actually got started or didn't. After all, inotify is the way how event notifications work on the networkd client side...

Yes, strictly speaking these lines now become redundant, but only in the usual case, not in all cases

r = drop_privileges(uid, gid,
(UINT64_C(1) << CAP_NET_RAW)| /* needed for SO_BINDTODEVICE */
(UINT64_C(1) << CAP_NET_BIND_SERVICE)| /* needed to bind on port 53 */
(UINT64_C(1) << CAP_SETPCAP) /* needed in order to drop the caps later */);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer if that the comments about these three caps aren't lost, they are pretty interesting to read

@poettering
Copy link
Member

Having such fallback code for the ambient stuff in place would also be extremely useful for debugging, as one could then still invoke resolved from the shell prompt directly and get things set up the right way...

@poettering poettering added network resolve reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Jul 18, 2017
@Siosm
Copy link
Author

Siosm commented Jul 18, 2017

Unfortunately, I also remove some capabilities from the bounding set in the unit when using ambient capabilities thus in order to keep a single unit for both cases I would have to drop capabilities in both cases.

But I understand the concern. This only impacts systems running the networkd & resolved daemons and those are generally fairly recent distributions.

@poettering
Copy link
Member

Unfortunately, I also remove some capabilities from the bounding set in the unit when using ambient capabilities thus in order to keep a single unit for both cases I would have to drop capabilities in both cases.

Much like "+" the new "?" thingy would leave all bounding caps in place too. Strictly speaking this means that on old kernels we'd thus run with a wider set of caps than we do now, but that's acceptable I think.

@poettering
Copy link
Member

But I understand the concern. This only impacts systems running the networkd & resolved daemons and those are generally fairly recent distributions.

Well, people especially in the embedded world tend to use really old kernels with a much more modern userspace, and they complain already that we don't support really old stuff... cutting this off to 22 months now is just too much...

@martinpitt
Copy link
Contributor

After yesterday's networkd-test.py fix, this now seems to genuinely fail due to a regression here:

======================================================================
ERROR: test_coldplug_dhcp_ip4_only (__main__.DnsmasqClientTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/adttmp/autopkgtest-virt-lxc.shared.vqvr1x5a/downtmp/build.Dkq/debian/test/networkd-test.py", line 390, in test_coldplug_dhcp_ip4_only
    online_timeout=15)
  File "/data/adttmp/autopkgtest-virt-lxc.shared.vqvr1x5a/downtmp/build.Dkq/debian/test/networkd-test.py", line 366, in do_test
    with open(RESOLV_CONF) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/run/systemd/resolve/resolv.conf'

@martinpitt martinpitt added the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Jul 18, 2017
@Siosm
Copy link
Author

Siosm commented Jul 18, 2017

I made a mistake in the units. I'll try to fix the other issue before pushing again. Thanks

RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
SystemCallArchitectures=native
ReadWritePaths=/run/systemd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required? RuntimeDirectory= implies ReadWritePaths=.

SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
SystemCallArchitectures=native
ReadWritePaths=/run/systemd
RuntimeDirectory=/run/systemd/resolve
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be systemd/resolve. Full path is not allowed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

UMask=0022
WatchdogSec=3min
CapabilityBoundingSet=CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_NET_RAW CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_SETPCAP CAP_NET_RAW CAP_NET_BIND_SERVICE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required if AmbientCapabilities= is set?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you still need to drop the capabilities from the bounding set in all cases.

log_error("This program must be run under an unprivileged user "
"with the following capabilities: CAP_NET_ADMIN "
"CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW");
r = -EINVAL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, still we can drop privileges by drop_privileges(). I think it is not necessary to quit the program.

log_error("This program must be run under an unprivileged user"
" with the following capabilities: CAP_NET_RAW"
" CAP_NET_BIND_SERVICE CAP_SETPCAP");
r = -EINVAL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same as networkd.c.

UMask=0022
WatchdogSec=3min
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW CAP_SETUID CAP_SETGID CAP_SETPCAP CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required?

SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
SystemCallArchitectures=native
ReadWritePaths=/run/systemd
RuntimeDirectory=/run/systemd/netif /run/systemd/netif/links /run/systemd/netif/leases /run/systemd/netif/lldp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full path is not allowed. Why don't you create links, leases, and lldp subdirectories by the daemon?

SystemCallArchitectures=native
ReadWritePaths=/run/systemd
RuntimeDirectory=/run/systemd/netif /run/systemd/netif/links /run/systemd/netif/leases /run/systemd/netif/lldp
RuntimeDirectoryMode=0755
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but redundant.

RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET
SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
SystemCallArchitectures=native
ReadWritePaths=/run/systemd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required?

SystemCallArchitectures=native
ReadWritePaths=/run/systemd
RuntimeDirectory=/run/systemd/resolve
RuntimeDirectoryMode=0755
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the default value. It's OK, but redundant.

@yuwata
Copy link
Member

yuwata commented Jul 18, 2017

In PR #6328, once I've pushed similar commits, but they cause networkd-test.py failure, so I tentatively drop them. Now I pushed them in https://github.com/yuwata/systemd/tree/drop-priv . If you are interested to them, please take a look. Thank you.

@Siosm
Copy link
Author

Siosm commented Aug 8, 2017

Closing this one in favor of #6564.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR network resolve reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks

Development

Successfully merging this pull request may close these issues.

5 participants