Conversation
|
@mbiebl Could you test this? |
c302f21 to
cfe3155
Compare
d3d176f to
4221f27
Compare
4221f27 to
563860e
Compare
|
Updated. @sourcejedi's comment is addressed. |
|
Some of the tests fail because |
|
@evverx at least in Debian I've gone back to statically allocate those system users: |
|
That just means that everything will stop working as soon as it has been decided that they should be turned on again. My grumpy comments aside, @mbiebl are you saying that all the tests are passing now with this patch applied? |
|
With this PR applied, some/most of the test-suite failures are fixed. test log attached: |
|
It seems to me that even though the static user is used, |
|
@evverx The Debian systemd package does allocate static system users in postinst but does not patch out |
|
@mbiebl sorry, I didn't mean to say that something is wrong with the Debian package. I don't think it should be patched out explicitly. It's just that maybe the condition |
563860e to
12af23c
Compare
|
Unfortunately networkd/timesyncd/resolved still fails with the updated PR (specifically the networkd-test.py test) |
|
Ah, I found a mistake. I will update this. |
|
I'm not sure the condition should be dropped altogether because, if I understand correctly, it's likely to reintroduce #9835. I hope @sourcejedi will correct me if I'm wrong. |
12af23c to
b0038e0
Compare
|
Updated. But not tested on any container environments. In the revised version, instead of simply dropping |
|
My understanding of the last update to this PR, is it should let the current versions of I have written a test case to represent my understanding of @evverx 's idea in #10012 (comment) : #10025. This was a more general suggestion, that would work for services that have IIRC, old-style DHCP clients save leases in a persistent file, and Macs use persistent information to speed up DHCP, they even wrote this up as an RFC. https://cafbit.com/post/rapid_dhcp_or_how_do/ . I am an outsider to But if I were using LXC, I think I might prefer to have an explicit drop-in file that disabled |
|
That comment was based on the idea of what the word "portable" means :-) As soon as the units are modified to fit the environment they're supposed to work in, they stop being really portable (though, I don't think they have ever been). I'll just repeat what I said earlier:
Regarding LXC, another option would be to update the AppArmor policy in a way similar to https://git.launchpad.net/autopkgtest-cloud/tree/lxc-slave-admin/setup-adt-lxc.commands?id=b6f01a6fd1ce3478dd18f3962c5506e2eebe2eb6#n61 (which, by the way, the only reason the tests pass on Ubuntu CI). But in this case, it should be said explicitly that, say, |
|
Just to be clear, I'm not saying that the security features shouldn't be used at all. They're actually useful in certain contexts. Anyway, I'd wait for @poettering and @keszybz to weigh in because while slowly reading all the issues related to |
b0038e0 to
ecea78e
Compare
|
I've tested this by inserting |
|
I've dropped the error handling about changing mount propagation flag, as it seems just a workaround for LXC with AppArmor, and I hope the issue can be avoidable by configuring AppArmor for LXC correctly. So, now this PR only contains several cleanups about mount namespacing. |
src/core/execute.c
Outdated
| } | ||
|
|
||
| /* Use original errno */ | ||
| return -errno; |
There was a problem hiding this comment.
urks, i can't say i like this. we generally don't use errno to pass errors, and this way we make the fact that EANO means we called unshare() right before and it failed and set errno API of setup_namespace(). But to me that's too much internal behaviour of setup_namespace() becoming API... errno handling should be hidden in the function i think and not be relied upon from outsde...
can't we find a nicer way for this?
I mean, i see where you are coming from, but this at least needs an comment on the sending side (i.e. in setup_namespace where we return EANO) and on the receiveing side, about this special error passing
There was a problem hiding this comment.
I cannot find a nicer way... Of course, adding a pointer argument e.g., bool *failure_by_unshare or something can avoid using errno. But it seems ugly.
I will add comments about that on both sender and receiver.
aaa7aa3 to
0719ab6
Compare
|
@poettering Updated. Comments are added. |
I've been wondering. Why don't we pass |
…turns -ENOANO Without this, log shows meaningless error message 'No anode', e.g., === Failed to unshare the mount namespace: Operation not permitted foo.service: Failed to set up mount namespacing: No anode foo.service: Failed at step NAMESPACE spawning /usr/bin/test: No anode === Follow-up for 1beab8b.
Setting DynamicUser= may introduce some bind mounts from RuntimeDirectory= and request mount namespacing. However, such implied bind mounts are not critical if creating mount namespace is failed.
0719ab6 to
90c02d9
Compare
|
I noticed that
@sourcejedi Both |
Hi. I've been thinking about issue #9872. [*] I realized a second reason why Part of the problem I have with issue #9872, is we don't log the path where the error happened. All we tell the user is "Failed to set up mount namespacing: %m". But now we treat most namespace errors as fatal, we could upgrade all those individual log_debug() messages into proper log_error() messages... This would change apply_mount_namespace() / setup_namespace() from functions that don't log, into ones that do. We would need [*] #9872 is about weird cases where setup_namespace() will fail the service. My original title complained it is "slightly too fragile". I felt there were too many things that might make it fail. Such a failure can happen if root made some very weird FUSE mounts. Or also, I think, some pretty weird NFS mounts. Or we can fail because SELinux is fighting a systemd change. The SELinux rule might not log itself, if the rule has |
|
@brauner I'm using a default lxc/AppArmor setup on Debian sid. |
|
@brauner we = ? |
|
@mbiebl, we == Ubuntu. |
|
@brauner ok, thanks. Would you recommend that Debian, or rather the maintainers of https://ci.debian.net/, turn off AppArmor for LXC as well? |
|
@stgraber, care to chime in for a second? |
|
@mbiebl if the default AppArmor policy were used, then |
|
@evverx ci.debian.net currently runs the Debian stretch kernel, which does not have AA enabled by default, which is why the tests currently pass. This will change with buster kernel, where AA is enabled by default. Personally, I run Debian sid, so I do have a AA enabled kernel. |
|
@mbiebl in this case a lot of tests will start to fail as soon as the kernel has been switched. The easiest way would be of course to turn it off. Though, a custom policy like https://git.launchpad.net/autopkgtest-cloud/tree/lxc-slave-admin/setup-adt-lxc.commands?id=b6f01a6fd1ce3478dd18f3962c5506e2eebe2eb6#n56 would probably be a little bit better in the sense that it's more or less real environment. |
|
Regarding #10011, are we sure that relaxing the security policy to make the security features, that, strictly speaking, aren't even needed there, work again is the best way to deal with it? |
|
As far as I know, the LXD/LXC test runners for autopkgtest in Ubuntu are using lxc.apparmor.profile=unconfined or a custom profile that's extremely close to unconfined. |
|
Hmm, maybe we could merge the first two commits now? They are uncontroversial cleanups. |
|
@keszybz OK. I will create new PRs. @sourcejedi #9872 seems not related to |
|
@yuwata thanks! |
Since 1beab8b, failure inunshare()is treated differently from othermount()failures.When remounting
/as SLAVE failed, then no mount point specified in the unit is not mounted yet. So, this makes the remounting failure be treated as the same asunshare()failure.I hope this partially fixes #9700 and #10011.