seccomp: whitelist io-uring related system calls#39415
seccomp: whitelist io-uring related system calls#39415thaJeztah merged 1 commit intomoby:masterfrom omegacoleman:master
Conversation
|
Looks like CI is failing, because the json version was not updated as part of this PR; |
|
ping @justincormack @crosbymichael PTAL |
I made a mistake thinking the json file is manually written, and it should be generated. I have fixed that now but some of the ci seems to be failing because of removing some non-existing folder (C:\go), what does that mean? |
|
Thanks! The RS5 failure can be ignored (see #39403) |
The only failure left is TestAPISwarmServicesMultipleAgents in janky, which does not seem to be relevant. I read the code but it is not about seccomp. |
|
yes, looks like that one is sometimes flaky; #23626 I'll restart CI |
This time it's only a few SKIPs & XFAILs, why is it marked as failing again? |
|
Looks to be an issue with the docker-py tests; I'll kick that build |
There was a problem hiding this comment.
Can you sort these alphabetically like the rest please
|
Are there any other new syscalls since the last update? |
|
ping @omegacoleman could you answer @justincormack's question (and sort the list #39415 (comment))? |
The precise answer is "I don't know" since I'm just a network dev not kernel dev, but even though there are, the new system call has to be added to the upstream libseccomp's syscall translation (name -> number) before it could be added to the config here, and these calls were added. Sorry for the delay as it is well, trival & as the upstream has added the calls I was able to just override the config myself.. |
Codecov Report
@@ Coverage Diff @@
## master #39415 +/- ##
=========================================
Coverage ? 37.34%
=========================================
Files ? 609
Lines ? 45200
Branches ? 0
=========================================
Hits ? 16882
Misses ? 26030
Partials ? 2288 |
|
Please sign your commits following these rules: $ git clone -b "master" [email protected]:omegacoleman/moby.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354606984
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -fAmending updates the existing PR. You DO NOT need to open a new one. |
thaJeztah
left a comment
There was a problem hiding this comment.
Thanks for updating; could you try rebasing the PR to get rid of the merge commit?
Signed-off-by: youcai <[email protected]>
done |
|
windows failure is unrelated; let me get this one in |
|
This seems like it's a bad idea -- io_uring allows programs to execute certain syscalls without being limited by seccomp. Currently all of the syscalls you can access through io_uring are permitted in the default profile, but I'm a little worried that this wasn't brought up at the time. |
|
@cyphar it looks like I am reviewing the whole seccomp policy based on what has been useful in the past. |
|
Funnily enough, I found this PR after commenting on that LWN article. The problem with that restriction mechanism is that it doesn't really fulfil the seccomp usecase (as long as you can create a new |
From what I got from the news they're working on sth like using eBPF to filter the SQEs upon ktail update. Will try to make that configurable after kernel & libseccomp get that feature. |
|
Shall we revert this until we find the solution? |
|
I don't think it's necessary to revert. The default profile allows everything that you can do with |
This syncs the seccomp profile with changes made to containerd's default profile in [1]. The original containerd issue and PR mention: > Security experts generally believe io_uring to be unsafe. In fact > Google ChromeOS and Android have turned it off, plus all Google > production servers turn it off. Based on the blog published by Google > below it seems like a bunch of vulnerabilities related to io_uring can > be exploited to breakout of the container. > > [2] > > Other security reaserchers also hold this opinion: see [3] for a > blackhat presentation on io_uring exploits. For the record, these syscalls were added to the allowlist in [4]. [1]: containerd/containerd@a48ddf4 [2]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html [3]: https://i.blackhat.com/BH-US-23/Presentations/US-23-Lin-bad_io_uring.pdf [4]: moby#39415 Signed-off-by: Albin Kerouanton <[email protected]>
|
for future visitors; these were removed again in; |
This syncs the seccomp profile with changes made to containerd's default profile in [1]. The original containerd issue and PR mention: > Security experts generally believe io_uring to be unsafe. In fact > Google ChromeOS and Android have turned it off, plus all Google > production servers turn it off. Based on the blog published by Google > below it seems like a bunch of vulnerabilities related to io_uring can > be exploited to breakout of the container. > > [2] > > Other security reaserchers also hold this opinion: see [3] for a > blackhat presentation on io_uring exploits. For the record, these syscalls were added to the allowlist in [4]. [1]: containerd/containerd@a48ddf4 [2]: https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html [3]: https://i.blackhat.com/BH-US-23/Presentations/US-23-Lin-bad_io_uring.pdf [4]: moby/moby#39415 Signed-off-by: Albin Kerouanton <[email protected]>
closes #39396
- What I did
I added the io-uring related system call introduced in kernel 5.1 to the seccomp whitelist.
- How I did it
This is merely a follow-up to the upstream
- How to verify it
Build docker with the master branch libseccomp, and will be run programs using io-uring without trouble.
With older kernels or older versions of libseccomp, the configure will be omitted.
- Description for the changelog
seccomp: whitelist io-uring related system calls
- A picture of a cute animal (not mandatory but encouraged)
Would be a waste for such a simple change..