Skip to content

pkg/seccomp: simplify and speed up isEnabled#5256

Merged
estesp merged 2 commits intocontainerd:masterfrom
kolyshkin:seccomp-enabled
Apr 19, 2021
Merged

pkg/seccomp: simplify and speed up isEnabled#5256
estesp merged 2 commits intocontainerd:masterfrom
kolyshkin:seccomp-enabled

Conversation

@kolyshkin
Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin commented Mar 24, 2021

1. pkg/seccomp: simplify and speed up isEnabled

Current implementation of seccomp.isEnabled (rooted in runc) is not
too good.

First, it parses the whole /proc/self/status, adding each key: value
pair into the map (lots of allocations and future work for garbage
collector), when using a single key from that map.

Second, the presence of "Seccomp" key in /proc/self/status merely means
that kernel option CONFIG_SECCOMP is set, but there is a need to also
check for CONFIG_SECCOMP_FILTER (the code for which exists but never
executed in case /proc/self/status has Seccomp key).

Replace all this with with one liner. Here is the entire code (copy-pasted here
mostly for the explanation it provides):

// isEnabled returns whether the kernel has been configured to support seccomp
// (including the check for CONFIG_SECCOMP_FILTER kernel option).
func isEnabled() bool {
        // Excerpts from prctl(2), section ERRORS:
        //
        // EACCES
        //      option is PR_SET_SECCOMP and arg2 is SECCOMP_MODE_FILTER, but
        //      the process does not have the CAP_SYS_ADMIN capability or has
        //      not set the no_new_privs attribute <...>.
        // <...>
        // EFAULT
        //      option is PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER, the
        //      system was built with CONFIG_SECCOMP_FILTER, and arg3 is an
        //      invalid address.
        // <...>
        // EINVAL
        //      option is PR_SET_SECCOMP or PR_GET_SECCOMP, and the kernel
        //      was not configured with CONFIG_SECCOMP.
        //
        // EINVAL
        //      option is PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER,
        //      and the kernel was not configured with CONFIG_SECCOMP_FILTER.
        // <end of quote>
        //
        // Meaning, in case these kernel options are set (this is what we check
        // for here), we will get some other error (most probably EACCES or
        // EFAULT). IOW, EINVAL means "seccomp not supported", any other error
        // means it is supported.

        return unix.Prctl(unix.PR_SET_SECCOMP, unix.SECCOMP_MODE_FILTER, 0, 0, 0) != unix.EINVAL
}

While at it, improve the IsEnabled documentation.

NOTE historically, parsing /proc/self/status was added after a concern
was raised in opencontainers/runc#471 that
prctl(PR_GET_SECCOMP, ...) can result in the calling process being
killed with SIGKILL. This is a valid concern, so the new code here
does not use PR_GET_SECCOMP at all.

2. pkg/seccomp: use sync.Once to further speed up IsEnabled

It does not make sense to check if seccomp is supported by the kernel
more than once per runtime, so let's use sync.Once to speed it up.

A quick benchmark (old implementation, before this commit, after):

BenchmarkIsEnabledOld-4           37183            27971 ns/op
BenchmarkIsEnabled-4            1252161              947 ns/op
BenchmarkIsEnabledOnce-4      666274008             2.14 ns/op

@k8s-ci-robot
Copy link
Copy Markdown

Hi @kolyshkin. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dmcgowan
Copy link
Copy Markdown
Member

/ok-to-test

Comment thread pkg/seccomp/seccomp_linux.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we potentially need to retry on EINTR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(at least) according to documentation, prctl never returns EINTR.

@kolyshkin kolyshkin changed the title pkg/seccomp: simplify isEnabled pkg/seccomp: simplify and speed up isEnabled Mar 24, 2021
@kolyshkin
Copy link
Copy Markdown
Contributor Author

Added a second commit to use sync.Once

@kzys
Copy link
Copy Markdown
Member

kzys commented Mar 25, 2021

#5267 may fix the test failure.

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I note that you already changed https://github.com/containers/common/blob/master/pkg/seccomp/supported.go and runc.. and hound indicates those two will cover all of the k8s stack once these changes are vendored in.

Current implementation of seccomp.IsEnabled (rooted in runc) is not
too good.

First, it parses the whole /proc/self/status, adding each key: value
pair into the map (lots of allocations and future work for garbage
collector), when using a single key from that map.

Second, the presence of "Seccomp" key in /proc/self/status merely means
that kernel option CONFIG_SECCOMP is set, but there is a need to _also_
check for CONFIG_SECCOMP_FILTER (the code for which exists but never
executed in case /proc/self/status has Seccomp key).

Replace all this with a single call to prctl; see the long comment in
the code for details.

While at it, improve the IsEnabled documentation.

NOTE historically, parsing /proc/self/status was added after a concern
was raised in opencontainers/runc#471 that
prctl(PR_GET_SECCOMP, ...) can result in the calling process being
killed with SIGKILL. This is a valid concern, so the new code here
does not use PR_GET_SECCOMP at all.

Signed-off-by: Kir Kolyshkin <[email protected]>
It does not make sense to check if seccomp is supported by the kernel
more than once per runtime, so let's use sync.Once to speed it up.

A quick benchmark (old implementation, before this commit, after):

BenchmarkIsEnabledOld-4           37183            27971 ns/op
BenchmarkIsEnabled-4            1252161              947 ns/op
BenchmarkIsEnabledOnce-4      666274008             2.14 ns/op

Signed-off-by: Kir Kolyshkin <[email protected]>
@kolyshkin
Copy link
Copy Markdown
Contributor Author

Rebased to current master in an attempt to fix CI failure

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Apr 16, 2021

Build succeeded.

Copy link
Copy Markdown
Member

@samuelkarp samuelkarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants