seccomp: Block AF_ALG sockets in default profile (CVE-2026-31431) #52494

Merged
vvoland merged 4 commits into moby:master from vvoland:work-fail
May 1, 2026

@vvoland vvoland commented Apr 30, 2026

CVE-2026-31431 ("Copy Fail") is a logic flaw in the kernel's algif_aead module that allows any unprivileged user with access to AF_ALG sockets to perform a controlled 4-byte page-cache write, leading to reliable local privilege escalation. The exploit is a 732-byte Python script that works on every Linux distribution shipped since 2017.

Inside a container, this allows escalation to root within the container by corrupting setuid binaries in the page cache. Since the page cache is shared across the host, corruption of shared image-layer files is also visible to other containers using the same layers on the same node.

Seccomp profile changes

The previous default seccomp profile allowed AF_ALG sockets (only AF_VSOCK was denied). This update denies both AF_ALG (38) and AF_VSOCK (40) by replacing the single socket rule with three allow rules that together match every address family except those two:

  • arg0 < 38 (AF_ALG) → allow
  • arg0 == 39 (the single family between them) → allow
  • arg0 > 40 (AF_VSOCK) → allow
  • everything else (38 and 40) → falls through to default ERRNO
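
The three allow conditions can be sketched as a small Go predicate. This is only an illustrative model of the rule logic: the `cmpOp` and `argRule` types and the `socketAllowed` helper are hypothetical names, not the types used in moby/profiles, and the family numbers are the ones quoted above.

```go
package main

import "fmt"

// Address family numbers as quoted in the PR description.
const (
	afALG   = 38 // AF_ALG
	afVSOCK = 40 // AF_VSOCK
)

// cmpOp and argRule are hypothetical stand-ins for a single seccomp
// argument comparison; they are not the moby/profiles types.
type cmpOp int

const (
	opLessThan cmpOp = iota
	opEqualTo
	opGreaterThan
)

type argRule struct {
	op    cmpOp
	value uint64
}

// matches reports whether arg0 satisfies the rule's single comparison.
func (r argRule) matches(arg0 uint64) bool {
	switch r.op {
	case opLessThan:
		return arg0 < r.value
	case opEqualTo:
		return arg0 == r.value
	case opGreaterThan:
		return arg0 > r.value
	}
	return false
}

// socketAllowRules models the three allow conditions from the list above.
var socketAllowRules = []argRule{
	{opLessThan, afALG},      // arg0 < 38
	{opEqualTo, afALG + 1},   // arg0 == 39
	{opGreaterThan, afVSOCK}, // arg0 > 40
}

// socketAllowed returns true if any allow rule matches. Otherwise the call
// falls through to the profile's default ERRNO action.
func socketAllowed(family uint64) bool {
	for _, r := range socketAllowRules {
		if r.matches(family) {
			return true
		}
	}
	return false
}

func main() {
	for _, family := range []uint64{2, 10, 38, 39, 40, 41} {
		fmt.Printf("socket(family=%d): allowed=%v\n", family, socketAllowed(family))
	}
}
```

Each rule carries exactly one comparison on arg0, so no rule needs two conditions on the same argument.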

The previous socket rule used a single arg0 != AF_VSOCK condition. Naively adding a second OpNotEqual for AF_ALG does not work: seccomp evaluates multiple argument conditions within a single rule as a logical AND, so arg0 != 38 AND arg0 != 40 requires two comparisons against the same argument index, which libseccomp does not support reliably in one rule. Splitting into separate deny-action rules also fails because any matching allow rule takes precedence in seccomp's first-match-wins evaluation.
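
One way the naive version can go wrong is sketched below, under the assumption that the two `!=` conditions end up as two separate allow rules, which are then combined with OR rather than AND. The function names are hypothetical and the model ignores everything except arg0.

```go
package main

import "fmt"

const (
	afALG   = 38 // AF_ALG
	afVSOCK = 40 // AF_VSOCK
)

// naiveAllowed models two separate allow rules, "arg0 != AF_ALG" and
// "arg0 != AF_VSOCK" (an assumption about how the naive profile would be
// applied). A call is allowed as soon as either rule matches.
func naiveAllowed(family uint64) bool {
	for _, excluded := range []uint64{afALG, afVSOCK} {
		if family != excluded {
			return true
		}
	}
	return false
}

// rangeAllowed models the three range-based allow rules from the PR.
func rangeAllowed(family uint64) bool {
	return family < afALG || family == afALG+1 || family > afVSOCK
}

func main() {
	for _, family := range []uint64{afALG, afVSOCK} {
		fmt.Printf("family %d: naive=%v range=%v\n",
			family, naiveAllowed(family), rangeAllowed(family))
	}
}
```

In this model AF_ALG passes the naive filter because 38 != 40 matches the second rule, and AF_VSOCK passes because 40 != 38 matches the first. The range-based rules reject both.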

See moby/profiles#20 for more details.

Additionally, socketcall(2) is now explicitly denied to prevent bypassing the socket address family filters on architectures with the legacy socketcall multiplexer. See https://github.com/moby/profiles/releases/tag/seccomp%2Fv0.2.2 for details.
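
The bypass can be illustrated with a simplified Go model of what a seccomp filter sees. The `seccompData` struct and both filter functions are hypothetical, and syscalls are identified by name for readability (a real filter sees a number). The point of the sketch is that socketcall(2) passes the address family through a pointer to user memory, which a filter cannot dereference.

```go
package main

import "fmt"

// seccompData is a simplified, hypothetical model of the data a seccomp
// filter can inspect: the syscall and its raw register arguments.
type seccompData struct {
	syscall string
	args    [6]uint64
}

const (
	afALG     = 38 // AF_ALG
	afVSOCK   = 40 // AF_VSOCK
	sysSocket = 1  // SYS_SOCKET call number used with socketcall(2)
)

// familyFilterOnly checks the address family on socket(2) only and allows
// every other syscall, including socketcall(2).
func familyFilterOnly(d seccompData) bool {
	if d.syscall == "socket" {
		return d.args[0] != afALG && d.args[0] != afVSOCK
	}
	return true
}

// withSocketcallDenied additionally rejects socketcall(2) outright, since
// the family sits behind args[1], a pointer the filter cannot follow.
func withSocketcallDenied(d seccompData) bool {
	if d.syscall == "socketcall" {
		return false
	}
	return familyFilterOnly(d)
}

func main() {
	// socket(AF_ALG, SOCK_SEQPACKET, 0): the family is visible in args[0].
	direct := seccompData{syscall: "socket", args: [6]uint64{afALG, 5, 0}}
	// socketcall(SYS_SOCKET, ptr): args[1] is a made-up user-space address
	// pointing at {family, type, protocol}.
	muxed := seccompData{syscall: "socketcall", args: [6]uint64{sysSocket, 0x7ffd1000}}

	fmt.Println("family filter only, direct:", familyFilterOnly(direct))
	fmt.Println("family filter only, muxed: ", familyFilterOnly(muxed))
	fmt.Println("socketcall denied, muxed:  ", withSocketcallDenied(muxed))
}
```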

Integration tests

Adds TestExecSocketDenied, which compiles and runs small C programs inside a container to verify that:

  • AF_ALG socket creation is denied
  • AF_VSOCK socket creation is denied
  • AF_ALG via socketcall(2) (using int $0x80 from amd64) is denied
  • AF_ALG via a cross-compiled static i386 binary is denied

Changelog

CVE-2026-31431: Block `AF_ALG` sockets and the `socketcall(2)` multiplexer in the default seccomp profile to prevent in-container privilege escalation via the kernel crypto API ("Copy Fail").

Big thanks to @tianon for helping me figure out why the initial socketcall block didn't work: https://github.com/moby/profiles/releases/tag/seccomp%2Fv0.2.2 ❤️

@corhere

This comment was marked as outdated.

Verify that AF_ALG and AF_VSOCK sockets cannot be created inside a
container running with the default seccomp profile.

The test compiles small C programs inside a debian:trixie-slim container
that attempt to create sockets with these address families, then runs
them as a non-root user (uid 1000) and asserts that socket creation is
denied with EPERM or EAFNOSUPPORT.

Signed-off-by: Paweł Gronowski <[email protected]>
@vvoland vvoland changed the title from "integration/container: Add test for denied socket address families" to "seccomp: Block AF_ALG sockets in default profile (CVE-2026-31431)" Apr 30, 2026
@vvoland vvoland marked this pull request as ready for review April 30, 2026 23:22
@vvoland vvoland requested a review from tianon April 30, 2026 23:22

Copilot AI left a comment

Pull request overview

Updates Moby’s vendored default seccomp profile to mitigate CVE-2026-31431 (“Copy Fail”) by blocking AF_ALG sockets (and hardening against socketcall(2) multiplexing bypasses), and adds integration coverage to ensure these restrictions are enforced in-container.

Changes:

  • Bump github.com/moby/profiles/seccomp vendored dependency to v0.2.2.
  • Update default seccomp rules to deny socketcall(2) and deny socket(AF_ALG) / socket(AF_VSOCK) via argument-filtered allow rules.
  • Add an integration test that compiles/runs small C programs in a container to validate the deny behavior (including socketcall and i386 compat paths).

Reviewed changes

Copilot reviewed 5 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vendor/modules.txt | Updates vendored module listing for profiles/seccomp to v0.2.2. |
| vendor/github.com/moby/profiles/seccomp/default_linux.go | Implements the updated syscall rules (deny socketcall, filter socket address families) and adds compile-time constants checks. |
| vendor/github.com/moby/profiles/seccomp/default.json | Syncs the JSON default profile with the updated Go-generated rules. |
| integration/container/testdata/af_vsock.c | Test helper program attempting AF_VSOCK socket creation. |
| integration/container/testdata/af_alg_socketcall.c | Test helper program attempting AF_ALG via socketcall(2) using int $0x80. |
| integration/container/testdata/af_alg.c | Test helper program attempting AF_ALG socket creation via normal socket(2). |
| integration/container/exec_afalg_linux_test.go | New integration test that compiles and executes the above programs inside a container and asserts expected denial. |
| go.mod | Bumps github.com/moby/profiles/seccomp requirement to v0.2.2. |
| go.sum | Updates checksums for the profiles/seccomp v0.2.2 bump. |

Comment thread integration/container/exec_afalg_linux_test.go Outdated
vvoland added 2 commits May 1, 2026 02:41
Test that AF_ALG is also denied through the socketcall(2) multiplexer,
which is used by glibc on i386 instead of direct socket(2) syscalls.

Two subtests:
- AF_ALG_socketcall_int80: uses int $0x80 inline assembly from a native
  64-bit binary to invoke the ia32 socketcall path, with MAP_32BIT to
  keep the args pointer below 4 GB (ia32 compat truncates registers).
- AF_ALG_socketcall_i386: cross-compiles a static i386 binary using
  gcc-i686-linux-gnu where glibc naturally routes socket() through
  socketcall(2).

Both are amd64-only.

Signed-off-by: Paweł Gronowski <[email protected]>

vvoland commented May 1, 2026

Failures and Windows tests are unrelated.

@vvoland vvoland merged commit 0d33ada into moby:master May 1, 2026
179 of 181 checks passed
vvoland added a commit to vvoland/docker-docs that referenced this pull request May 1, 2026
Add `socket` and `socketcall` entries to the "Significant syscalls
blocked by the default profile" table to reflect the seccomp profile
changes that block AF_ALG sockets (CVE-2026-31431) and deny the
socketcall multiplexer to prevent bypassing address family filters.

Signed-off-by: Paweł Gronowski <[email protected]>
thaJeztah added a commit to docker/docs that referenced this pull request May 1, 2026
seccomp: Document AF_ALG and socketcall blocks from moby/moby#52494

hmh commented May 2, 2026

This apparently broke i386 containers running on an amd64 host.

How to reproduce:

BROKEN:
docker run -it --network host i386/debian:trixie /bin/bash
apt update

Reason: ENOSYS from socket()

WORKING:
docker run -it --rm --network host --security-opt seccomp=unconfined i386/debian:trixie /bin/bash
apt update

docker run -it --rm --network host amd64/debian:trixie /bin/bash
apt update

EDIT:

  • Confirmed that reverting the socketcall() policy change fixes this issue on the i386 container.
  • Getting docker buildx to actually use a non-default seccomp policy is an extreme hassle. Anyone hitting this is advised to just revert docker-ce to the previous version until a decent way to deal with this regression shows up.
