Comparing changes

base repository: opencontainers/runc
base: v1.1.14
...
head repository: opencontainers/runc
compare: v1.1.15
  • 13 commits
  • 7 files changed
  • 5 contributors

Commits on Sep 3, 2024

  1. VERSION: back to development

    Signed-off-by: Aleksa Sarai <[email protected]>
    cyphar committed Sep 3, 2024
    2655e7c
  2. [1.1] libct/seccomp/patchbpf: rm duplicated code

    (This is a cherry-pick of 2cd05e4.)
    
    In findLastSyscalls, we convert libseccomp.ArchNative to the real
    libseccomp architecture, but archToNative already does that, so
    this code is redundant.
    
    Remove the redundant code, and move its comment to archToNative.
    
    Fixes: 7a8d716 ("seccomp: prepend -ENOSYS stub to all filters")
    Signed-off-by: Kir Kolyshkin <[email protected]>
    Signed-off-by: Aleksa Sarai <[email protected]>
    kolyshkin authored and cyphar committed Sep 3, 2024
    6223a65
  3. [1.1] seccomp: patchbpf: rename nativeArch -> linuxAuditArch

    (This is a backport of b288abe.)
    
    Calling the Linux AUDIT_* architecture constants "native" leads to
    confusing code when we are getting the actual native architecture of the
    running system.
    
    Signed-off-by: Aleksa Sarai <[email protected]>
    cyphar committed Sep 3, 2024
    d85b583
  4. [1.1] seccomp: patchbpf: always include native architecture in stub

    (This is a backport of ccc500c.)
    
    It turns out that on ppc64le (at least), Docker doesn't include any
    architectures in the list of allowed architectures. libseccomp
    interprets this as "just include the default architecture" but patchbpf
    would return a no-op ENOSYS stub, which would lead to the exact issues
    that commit 7a8d716 ("seccomp: prepend -ENOSYS stub to all
    filters") fixed for other architectures.
    
    So, just always include the running architecture in the list. There's
    no real downside.
    
    Ref: https://bugzilla.suse.com/show_bug.cgi?id=1192051#c6
    Fixes: 7a8d716 ("seccomp: prepend -ENOSYS stub to all filters")
    Reported-by: Fabian Vogt <[email protected]>
    Signed-off-by: Aleksa Sarai <[email protected]>
    cyphar committed Sep 3, 2024
    618e149
  5. [1.1] nsenter: cloned_binary: remove bindfd logic entirely

    (This is a cherry-pick of b999376.)
    
    While the ro-bind-mount trick did eliminate the memory overhead of
    copying the runc binary for each "runc init" invocation, on machines
    with very significant container churn, creating a temporary mount
    namespace on every container invocation can trigger severe lock
    contention on namespace_sem that makes containers fail to spawn.
    
    The only reason we added bindfd in commit 16612d7 ("nsenter:
    cloned_binary: try to ro-bind /proc/self/exe before copying") was due to
    a Kubernetes e2e test failure where they had a ridiculously small memory
    limit. It seems incredibly unlikely that real workloads are running
    without 10MB to spare for the very short time that runc is interacting
    with the container.
    
    In addition, since the original cloned_binary implementation, cgroupv2
    is now almost universally used on modern systems. Unlike cgroupv1, the
    cgroupv2 memcg implementation does not migrate memory usage when
    processes change cgroups (even cgroupv1 only did this if you had
    memory.move_charge_at_immigrate enabled). In addition, because we do the
    /proc/self/exe clone before synchronising the bootstrap data read, we
    are guaranteed to do the clone before "runc init" is moved into the
    container cgroup -- meaning that the memory used by the /proc/self/exe
    clone is charged against the root cgroup, and thus container workloads
    should not be affected at all with memfd cloning.
    
    The long-term fix for this problem is to block the /proc/self/exe
    re-opening attack entirely in-kernel, which is something I'm working
    on[1]. Though it should also be noted that because the memfd is
    completely separate to the host binary, even attacks like Dirty COW
    against the runc binary can be defended against with the memfd approach.
    Of course, once we have in-kernel protection against the /proc/self/exe
    re-opening attack, we won't have that protection anymore...
    
    [1]: https://lwn.net/Articles/934460/
    
    Signed-off-by: Aleksa Sarai <[email protected]>
    cyphar committed Sep 3, 2024
    614ce12
  6. Merge pull request #4392 from cyphar/1.1-remove-bindfd

    [1.1] nsenter: cloned_binary: remove bindfd logic entirely
    kolyshkin authored Sep 3, 2024
    bd671b6
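
The bindfd removal (614ce12) argues that a private copy of the binary is fully separate from the host binary, so tampering with the original cannot affect the copy. The Go sketch below illustrates only that property. It is not runc's implementation (which is C code in nsenter and uses a memfd); as a standard-library stand-in it copies the file into a temporary file that is unlinked immediately. The names cloneToUnlinkedFile and readCloneAfterOverwrite are hypothetical, and the read-only reopening or sealing a real implementation would need is left out. It assumes a Unix-like system, where an open file can be unlinked.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// cloneToUnlinkedFile (hypothetical helper) copies srcPath into a temporary
// file and unlinks it right away, so the returned handle is the only
// remaining reference to the copy.
func cloneToUnlinkedFile(srcPath string) (*os.File, error) {
	src, err := os.Open(srcPath)
	if err != nil {
		return nil, err
	}
	defer src.Close()

	tmp, err := os.CreateTemp("", "clone-*")
	if err != nil {
		return nil, err
	}
	// Drop the name; the data stays reachable through the open handle.
	if err := os.Remove(tmp.Name()); err != nil {
		tmp.Close()
		return nil, err
	}
	if _, err := io.Copy(tmp, src); err != nil {
		tmp.Close()
		return nil, err
	}
	// Rewind so the caller reads the copy from the start.
	if _, err := tmp.Seek(0, io.SeekStart); err != nil {
		tmp.Close()
		return nil, err
	}
	return tmp, nil
}

// readCloneAfterOverwrite clones a scratch file, overwrites the original,
// and returns what the clone still contains.
func readCloneAfterOverwrite() (string, error) {
	orig, err := os.CreateTemp("", "orig-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(orig.Name())
	if _, err := orig.WriteString("original"); err != nil {
		orig.Close()
		return "", err
	}
	if err := orig.Close(); err != nil {
		return "", err
	}

	clone, err := cloneToUnlinkedFile(orig.Name())
	if err != nil {
		return "", err
	}
	defer clone.Close()

	// Tamper with the original after the clone has been taken.
	if err := os.WriteFile(orig.Name(), []byte("tampered"), 0o600); err != nil {
		return "", err
	}
	data, err := io.ReadAll(clone)
	return string(data), err
}

func main() {
	got, err := readCloneAfterOverwrite()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got) // prints "original"
}
```

The copy costs memory or disk for each invocation, which is the overhead the commit message weighs against the namespace_sem contention caused by bindfd.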

Commits on Sep 10, 2024

  1. merge #4391 into opencontainers/runc:release-1.1

    Aleksa Sarai (2):
      [1.1] seccomp: patchbpf: always include native architecture in stub
      [1.1] seccomp: patchbpf: rename nativeArch -> linuxAuditArch
    
    Kir Kolyshkin (1):
      [1.1] libct/seccomp/patchbpf: rm duplicated code
    
    LGTMs: kolyshkin rata
    cyphar committed Sep 10, 2024
    3216d3b
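
The seccomp change merged here (618e149) always adds the running architecture to the list used for the -ENOSYS stub, so an empty list from the caller no longer produces a no-op stub. A minimal Go sketch of that idea follows. It works on plain architecture-name strings rather than libseccomp or AUDIT_* values, and the function name withNativeArch is hypothetical and not taken from patchbpf.

```go
package main

import "fmt"

// withNativeArch (hypothetical helper) returns the configured architectures
// with the native one appended when it is missing. The input slice is never
// modified.
func withNativeArch(archs []string, native string) []string {
	for _, a := range archs {
		if a == native {
			return archs // already listed, nothing to add
		}
	}
	out := make([]string, 0, len(archs)+1)
	out = append(out, archs...)
	return append(out, native)
}

func main() {
	// An empty list, as reported for Docker on ppc64le.
	fmt.Println(withNativeArch(nil, "ppc64le")) // prints [ppc64le]
	// A list that already contains the native architecture is unchanged.
	fmt.Println(withNativeArch([]string{"x86_64", "x86"}, "x86_64")) // prints [x86_64 x86]
}
```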

Commits on Oct 2, 2024

  1. increase memory.max in cgroups.bats

    Signed-off-by: lifubang <[email protected]>
    (cherry picked from commit 65a1074)
    lifubang authored and rata committed Oct 2, 2024
    719e2bc

Commits on Oct 3, 2024

  1. Merge pull request #4423 from rata/1-1-fix-CI

    [1.1]: Increase memory.max in cgroups.bats
    AkihiroSuda authored Oct 3, 2024
    a4cebd3
  2. [1.1] runc run: fix mount leak

    When preparing to mount container root, we need to make its parent mount
    private (i.e. disable propagation), otherwise the new in-container
    mounts are leaked to the host.
    
    To find a parent mount, we used to read mountinfo and find the longest
    entry which can be a parent of the container root directory.
    
    Unfortunately, due to a kernel bug in all Linux kernels older than v5.8
    (see [1], [2]), sometimes mountinfo can't be read in its entirety. In
    this case, getParentMount may occasionally return a wrong parent mount.
    
    As a result, we do not change the mount propagation to private, and
    container mounts are leaked.
    
    Alas, we cannot fix the kernel, and reading mountinfo a few times to
    ensure its consistency (like it's done in, say, Kubernetes) does not
    look like a good solution for performance reasons.
    
    Fortunately, we don't need mountinfo. Let's just traverse the directory
    tree, trying to remount it private until we find a mount point (any
    error other than EINVAL means we just found it).
    
    Fixes issue 2404.
    
    [1]: https://github.com/kolyshkin/procfs-test
    [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f6c61f96f2d97cbb5f
    Signed-off-by: Kir Kolyshkin <[email protected]>
    (cherry picked from commit 13a6f56)
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin committed Oct 3, 2024
    65aa700
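
The mount leak fix (65aa700) replaces the mountinfo lookup with a walk up the directory tree: try to remount each directory private and move to its parent while the attempt fails with EINVAL. The Go sketch below shows that loop under stated assumptions. The remount operation is injected as a function so the example runs without privileges; a real caller would pass a wrapper around mount(2) with MS_PRIVATE. The names makeParentMountPrivate and fakeRemount are hypothetical, and returning any non-EINVAL error to the caller is a choice made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"syscall"
)

// makeParentMountPrivate (hypothetical helper) walks from path towards "/",
// calling remount on each directory. EINVAL is taken to mean "not a mount
// point", so the walk continues with the parent. It returns the directory
// on which remount succeeded.
func makeParentMountPrivate(path string, remount func(string) error) (string, error) {
	path = filepath.Clean(path)
	for {
		err := remount(path)
		if err == nil {
			return path, nil
		}
		parent := filepath.Dir(path)
		// Stop on any other error, or when the root itself was rejected.
		if !errors.Is(err, syscall.EINVAL) || parent == path {
			return "", fmt.Errorf("remount %s private: %w", path, err)
		}
		path = parent
	}
}

// fakeRemount builds a stand-in for the real remount call: it succeeds only
// on the listed mount points and returns EINVAL everywhere else.
func fakeRemount(mountPoints ...string) func(string) error {
	set := make(map[string]bool, len(mountPoints))
	for _, m := range mountPoints {
		set[m] = true
	}
	return func(p string) error {
		if set[p] {
			return nil
		}
		return syscall.EINVAL
	}
}

func main() {
	remount := fakeRemount("/", "/var/lib")
	mp, err := makeParentMountPrivate("/var/lib/containers/rootfs", remount)
	fmt.Println(mp, err) // prints "/var/lib <nil>"
}
```

Because the loop only ever touches ancestors of the root directory, it needs no consistent snapshot of mountinfo, which is the point the commit message makes.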

Commits on Oct 4, 2024

  1. Merge pull request #4425 from kolyshkin/1.1-fix-mount-leak

    [1.1] runc run: fix mount leak
    AkihiroSuda authored Oct 4, 2024
    ed38aea
  2. CHANGELOG: Remove empty changed line

    No entry lives under this line, let's just remove it.
    
    Signed-off-by: Rodrigo Campos <[email protected]>
    rata authored and kolyshkin committed Oct 4, 2024
    2790485
  3. VERSION: release 1.1.15

    [@kolyshkin: rebased; added a CVE link; added 1.1.15 link; changed date to 7 Oct]
    
    Signed-off-by: Rodrigo Campos <[email protected]>
    Signed-off-by: Kir Kolyshkin <[email protected]>
    rata authored and kolyshkin committed Oct 4, 2024
    bc20cb4