Skip to content

Remove local fork of archive/tar package#43185

Merged
thaJeztah merged 2 commits intomoby:masterfrom
corhere:42402-safer-fileinfo
Feb 24, 2022
Merged

Remove local fork of archive/tar package#43185
thaJeztah merged 2 commits intomoby:masterfrom
corhere:42402-safer-fileinfo

Conversation

@corhere
Copy link
Contributor

@corhere corhere commented Jan 24, 2022

- What I did
Removed the local fork of archive/tar while retaining all existing mitigations of CVE-2019-14271. Fixes #42402

- How I did it
I hid the value containing the file's UID and GID from the tar.FileInfoHeader function so that it would elide calling user.LookupId, analogous to the proposal in golang/go#50102.

- How to verify it

  1. Comment out the other mitigation and the mitigation introduced in this PR and make dynbinary
  2. Build a container image which exploits CVE-2019-14271 (an exploit payload which prints a message to stderr and exits nonzero makes it easy to spot in dockerd's debug logs) and create a container from that image
  3. docker cp from the container and observe that the exploit code has been executed
  4. Uncomment the mitigation from this PR (leaving the other mitigation commented out), rebuild dockerd, and try to exploit the vulnerability again. Observe that the exploit payload does not execute.

- Description for the changelog
Remove local fork of archive/tar package

- A picture of a cute animal (not mandatory but encouraged)

@corhere corhere requested a review from tianon as a code owner January 24, 2022 23:04
@corhere
Copy link
Contributor Author

corhere commented Jan 25, 2022

I am also experimenting with having a separate binary for the chrootarchive reexecs as I described in #42402 (comment) which has both compile-time and runtime assertions to make really, really sure that it can only be built as a static binary without cgo. I have opened a draft PR for your consideration: #43186

@corhere corhere force-pushed the 42402-safer-fileinfo branch from 18f442c to 551d446 Compare January 25, 2022 18:22
@corhere corhere requested a review from AkihiroSuda January 25, 2022 21:15
@corhere corhere force-pushed the 42402-safer-fileinfo branch from 551d446 to 74fe7a2 Compare January 26, 2022 18:00
@thaJeztah
Copy link
Member

@tonistiigi @kolyshkin @cpuguy83 PTAL

@thaJeztah thaJeztah added kind/refactor PR's that refactor, or clean-up code status/2-code-review labels Feb 4, 2022
@corhere corhere force-pushed the 42402-safer-fileinfo branch from 74fe7a2 to b711063 Compare February 4, 2022 21:23
@corhere corhere force-pushed the 42402-safer-fileinfo branch from b711063 to bd97462 Compare February 7, 2022 16:02
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

In view of this, perhaps it might make sense to close golang/go#50102 since we can achieve a similar thing without too much effort. @corhere would you do the honors to describe your solution and propose to close that one?

@corhere
Copy link
Contributor Author

corhere commented Feb 8, 2022

In view of this, perhaps it might make sense to close golang/go#50102 since we can achieve a similar thing without too much effort.

I think the proposal should be left open. This PR implements the suggestion from golang/go#50102 (comment) and reimplements the loading of (some of) the other filesystem info in userspace, so to speak. The proposed tar.FileInfoNames interface would reduce the amount of moby code that needs to be maintained to support new platforms, and would be beneficial to others in a similar situation who also require having atime and ctime set in their tar file headers.

@tonistiigi
Copy link
Member

The problems with this is that it:

  • Makes assumptions about the stdlib FileInfoHeader implementation. That the current implementation has custom code for *tar.Header and statUnix implementation in stdlib returns early before it reaches the code that can invoke the libraries that do the loading of usernames. Potentially this can change without notice in stdlib.
  • We need to make sure that the new NoLookups function is used always. If one place forgets to call it (eg. after some code change) it breaks the security for the rest. This is also true for the proposal in archive/tar: add FileInfoNames interface golang/go#50102

If we are ok with these risks then the implementation SGTM

@corhere
Copy link
Contributor Author

corhere commented Feb 10, 2022

The problems with this is that it:

  • Makes assumptions about the stdlib FileInfoHeader implementation. That the current implementation has custom code for *tar.Header and statUnix implementation in stdlib returns early before it reaches the code that can invoke the libraries that do the loading of usernames. Potentially this can change without notice in stdlib.

The only assumption being made is that the stdlib tar.FileInfoHeader implementation loads the file's username and group name iff the file's UID and GID are known to it. The fs.FileInfo interface cannot be widened without breaking the Go 1 Compatibility Promise so it seems highly unlikely for the mechanism used to get the UID and GID from the FileInfo value to change without notice. That being said, I can guard against such an occurrence by adding a unit test which asserts that tar.FileInfoHeader(nosysFileInfo{fi}, "") returns a tar.Header with zero-valued Uname and Gname fields when fi is a value returned from os.Stat.

We need to make sure that the new function is always used in code executed inside a chroot. Or to be more precise, we need to make sure that dlopen() is never executed while the process is chrooted. Loading shared objects from the host's system library paths does not break security when the process's filesystem root is the host's filesystem root. And conversely, archive/tar is not the only package which potentially leads to dlopen() via glibc. Even with the existing solution of patching archive/tar, security could be broken by a call to e.g. user.Current() or net.Dial() while chrooted. Although, as I mentioned in the PR description, there is another layer of mitigation which prevents such oversights from actually breaking security.

@corhere corhere force-pushed the 42402-safer-fileinfo branch from bd97462 to ddc2000 Compare February 11, 2022 00:06
@thaJeztah
Copy link
Member

Looks like gosec linting is complaining. We had quite some "false positives" on that linter, but we should check. We updated the linter to a new version Yesterday; if G305 is not useful for us, we can either disable it globally or if "useful", but not for all cases, add some //nolint: gosec comments, e.g.

//nolint: gosec // Ignore G305: File traversal when extracting zip/tar archive. 
INFO [runner] linters took 1m25.662021744s with stages: goanalysis_metalinter: 1m25.57620577s
pkg/archive/archive.go:732:17: G305: File traversal when extracting zip/tar archive (gosec)
        targetPath := filepath.Join(extractDir, hdr.Linkname)
                      ^
pkg/archive/archive.go:744:17: G305: File traversal when extracting zip/tar archive (gosec)
        targetPath := filepath.Join(filepath.Dir(path), hdr.Linkname)
                      ^
pkg/archive/archive.go:1097:11: G305: File traversal when extracting zip/tar archive (gosec)
        path := filepath.Join(dest, hdr.Name)
                ^
pkg/archive/archive.go:1161:11: G305: File traversal when extracting zip/tar archive (gosec)
        path := filepath.Join(dest, hdr.Name)
                ^
pkg/archive/archive_linux.go:54:17: G305: File traversal when extracting zip/tar archive (gosec)
                Name:       filepath.Join(hdr.Name, WhiteoutOpaqueDir),
                            ^
pkg/archive/diff.go:116:11: G305: File traversal when extracting zip/tar archive (gosec)
        path := filepath.Join(dest, hdr.Name)
                ^
pkg/archive/diff.go:212:11: G305: File traversal when extracting zip/tar archive (gosec)
        path := filepath.Join(dest, hdr.Name)
                ^
pkg/tarsum/tarsum.go:249:36: G305: File traversal when extracting zip/tar archive (gosec)
            ts.currentFile = path.Join(".", path.Join("/", currentHeader.Name))
                                            ^

@thaJeztah
Copy link
Member

@corhere thanks again! We discussed this PR in Yesterday's maintainers meeting with @tonistiigi @samuelkarp @cpuguy83 @tianon and @crazy-max and from that discussion, I think we're good to move forward with this change.

Could you rebase the PR (and have a look at the linter failures? #43185 (comment))

I can guard against such an occurrence by adding a unit test which asserts that tar.FileInfoHeader(nosysFileInfo{fi}, "") returns a tar.Header with zero-valued Uname and Gname fields when fi is a value returned from os.Stat.

It probably doesn't hurt to have a test (just to be sure); that said (as mentioned) there may be other areas that could potentially do a dlopen() vis glib; I recall we had discussions about that at the time, and there was no "real" solution other than the separate binary, which we still could consider as a follow-up.

there is another layer of mitigation which prevents such oversights from actually breaking security.

I was actually trying to recall why we added back the vendored pkg/archive (after we added the other mitigation); I found #35739 (comment) on that topic, and some internal tickets, which may have been related to Docker Enterprise versions being on different go versions (not supporting the osusergo build tag) and/or FIPS-compliant builds of Docker Enterprise, but my memory is failing me.

Let me know if you need help!

I hope I didn't miss anything (if I did, can you add details, @tonistiigi ?)

@corhere
Copy link
Contributor Author

corhere commented Feb 18, 2022

Could you rebase the PR (and have a look at the linter failures? #43185 (comment))

On it. Taking a peek under the hood at the gosec linter, it indiscriminately flags any path.Join call with an argument derived from a tar.Header without checking if it is followed by a mitigation for path-traversal so lots of potential for false positives. I'll review the code to verify that all the usages are safe and annotate them with //#nosec suppressions as they are more auditable than //nolint suppressions.

It probably doesn't hurt to have a test (just to be sure)

Done.

I was actually trying to recall why we added back the vendored pkg/archive (after we added the other mitigation); I found #35739 (comment) on that topic, and some internal tickets, which may have been related to Docker Enterprise versions being on different go versions (not supporting the osusergo build tag) and/or FIPS-compliant builds of Docker Enterprise, but my memory is failing me.

Thankfully it was documented at the time! Tl;dr it was a workaround required specifically for static builds on Go 1.10 with CGO_ENABLED=1 which was made obsolete by Go 1.11 with the osusergo build tag. (Although in hindsight, nosysFileInfo might have been just as effective a workaround back then.) I'll follow up internally with the rest of the Mirantis Container Runtime team to verify that this change won't give us any trouble with FIPS-compliant builds, but otherwise it appears that the reasons for adding the back the vendored archive/tar have long reached their end of life.

A copy of Go's archive/tar packge was vendored with a patch applied to
mitigate CVE-2019-14271. Vendoring standard library packages is not
supported by Go in module-aware mode, which is getting in the way of
maintenance. A different approach to mitigate the vulnerability is
needed which does not involve vendoring parts of the standard library.

glibc implements name service lookups such as users, groups and DNS
using a scheme known as Name Service Switch. The services are
implemented as modules, shared libraries which glibc dynamically links
into the process the first time a function requiring the module is
called. This is the crux of the vulnerability: if a process linked
against glibc chroots, then calls one of the functions implemented with
NSS for the first time, glibc may load NSS modules out of the chrooted
filesystem.

The API underlying the `docker cp` command is implemented by forking a
new process which chroots into the container's rootfs and writes a tar
stream of files from the container over standard output. It utilizes the
Go standard library's archive/tar package to write the tar stream. It
makes use of the tar.FileInfoHeader function to construct a tar.Header
value from an fs.FileInfo value. In modern versions of Go on *nix
platforms, FileInfoHeader will attempt to resolve the file's UID and GID
to their respective user and group names by calling the os/user
functions LookupId and LookupGroupId. The cgo implementation of os/user
on *nix performs lookups by calling the corresponding libc functions. So
when linked against glibc, calls to tar.FileInfoHeader after the
process has chrooted into the container's rootfs can have the side
effect of loading NSS modules from the container! Without any
mitigations, a malicious container image author can trivially get
arbitrary code execution by leveraging this vulnerability and escape the
chroot (which is not a sandbox) into the host.

Mitigate the vulnerability without patching or forking archive/tar by
hiding the OS-dependent file info from tar.FileInfoHeader which it needs
to perform the lookups. Without that information available it falls back
to populating the tar.Header with only the information obtainable
directly from the FileInfo value without making any calls into os/user.

Fixes moby#42402

Signed-off-by: Cory Snider <[email protected]>
@thaJeztah
Copy link
Member

Thankfully it was documented at the time!

🤦 ah, good find! I didn't look there!

Although in hindsight, nosysFileInfo might have been just as effective a workaround back then

If my memory serves me well, the osusergo option was added before the archive/tar issues existed (as it already cases grievance when using os/user); only before, it would happen when it was used explicitly, and now it became an issue when used implicitly (through archive/tar)

@corhere corhere force-pushed the 42402-safer-fileinfo branch from ddc2000 to c98286a Compare February 18, 2022 20:40
The recently-upgraded gosec linter has a rule for archive extraction
code which may be vulnerable to directory traversal attacks, a.k.a. Zip
Slip. Gosec's detection is unfortunately prone to false positives,
however: it flags any filepath.Join call with an argument derived from a
tar.Header value, irrespective of whether the resultant path is used for
filesystem operations or if directory traversal attacks are guarded
against.

All of the lint errors reported by gosec appear to be false positives.

Signed-off-by: Cory Snider <[email protected]>
@corhere corhere force-pushed the 42402-safer-fileinfo branch from c98286a to 833139f Compare February 18, 2022 20:42
@thaJeztah thaJeztah added this to the 21.xx milestone Feb 18, 2022
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for updating!

@thaJeztah
Copy link
Member

@tonistiigi @cpuguy83 PTAL

@thaJeztah
Copy link
Member

I'll go ahead and bring this one in. I assume @tonistiigi is good with this (otherwise let me know)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove local fork of Golang's archive/tar

5 participants