Proposal: Provenance step 1 - Transform images for validation and verification #8093
Comments
Very good proposal. Adding more trust to Docker will become more and more important. Since I'm not an expert on signing/validation, I've got some questions (excuse me if they're silly);
One thing that still worries me is the "chain of trust" in the automated builds. Although the maintainer can be trusted, many Dockerfiles make use of external libraries / source files that are downloaded during build and could have been tampered with. I realise this is outside the scope of this proposal, but could be something to give some thought in the future. |
@thaJeztah provenance is, as alluded to in the title, going to take multiple steps and phases. The chain of trust, manifest, signature, and v2 registry are the core components being baked in now, and they will allow further development of features such as the ones you mentioned: multiple certificates and requiring different types of certification of an image. The v2 registry API should allow serving 'static' files as you mentioned, but does this refer to a registry that is pull-only or just a file system implementation? The initial "chain of trust" is about verifying that an image was built by someone who has permission to build images of that name. Likewise, signatures could be added in a future proposal to attest to other aspects of an image. Tracking all the sources which make up a Docker build is being thought about, and content addressable layers are the first step toward that.
Basically, a simple read-only repository hosted using apache or nginx. I've seen this was possible with the current (v1) and was wondering if this was still possible after this change. Thanks for explaining the "roadmap" for future additions on this matter. Thank you and the other contributors for taking security and trust at heart. LVGTM 😄 |
I don't think it is possible with v1 - but we are aiming for it with v2 eventually. |
@dmp42 Just for reference; this is what I was talking about; #4607 and https://github.com/vbatts/d2r However, let's not delve into static registries too much as it is only slightly related to this proposal and would only clutter up this discussion. 😺 |
@thaJeztah static/read-only is expected to stay possible. |
wait, why the embedded signatures? why not just sign the whole file, |
also, won't signing the whitespace too break in really fun ways? ie, space vs tab, |
does this also mean that image signatures can only be certificate-based and not leverage other existing systems of signing and trust like GPG that are already widely deployed and used? |
@tianon yes, signing whitespace does allow for breaking in fun ways. The JWS spec gives us some protection against this by encoding the payload as base64. The "Pretty" version of it is both legible (not base64 encoded) and verifiable, but at the expense of being fragile. I don't know whether the output format has been finalized; what is important for us is to have a single file which includes the signature.
We are using the JWS format for signatures. These signatures allow for x509 chain of trust verification or verification based solely on the public key. The initial verification of the namespace based on the signature will only use the public key, just like GPG-based trust.
Not directly related, just a (possibly) interesting read: Announcing Keyless SSL (and the discussion on HackerNews).
So then shouldn't the signed version be completely extraneous whitespace-free? |
ie, |
I'm just reading through this and getting the feeling that it's really quite limiting. Maybe each of those "signature" blocks could have a "type" field to make sure there's space for future growth? (like maybe some kind of GPG backend too) |
As for whitespace, we are offering a format which is whitespace safe, but it is not the preferred one since it is not legible, just as JSON formatted without whitespace is illegible. I don't see a reason to assume that a signed file should be able to have any of its bytes changed (whitespace or not) and still be expected to pass signature validation. I agree with a type field, and I think we are still trying to figure out how multiple signatures will be handled. The GPG backend, I think, would come in later at verification. Right now we are thinking of libtrust more as a possible backend for GPG, rather than GPG as a possible backend for libtrust.
@thaJeztah it is an interesting read, and although not directly related, there seem to be some similar motivations. The approach used in this proposal is very sensitive to the idea that builders should both be identified by their public key and be solely responsible for their private key. At no point does the private key need to be shared with anyone else to build and verify builds.
On Sep 18, 2014 5:21 PM, "Tianon Gravi" notifications@github.com wrote:
A type field is already in the works. I'll update the proposal. The |
Also, there will be a directory like |
@dmcgowan @shykes I realize that one of the objectives is to have the docker binary be all inclusive, but including anything like a root CA in the binary is not going to be acceptable except for demos. A production environment will manage the distribution of trusted CAs and land them on disk for the daemon to use.
@vbatts about this part:
Assuming "the content" of a layer is defined by the diff with its parent, do you plan to store attribute change (i.e: modification datetime) separately from actual content changes? So that (Happy to open or comment on another issue if that's outside of the scope of this proposal). |
On Sep 19, 2014 5:15 PM, "Johan Euphrosine" notifications@github.com
I intentionally left that a little vague. For the foreseeable future this
We've been giving the tarsum calculation a lot of attention lately, so |
@vbatts I think it makes sense to have a directory for root CAs. These CAs could either be checked by default, turned on/off individually, or a flag to use them all. The CA we talked about bundling I would not categorize as the same type of CA, since this root CA is used to enforce the global namespace, so it shouldn't differ between installations. Certain operations may be able to turn off enforcement or allow a work around (a local namespace), but the bundled CA should not be replaced with a different CA. How do you propose bundling the root CA if not in the binary, since the binary is the primary method of bundling and installing? |
Oh @vbatts you gem. I'm -1 on CAs in general, but I see how that could be attractive. As long as there's an alternative to the rat's nest that x509 is, I'll be happy with this and might actually use it. If I can say "only allow my Docker engine to run image manifests that are signed/built/acked-by XYZ person (specified via some fully unambiguous and easily verified means like a full GPG fingerprint)", that'd be amazing. I also see an absolute need to be able to disable the default CA, even if it can't be changed in any other way. |
@tianon the default root CA in this proposal is used to verify that the builder has access to the namespace of the image. The identity of the builder will be derived from the signature, which is created with the user's private key, just like GPG signatures. The verification is done on the fingerprint of the public key. Statements signed by a namespace authority and chaining back to the root can be downloaded or imported; they tie public keys to users and users to namespaces. Without these statements chaining back to a single root CA, the trust graph would be unmanageable (like a web of trust) and difficult to tie in to the existing namespace. While x509 is not ideal and was not our first choice, it is a proven mechanism for extending authority in a secure and manageable way.
It's probably better to use a SHA256 of the layer file as it's stored (the "payload" hash in Docker client parlance) rather than the TarSum. Two reasons: (2) The storage layer (be it S3, GCS, or local) shouldn't need special insight into files in order to hash them. |
From [1]:

> For reference, the relevant manifest fields for the registry are the
> following:
>
> field      description
> name       The name of the image.
> tag        The tag for this version of the image.
> fsLayers   A list of layer descriptors (including tarsum)
> signature  A JWS used to verify the manifest content
>
> For more information about the manifest format, please see
> moby#8093.

And from [2]:

> Image Manifest
>
> The image manifest file will contain all the information which is
> needed to pull, install, validate and run an image. It will contain
> a list of layers by a content addressable id, history, run time
> configuration, and signatures. This manifest is generated by the
> daemon. Initially this generation will happen when an image is
> published, and ultimately happen anytime an image is built or
> committed. Each manifest is required to be signed by the client
> creating the manifest on push or build with additional signatures
> which can be added post build to verify the quality of the manifest
> or validity of the builder. The history will contain fully backward
> compatible metadata to allow old style layer and metadata to be
> recreated from the manifest.
>
> Signable manifest (or payload) refers to portions of the manifest
> which will be signed by builder. The signable manifest is a JSON
> dictionary containing the layers, run configuration, and
> history. The entire signable manifest will signed, any changes
> including whitespace will require a new signature. To aid in
> readability the signable manifest should be well-formatted JSON.
>
> Example: (totally subject to change)
>
> {
>    "name": "dmcgowan/test-image",
>    "tag": "latest",
>    "architecture": "amd64",
>    "fsLayers": [
>       {
>          "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
>       },
>       {
>          "blobSum": "tarsum+sha256:cea0d2071b01b0a79aa4a05ea56ab6fdf3fafa03369d9f4eea8d46ea33c43e5f",
>       },
>       {
>          "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
>       },
>       {
>          "blobSum": "tarsum+sha256:2a7812e636235448785062100bb9103096aa6655a8f6bb9ac9b13fe8290f66df"
>       }
>    ],
>    "history": [
>       "{\"id\":\"a9eb172552348a9a49180694790b33a1097f546456d041b6e82e4d7716ddb721\",\"parent\":\"120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16\",\"created\":\"2014-06-05T00:05:35.990887725Z\"...",
>       "{\"id\":\"120e218dd395ec314e7b6249f39d2853911b3d6def6ea164ae05722649f34b16\",\"parent\":\"42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229\",\"created\":\"2014-06-05T00:05:35.692528634Z\"...",
>       "{\"id\":\"42eed7f1bf2ac3f1610c5e616d2ab1ee9c7290234240388d6297bc0f32c34229\",\"parent\":\"511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158\",\"created\":\"2014-06-05T00:05:35.589531476Z\"...",
>       "{\"id\":\"511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158\",\"comment\":\"Imported from -\",\"created\":\"2013-06-13T14:03:50.821769-07:00\"..."
>    ],
>    "schemaVersion": 1
> }
>
> Signed manifest refers to a manifest which includes the signature as
> well as the signable manifest. The signed manifest could be
> represented as either a JSON Web Signature (JSON serialization, see
> link), in which the payload is the base64 encoded signed manifest,
> or an altered version of the signed manifest JSON to include the
> signature as the last element of the JSON dictionary and a record of
> the alternations to the original signed manifest included in the
> signature. EIther format is fully verifiable and tamper-proof.
>
> http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-7.2
>
> Example: (human readable format)
>
> {
>    "name": "dmcgowan/test-image",
>    "tag": "latest",
>    "architecture": "amd64",
>    "blobSums": [
>       {
>          "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
>       },
>       {
>          "blobSum": "tarsum+sha256:cea0d2071b01b0a79aa4a05ea56ab6fdf3fafa03369d9f4eea8d46ea33c43e5f",
>       },
>       {
>          "blobSum": "tarsum+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
>       },
>       {
>          "blobSum": "tarsum+sha256:2a7812e636235448785062100bb9103096aa6655a8f6bb9ac9b13fe8290f66df"
>       }
>    ],
>    "history": ["v1 compatible string encoded json for each layer"],
>    "schemaVersion": 1,
>    "signatures": [
>       {
>          "header": {
>             "jwk": {
>                "crv": "P-256",
>                "kid": "LYRA:YAG2:QQKS:376F:QQXY:3UNK:SXH7:K6ES:Y5AU:XUN5:ZLVY:KBYL",
>                "kty": "EC",
>                "x": "Cu_UyxwLgHzE9rvlYSmvVdqYCXY42E9eNhBb0xNv0SQ",
>                "y": "zUsjWJkeKQ5tv7S-hl1Tg71cd-CqnrtiiLxSi6N_yc8"
>             },
>             "alg": "ES256"
>          },
>          "signature": "m3bgdBXZYRQ4ssAbrgj8Kjl7GNgrKQvmCSY-00yzQosKi-8UBrIRrn3Iu5alj82B6u_jNrkGCjEx3TxrfT1rig",
>          "protected": "eyJmb3JtYXRMZW5ndGgiOjYwNjMsImZvcm1hdFRhaWwiOiJDbjAiLCJ0aW1lIjoiMjAxNC0wOS0xMVQxNzoxNDozMFoifQ"
>       }
>    ]
> }

I didn't delve too deeply into the JWS spec [3], just enough to get:

> The following members are defined for use in top-level JSON objects
> used for the JWS JSON Serialization:
>
> payload
>    The "payload" member MUST be present and contain the value
>    BASE64URL(JWS Payload).
> signatures
>    The "signatures" member value MUST be an array of JSON objects.
>    Each object represents a signature or MAC over the JWS Payload and
>    the JWS Protected Header.
>
> The following members are defined for use in the JSON objects that
> are elements of the "signatures" array:
>
> protected
>    The "protected" member MUST be present and contain the value
>    BASE64URL(UTF8(JWS Protected Header)) when the JWS Protected
>    Header value is non-empty; otherwise, it MUST be absent. These
>    Header Parameter values are integrity protected.
> header
>    The "header" member MUST be present and contain the value JWS
>    Unprotected Header when the JWS Unprotected Header value is non-
>    empty; otherwise, it MUST be absent. This value is represented as
>    an unencoded JSON object, rather than as a string. These Header
>    Parameter values are not integrity protected.
> signature
>    The "signature" member MUST be present and contain the value
>    BASE64URL(JWS Signature).
>
> At least one of the "protected" and "header" members MUST be present
> for each signature/MAC computation so that an "alg" Header Parameter
> value is conveyed.

There are some inconsistencies in the Docker issues that I've arbitrated:

* [1] has 'signature', but [2] has 'signatures'. I expect this was just a typo in [1], since having an array of signatures is mentioned multiple times in [2], and nobody suggests having only a single signature. (Stephen confirmed this [4]).
* [2] has 'fsLayers' holding 'blobSum's in the signable manifest, but 'blobSums' holding 'blobSum's in the signed manifest. I expect they meant 'fsLayers' in both cases, because 'blobSums' is too generic, and [1] mentions 'fsLayers'.

I've left the other manifest fields ('architecture', 'history', 'schemaVersion', ...) unspecified, since that's all independent of the registry API.

[1]: moby#9015
[2]: moby#8093
[3]: http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-7.2
[4]: f083eb1#commitcomment-8922939
Since this issue was initially opened there have been huge strides made in a
This provisional image manifest allows for a more abstracted implementation and I am closing the step 1 issue as complete. Further work will continue in the |
Hi:
|
@xiekeyang i noticed you asked the same question in distribution/distribution#1091 (comment). The GitHub issue trackers are not a support forum, Could you ask your question on;
|
Background
The current image format does not allow for content addressable images nor
require metadata about the image to reference the content of the layer used by
the metadata. The identifiers for layers are randomly generated, requiring
extra bookkeeping to map layer ids to the content being referred to. In a
highly distributed environment such as the Docker ecosystem, this bookkeeping
is cumbersome and complicates security.
This relates to #6805 #6959
Proposal Summary
Make images self-describing manifests containing a list of content addressable
layers, run configuration, and signatures to identify the builder and verify
that the image meets the expectations of the installer.
Image Manifest
The image manifest file will contain all the information which is needed to
pull, install, validate and run an image. It will contain a list of layers by
a content addressable id, history, run time configuration, and signatures.
This manifest is generated by the daemon. Initially this generation will happen
when an image is published, and ultimately happen anytime an image is built or
committed. Each manifest is required to be signed by the client creating the
manifest on push or build with additional signatures which can be added post
build to verify the quality of the manifest or validity of the builder. The
history will contain fully backward compatible metadata to allow old style
layer and metadata to be recreated from the manifest.
Signable manifest (or payload) refers to the portions of the manifest which
will be signed by the builder. The signable manifest is a JSON dictionary
containing the layers, run configuration, and history. The entire signable
manifest will be signed; any change, including to whitespace, will require a
new signature. To aid readability, the signable manifest should be
well-formatted JSON.
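The whitespace sensitivity follows from the signature covering bytes rather than the parsed JSON value. The sketch below illustrates this with two serializations of the same dictionary. The manifest content is hypothetical, and SHA-256 merely stands in for the signing step, which is an assumption made to keep the example dependency-free.

```python
import hashlib
import json

# Hypothetical manifest content, used only for illustration.
manifest = {"name": "dmcgowan/test-image", "tag": "latest", "schemaVersion": 1}

# Two serializations of the same JSON value: indented and compact.
pretty = json.dumps(manifest, indent=3).encode("utf-8")
compact = json.dumps(manifest, separators=(",", ":")).encode("utf-8")


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the exact bytes given."""
    return hashlib.sha256(data).hexdigest()


# Both byte strings parse to the same value, yet the bytes differ, so a
# signature computed over one would not validate against the other.
same_value = json.loads(pretty) == json.loads(compact)
same_bytes = digest(pretty) == digest(compact)
print(same_value, same_bytes)
```

A verifier therefore has to check the signature against the bytes exactly as they were published, not against a re-serialized copy.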
Example: (totally subject to change)
Signed manifest refers to a manifest which includes the signature as well
as the signable manifest. The signed manifest could be represented as either a
JSON Web Signature (JSON serialization, see link), in which the payload is the
base64 encoded signable manifest, or an altered version of the signable
manifest JSON which includes the signature as the last element of the JSON
dictionary, with a record of the alterations to the original signable manifest
included in the signature. Either format is fully verifiable and tamper-proof.
http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-7.2
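As a rough sketch of how the second (human readable) format could be checked, the code below decodes the "protected" value from the example quoted in the comments and rebuilds the originally signed bytes. The field names `formatLength` and `formatTail` come from that decoded header; their interpretation here (the length of the unmodified prefix, and the base64url-encoded closing bytes that the signature block displaced) is an assumption, and `recover_payload` is a hypothetical helper. The actual signature check over the recovered bytes is not shown.

```python
import base64
import json


def b64url_decode(text: str) -> bytes:
    """Decode base64url text whose '=' padding has been stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def b64url_encode(data: bytes) -> str:
    """Encode bytes as base64url without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def recover_payload(signed: bytes, protected: str) -> bytes:
    """Rebuild the signed bytes from a manifest with an embedded signature block.

    Assumption: the payload is the first formatLength bytes of the file
    followed by the decoded formatTail.
    """
    header = json.loads(b64url_decode(protected))
    head = signed[: header["formatLength"]]
    return head + b64url_decode(header["formatTail"])


# The "protected" value from the example quoted in the comments.
PROTECTED = (
    "eyJmb3JtYXRMZW5ndGgiOjYwNjMsImZvcm1hdFRhaWwiOiJDbjAiLCJ0aW1lIjoi"
    "MjAxNC0wOS0xMVQxNzoxNDozMFoifQ"
)
example_header = json.loads(b64url_decode(PROTECTED))
print(example_header)

# Toy round trip with a hypothetical two-field manifest.
payload = b'{\n   "name": "example/image",\n   "tag": "latest"\n}'
tail = b"\n}"
head_len = len(payload) - len(tail)
toy_protected = b64url_encode(
    json.dumps({"formatLength": head_len, "formatTail": b64url_encode(tail)}).encode()
)
signed = payload[:head_len] + b',\n   "signatures": []' + tail
recovered = recover_payload(signed, toy_protected)
```

Under this reading, inserting the signature block changes the file but not the bytes that were signed, which is what keeps the legible format verifiable.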
Example: (human readable format)
Content Addressable Layers
Each layer of an image will be referenced by a checksum created from its
contents. This checksum will be used on push and pull to verify that the
contents have not been tampered with, and to prevent the layer referred to in
the manifest from being changed after the manifest is signed.
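A minimal sketch of such a check on pull, assuming a plain SHA-256 over the layer bytes and a `sha256:<hex>` reference string. The proposal's examples use `tarsum+sha256`, which is computed differently, so this shows the idea rather than the proposed algorithm; `layer_reference` and `verify_layer` are hypothetical names.

```python
import hashlib
import hmac


def layer_reference(data: bytes) -> str:
    """Build a content-addressable reference from the layer bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_layer(data: bytes, expected: str) -> bool:
    """Return True when the bytes match the reference listed in the manifest."""
    algorithm, _, hex_digest = expected.partition(":")
    if algorithm != "sha256":
        raise ValueError("unsupported checksum type: " + algorithm)
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison; not strictly required for public digests.
    return hmac.compare_digest(actual, hex_digest)


layer = b"hypothetical layer bytes"
reference = layer_reference(layer)
print(verify_layer(layer, reference))
print(verify_layer(layer + b"tampered", reference))
```

Because the reference is derived from the content, swapping a layer after signing changes its reference and no longer matches the signed manifest.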
History
For auditability and assurance of the image, there will be a history
section. This history will convey the life of the image (build steps, ancestry,
prior attestations on parent images, etc.).
It will have a generic form, and it is important to note that its content is
included in the signed payload.
Signature
Every client and daemon will hold its own public/private key pair which can be
used to sign manifests, so that the user on the host that publishes (or builds)
the image can sign the image manifest without sharing their keys with all users
on the host.
Verification
Note: The verification framework will be vetted in a separate proposal
review, but the following is provided for a complete picture of its role.
Verification of a manifest will be done by checking the public key used to sign
the image manifest against an authorization graph linking keys to users and the
image namespaces. The authority for this graph will be a remote server which
can respond to authorization queries with signed statements which can be cached
and imported locally for future authorizations. The signed statements which
are received and cached contain a chain of trust which verifies their
authenticity. A root certificate to verify this chain will ship with Docker to
allow immediate verification of these statements. Certificates are x509
certificates, and verification uses an x509 chain included with the signed
statement. The signed statement will be a JSON Web Signature whose contents are
a series of graph nodes and edges to be imported, with the x509 chain in the
signature header.
http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-31#section-4.1.6
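The graph lookup itself might look like the sketch below, assuming the imported nodes and edges are held as a simple adjacency map from keys to users to namespaces. The node naming scheme (`key:`, `user:`, `namespace:` prefixes), the edge data, and `is_authorized` are all hypothetical; checking the signed statements against the x509 chain, which has to happen before edges are trusted, is not shown.

```python
from collections import deque

# Hypothetical edges, as if imported from already-verified signed statements.
GRAPH = {
    "key:LYRA:YAG2:QQKS:376F:QQXY:3UNK:SXH7:K6ES:Y5AU:XUN5:ZLVY:KBYL": ["user:dmcgowan"],
    "user:dmcgowan": ["namespace:dmcgowan"],
}


def is_authorized(graph: dict, key_id: str, image_name: str) -> bool:
    """Return True if the key can reach the image's namespace in the graph."""
    namespace = image_name.split("/", 1)[0]
    target = "namespace:" + namespace
    start = "key:" + key_id
    seen = {start}
    queue = deque([start])
    # Breadth-first search over the delegation edges.
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


KEY_ID = "LYRA:YAG2:QQKS:376F:QQXY:3UNK:SXH7:K6ES:Y5AU:XUN5:ZLVY:KBYL"
print(is_authorized(GRAPH, KEY_ID, "dmcgowan/test-image"))
print(is_authorized(GRAPH, KEY_ID, "someone-else/test-image"))
```

Caching the signed statements locally would let a lookup like this run without contacting the remote authority on every pull.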
Registry V2 API
(Note: this versioning is on the API of talking to a docker registry)
Unlike the present /v1/..., which is locked to the root of the URI path,
/v2/... is expected to be the relative root of the path, for easier route
handling.
Manage manifest by tag
GET/PUT/DELETE /v2/images/<imgname>/<tagname>
List tags
GET /v2/images/<imgname>/tags
Download an image layer by content id
GET /v2/images/<imgname>/<sumtype>/<sum>
Upload an image layer
PUT /v2/images/<imgname>/<sumtype>/<sum>
Upload an image layer
PUT /v2/images/<imgname>/<sumtype>
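As an illustration of the relative-root idea, the helpers below build the endpoint URLs listed above under an arbitrary registry prefix. The registry root URL and the function names are hypothetical, and only the three routes with unambiguous shapes are covered.

```python
def _join(root: str, *parts: str) -> str:
    """Join path parts under <root>/v2/images, tolerating a trailing slash."""
    return "/".join([root.rstrip("/"), "v2", "images", *parts])


def manifest_url(root: str, image: str, tag: str) -> str:
    """URL for GET/PUT/DELETE of a manifest by tag."""
    return _join(root, image, tag)


def tags_url(root: str, image: str) -> str:
    """URL for listing the tags of an image."""
    return _join(root, image, "tags")


def layer_url(root: str, image: str, sum_type: str, checksum: str) -> str:
    """URL for downloading or uploading a layer by content id."""
    return _join(root, image, sum_type, checksum)


# Hypothetical registry mounted below a path prefix rather than at the host root.
ROOT = "https://registry.example.com/mirror/"
print(manifest_url(ROOT, "dmcgowan/test-image", "latest"))
print(tags_url(ROOT, "dmcgowan/test-image"))
print(layer_url(ROOT, "dmcgowan/test-image", "sha256", "abc123"))
```

Since every route hangs off `/v2/`, the same helpers work whether the registry sits at the host root or behind a prefix.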
Compatibility
For compatibility with prior versions of docker registries and docker daemons,
the manifest will store the JSON metadata used in previous versions in the
history section. This history will allow recreation of the layers in the
previous format and layout. Version 2 registries can synchronize content with
version 1 registries using this content in order to ensure content is still
accessible through the version 1 API.
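A sketch of how the old-style metadata could be read back out of a manifest, assuming (as in the example quoted in the comments) that each history entry is a string-encoded JSON object with `id` and optional `parent` fields, ordered from newest layer to oldest. The ids below are shortened, hypothetical values, and `v1_metadata` and `check_ancestry` are illustrative names.

```python
import json


def v1_metadata(manifest: dict) -> list:
    """Decode the string-encoded v1 JSON stored in each history entry."""
    return [json.loads(entry) for entry in manifest["history"]]


def check_ancestry(layers: list) -> bool:
    """Check that every entry's parent is the id of the entry after it."""
    for child, parent in zip(layers, layers[1:]):
        if child.get("parent") != parent["id"]:
            return False
    # The oldest layer is expected to have no parent.
    return "parent" not in layers[-1] if layers else True


# Hypothetical manifest with shortened ids, newest layer first.
manifest = {
    "history": [
        json.dumps({"id": "a9eb", "parent": "120e"}),
        json.dumps({"id": "120e", "parent": "5111"}),
        json.dumps({"id": "5111", "comment": "Imported from -"}),
    ]
}
layers = v1_metadata(manifest)
print([layer["id"] for layer in layers])
print(check_ancestry(layers))
```

With the per-layer JSON recovered this way, a version 2 registry would have what it needs to serve the same image through version 1 style endpoints.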
There will be a couple of phases.
Phase 1
Have v2 capable registry and docker daemon that can:
Phase 2
Phase 3
Strictly V2
In the future, when signatures are enforced more strictly, it will become more
difficult to do this synchronization, as version 1 will not validate signatures
and version 2 manifests created from version 1 registries will not have a
signature of the builder.
Attribution
Folks involved in this design so far
@dmcgowan @dmp42 @jlhawn @shykes and @vbatts