background
Per the spec, a blob can be uploaded in a single POST request, which the spec calls a monolithic upload.
A blob can also be uploaded via a POST followed by a PUT, which the spec also calls a monolithic upload.
This proposal is about the first one. I call it a "single request monolithic upload", abbreviated as "SRMU" in the rest of this proposal.
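To make the distinction concrete, here is a rough sketch of the requests each flow sends. The registry host, repository name, and the `Request` helper are made up for illustration, headers are omitted, and in practice the `location` value comes from the POST response rather than being passed in.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


@dataclass
class Request:
    """Hypothetical description of an HTTP request (illustration only)."""
    method: str
    url: str
    body: bytes = b""


def srmu_requests(registry, repo, digest, blob):
    """SRMU: one POST that carries both the digest and the blob bytes."""
    query = urlencode({"digest": digest})
    url = f"{registry}/v2/{repo}/blobs/uploads/?{query}"
    return [Request("POST", url, blob)]


def post_put_requests(registry, repo, digest, blob, location):
    """POST/PUT monolithic upload: an empty POST opens a session, then a
    PUT sends the blob to the Location the registry returned."""
    post = Request("POST", f"{registry}/v2/{repo}/blobs/uploads/")
    # Append the digest while keeping any query parameters the registry
    # already put in the Location URL.
    parts = urlsplit(location)
    query = parse_qsl(parts.query) + [("digest", digest)]
    put_url = urlunsplit(parts._replace(query=urlencode(query)))
    return [post, Request("PUT", put_url, blob)]
```

The difference that matters for the rest of this proposal: in the SRMU case the blob bytes arrive before the registry has had any chance to answer with a Location.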
I've put specific, actionable proposals in bold.
I can send a PR that implements my proposal, but I'd like to solicit some feedback first to see if anyone feels strongly opposed.
proposal
- We should remove SRMUs from the spec. As with catalog, the lack of broad support and the high cost make them unfit for the spec.
- We should clarify the ambiguity between the 201 and 202 response codes, whether or not we deprecate SRMUs.
(Optional aside (3): We should have a deprecation document. At one point we deprecated catalog, but I can't seem to find any reference to it after #178. It can be helpful to clearly document the differences, as in opencontainers/image-spec#817.)
rationale
ambiguity
This API is not well defined, so it is unclear how clients should interpret the registry's response.
For cross-repo mounting, if the mount failed for any reason, the registry will return a 202 that indicates you should proceed with the upload as usual. If the mount succeeds, the registry will return a 201, and clients can skip the upload.
For SRMUs, we don't have such a distinction. From the current spec at HEAD:
Successful completion of the request MUST return either a 201 Created or a 202 Accepted
See #230 (comment), where a test enforcing the spec (202 is allowed) would pass for a registry even though that registry doesn't support SRMUs.
At the very least, we should update the spec to say that clients should interpret 201 as "upload succeeded" and 202 as "proceed with upload at Location".
This gives a nice failure mode for registries that don't support SRMUs. If a registry doesn't bother to check the request body or the ?digest querystring parameter of POST requests, it will likely just return a 202 with a Location, as for a normal chunked upload. Clients that see this should probably assume that the registry does not support SRMUs, and proceed with a chunked upload or a POST/PUT monolithic upload.
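A client following that interpretation might look roughly like the sketch below. The `send` callable is a hypothetical stand-in for an HTTP client that returns `(status, headers)`; the sketch also assumes the Location is an absolute URL and that header names are spelled exactly `Location`, neither of which a real client should rely on.

```python
def upload_blob(send, upload_url, digest, blob):
    """Try an SRMU first; fall back to POST/PUT if the registry answers 202.

    `send(method, url, body)` is an assumed transport hook, not a real
    library API. Returns the name of the flow that completed the upload.
    """
    status, headers = send("POST", f"{upload_url}?digest={digest}", blob)

    if status == 201:
        # 201 Created: the registry accepted the blob from the POST body.
        return "srmu"

    if status == 202 and "Location" in headers:
        # 202 Accepted: treat this as "SRMU not supported" and finish the
        # upload with a PUT to the returned Location.
        location = headers["Location"]
        separator = "&" if "?" in location else "?"
        status, _ = send("PUT", f"{location}{separator}digest={digest}", blob)
        if status == 201:
            return "post-put"

    raise RuntimeError(f"blob upload failed with status {status}")
```

With this reading, a registry that ignores the POST body costs the client one wasted request body rather than a failed push.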
lack of support
From #211 (comment)
The POST/PATCH/PUT chunked uploading strategy is supported by every registry I've encountered (probably because that's what docker did), so we've modified our clients to only use this flow. Monolithic support may have caught up in the meantime, but as of a couple years ago, there were several smaller registries that did not support monolithic blob uploads.
I believe containerd uses the POST/PUT monolithic upload strategy, so that is probably also widely supported (though not as widely as chunked).
"A single POST request" should possibly be removed
In the original spec, this method of blob upload was only barely mentioned as an optional thing for the blob upload initiation:
Optionally, if the digest parameter is present, the request body will be used to complete the upload in a single request.
I worry that not a lot of registries support this.
Deprecation should not imply that registries that currently support this stop, so any clients that rely on a specific registry implementing this should still just work. I know of a single client that actually uses this upload mechanism, so there is very low cost to deprecating it. It's also a very easy fix, client-side, if a registry decided to stop supporting it.
I've just tried to perform this against Docker Hub, and it doesn't work. The upload POST response has a 202 status with a Location header, and the subsequent manifest PUT fails with MANIFEST_BLOB_UNKNOWN, presumably because the SRMU failed.
GCR does support this, but I'd rather it didn't.
(I have not tried other registries, feel free to pile on with support or lack of support on your favorite registry.)
cost
Also from #211 (comment)
This is going to be the most expensive blob upload method to implement for registries because there is no opportunity for the registry to offload the byte ingestion to a separate service (via the Location header, as in the other methods).
I'd argue for removing it entirely, but would be happy if there were some caveats around its use. Derek mentioned maybe we could add something about "optimizations" like this, maybe in the FAQ from vsoch. There's no discussion in the spec about why you would choose one of these methods over another, which could go in that FAQ. No matter what we decide, this should definitely not be the first upload method that readers encounter.
For some more discussion of why the SRMU is bad, see this great post from Backblaze.
Specifically relevant:
The interface to upload data into Amazon S3 is actually a bit simpler than Backblaze B2’s API. But it comes at a literal cost. It requires Amazon to have a massive and expensive choke point in their network: load balancers. When a customer tries to upload to S3, she is given a single upload URL to use. For instance, http://s3.amazonaws.com/. This is great for the customer as she can just start pushing data to the URL. But that requires Amazon to be able to take that data and then, in a second step behind the scenes, find available storage space and then push that data to that available location. The second step creates a choke point as it requires having high bandwidth load balancers. That, in turn, carries a significant customer implication; load balancers cost significant money.
This is analogous to what we have in the registry spec. The registry is responsible for handling the blob ingress and cannot easily hand off that traffic to another service via the Location header.
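As a rough illustration of the cost argument, consider a toy registry frontend. Everything here is hypothetical (the `INGEST_HOST` name, the handler signature, and the idea of returning the number of bytes the frontend had to handle); it is not how any particular registry is implemented.

```python
import uuid

# Hypothetical separate service that ingests blob bytes.
INGEST_HOST = "https://ingest.registry.example"


def handle_upload_post(repo, query, body):
    """Toy handler for POST /v2/<repo>/blobs/uploads/.

    Returns (status, headers, bytes_handled_by_frontend).
    """
    if "digest" in query and body:
        # SRMU: the bytes are already in this request, so the frontend
        # has to ingest them itself. There is nothing left to redirect.
        location = f"/v2/{repo}/blobs/{query['digest']}"
        return 201, {"Location": location}, len(body)

    # Other flows: open a session and point the client at the ingestion
    # service, so the blob bytes never pass through the frontend.
    session = uuid.uuid4()
    location = f"{INGEST_HOST}/v2/{repo}/blobs/uploads/{session}"
    return 202, {"Location": location}, 0
```

In the second branch the frontend handles zero blob bytes, which is the hand-off that SRMUs rule out.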