Skip to content

Add containers streaming API#2347

Merged
crosbymichael merged 1 commit intocontainerd:masterfrom
crosbymichael:streamingapis
Jul 26, 2018
Merged

Add containers streaming API#2347
crosbymichael merged 1 commit intocontainerd:masterfrom
crosbymichael:streamingapis

Conversation

@crosbymichael
Copy link
Copy Markdown
Member

This adds streaming functionality via GRPC to not overload the message size limit. The containers.Store interface needs work because I'm not happy with it but help is needed. We need to make sure the client works will older daemons so we cannot replace the default implementation easily so we we will have figureout how we want to handle this client list command.

Fixes #2320

Signed-off-by: Michael Crosby [email protected]

@crosbymichael crosbymichael added this to the 1.2 milestone May 17, 2018
@codecov-io
Copy link
Copy Markdown

codecov-io commented May 17, 2018

Codecov Report

Merging #2347 into master will increase coverage by 4.37%.
The diff coverage is 7.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2347      +/-   ##
==========================================
+ Coverage   44.76%   49.13%   +4.37%     
==========================================
  Files          93       84       -9     
  Lines        9555     7449    -2106     
==========================================
- Hits         4277     3660     -617     
+ Misses       4585     3114    -1471     
+ Partials      693      675      -18
Flag Coverage Δ
#linux 49.13% <7.14%> (+0.14%) ⬆️
#windows ?
Impacted Files Coverage Δ
errdefs/grpc.go 70.14% <0%> (-4.86%) ⬇️
metadata/containers.go 49.8% <0%> (+1.82%) ⬆️
errdefs/errors.go 100% <100%> (ø) ⬆️
remotes/docker/fetcher.go 49.01% <0%> (-5.33%) ⬇️
cio/io_unix.go 70.65% <0%> (-4.07%) ⬇️
remotes/docker/auth.go 63.82% <0%> (-3.97%) ⬇️
remotes/docker/status.go 21.42% <0%> (-3.58%) ⬇️
platforms/cpuinfo.go 4.65% <0%> (-1.01%) ⬇️
log/context.go 36.84% <0%> (-0.66%) ⬇️
namespaces/context.go 55% <0%> (-0.56%) ⬇️
... and 66 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d02728f...400f16f. Read the comment docs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to ListStreamMessage as response type

@Random-Liu
Copy link
Copy Markdown
Member

Random-Liu commented May 18, 2018

The api seems cleaner if we can change container.List to return a stream instead of adding another function ListStream, but that will be a breaking api change. :( So it seems easier to just add a new function as this PR does.

And for cri, we don't have this problem, because we removed grpc between the plugin and containerd. We can keep using the original List function.

@crosbymichael
Copy link
Copy Markdown
Member Author

I wonder if we could make the call and if we get a not found, we fallback to the List implementation?

@crosbymichael crosbymichael force-pushed the streamingapis branch 2 times, most recently from 2dbcfd5 to 3ad1e87 Compare May 21, 2018 15:46
@crosbymichael
Copy link
Copy Markdown
Member Author

Ok, pushed an update that cleaned this up more. Now the method name on the Store is just Stream and it returns channels for the results and errors, it's modeled after the events subscribe. The client method is ContainersStream and has the same channel output

@dmcgowan
Copy link
Copy Markdown
Member

What is the advantage of the streaming approach vs adding limit/offset to the request? My primary concern is that we will need to do the same change for content, snapshotters, and anything else which returns a list. This will establish the API pattern of how we return these lists.

@crosbymichael
Copy link
Copy Markdown
Member Author

@dmcgowan ya I know. I think streaming is a little easier to handle state server side than managing when a request has already seen what containers/etc. Because containers, snapshotters, etc can come and go, with an offset, how do you manage that between requests? We still have some of the problem with streaming where it can return a container that was later removed but we can atleast say that the stream returns all valid containers at the time the stream was initiated.

Now that I think of it more, it would almost be better that some of the list commands support just returning the ids and when you need to interact with something you have to do a Get on that id which returns if it exists or not at the time you start doing actions on it. I'm not sure though.

@mcluseau
Copy link
Copy Markdown

I think that just returning the IDs and doing thousands of get after is not the approach I like the most. I didn't took a look at the details of the container structure but, having seen a 1GB response for 5k containers (200kb per container), surely there could be an "in between" approach with a stripped down ContainerListItem, avoiding massive gets in 90%+ of the use cases.

The streaming approach as you (@crosbymichael) describe is exactly what I intuitively assumed from the name ListStream (or maybe ListAsStream). If the name is just Stream, my intuition will tell me that I ask for a long running query, following the changes (new, changed, removed) forever. Also, I think the streaming approach has the advandage of being more developer friendly (you just iterate on the stream), while limit/offset doesn't really provides a safety net against the response size limit: people will just assume safe values like 500 but if list items grow for any reason in the future, it will start to blow up everywhere.

@crosbymichael
Copy link
Copy Markdown
Member Author

@mcluseau actually if we remove the Spec any type from the container it will probably reduce the response size greatly as in the general usecase, the majority of the data is on the spec. In the kube case, I think they add a lot of labels/extensions with pod info so it may not be a as major but much less than what it is today.

This is another option but we would still have a limit based on number of containers at some point, even if that point is 10k-100k containers.

@mcluseau
Copy link
Copy Markdown

@crosbymichael I like the ListStream approach, reducing the response size is more at the optimization level (a non-primary goal; if the list can be done the manager can clean the containers). If container IDs are UUID, 16 bytes, then the limit is already 1M containers with the current default limit. And even if we consider a limit of 10k, we're talking about 10k gets as soon as we want the name or the image the container is running. We may then introduce a batch get but that's kind of coming back to the issue of "how many is safe". Streaming is, I think, the safest and simplest solution.

@dmcgowan dmcgowan requested a review from stevvooe May 21, 2018 21:53
@Random-Liu
Copy link
Copy Markdown
Member

Random-Liu commented May 23, 2018

@crosbymichael Just to confirm. This won't break backward compatibility, right?

Docker CE 17.11 will still work with containerd 1.2, right? Just 1.2 containerd client won't work with older containerd version.

@crosbymichael
Copy link
Copy Markdown
Member Author

@Random-Liu it should not break compat. But I will also verify

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Random-Liu
Copy link
Copy Markdown
Member

@crosbymichael Are we going to use this in Docker? Do we need to handle unimplemented error in Docker? What is the version guarantee between Docker and containerd, given that containerd api version is not changed.

@crosbymichael
Copy link
Copy Markdown
Member Author

I'll update this PR this week to handle fallback.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do a repeated here so that the server can block messages efficiently.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's hard to tell the size of the serialized proto so multiple will just introduce the same issues

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ListContainer**s**Message

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only returning one still.

Comment thread containers/containers.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably define the error semantics for per container, as well as stream shutdown.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking, from our discussions, we will just hide this behind the List method. We never really wanted it to be a constant stream of containers, just an efficient way to return many containers with grpc. If I hide it behind the List method then we will just do the fallback if the stream rpc does not exist.

Comment thread containers/containers.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ListStream to match service in case we add new streaming methods?

@stevvooe
Copy link
Copy Markdown
Member

Looks mostly good. I have a few nits. Hope I didn't re-hash old concerns.

@crosbymichael crosbymichael force-pushed the streamingapis branch 2 times, most recently from 949dcc9 to 7c1d0ce Compare July 25, 2018 18:25
@crosbymichael
Copy link
Copy Markdown
Member Author

Ok, this is ready to be reviewed again. The implementation is done by default in the existing List methods and it will automatically fallback if a client is talking to an older daemon without the streaming api

Comment thread containerstore.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to allow this to return a partial list? Normally this is a cancellation and return nil, ctx.Err()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can update

@dmcgowan
Copy link
Copy Markdown
Member

LGTM

@crosbymichael
Copy link
Copy Markdown
Member Author

@stevvooe @Random-Liu @estesp can you review again?

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor spelling typo; otherwise LGTM

Comment thread containerstore.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor spelling: available

Signed-off-by: Michael Crosby <[email protected]>
@crosbymichael
Copy link
Copy Markdown
Member Author

@estesp updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants