Description
Reading a large blob causes hanging goroutines of type grpc.newClientStream in the client.
Detail
It appears that each time we call ReadAt(), it creates a new stream, which is not marked as ClientStreams: true. This, in turn, makes it mark lastSent: true immediately, and never closes the client stream.
Each "ReadAt" chunk is 1MB, so you end up with one lingering goroutine per 1MB of blob size. Pull a 1GB blob, get ~1000 goroutines.
Steps to reproduce the issue:
I am working on a narrow reproducibility case, which I will post here as soon as I have code for it.
Output of containerd --version:
Latest client API
Any other relevant information:
I believe the source of it is here
Streams: []grpc.StreamDesc{
{
StreamName: "List",
Handler: _Content_List_Handler,
ServerStreams: true,
},
{
StreamName: "Read",
Handler: _Content_Read_Handler,
ServerStreams: true,
},
{
StreamName: "Write",
Handler: _Content_Write_Handler,
ServerStreams: true,
ClientStreams: true,
},
},
Note that only the "Write" stream has ClientStreams set to true, while the others are not set (default to false).
When we call contentClient.Read here, we create a new stream, passing the desc of the "Read", which, as stated above, is defaulted to false:
func (c *contentClient) Read(ctx context.Context, in *ReadContentRequest, opts ...grpc.CallOption) (Content_ReadClient, error) {
stream, err := c.cc.NewStream(ctx, &_Content_serviceDesc.Streams[1], "/containerd.services.content.v1.Content/Read", opts...)
if err != nil {
return nil, err
}
x := &contentReadClient{stream}
if err := x.ClientStream.SendMsg(in); err != nil {
return nil, err
}
if err := x.ClientStream.CloseSend(); err != nil {
return nil, err
}
return x, nil
}
The really interesting part is when the above code calls SendMsg. Since ClientStreams is set to false, it immediately sets sentLast = true in grpc, see here:
if !cs.desc.ClientStreams {
cs.sentLast = true
}
Because of this, when we call CloseSend() in our code above, it gets ignored in grpc, see here
func (cs *clientStream) CloseSend() error {
if cs.sentLast {
// TODO: return an error and finish the stream instead, due to API misuse?
return nil
}
They even have a comment for it above: TODO: return an error and finish the stream instead, due to API misuse?
Unless I misunderstood, we are doing the API misuse they reference?
It is possible I completely misunderstood it, but I see no way how we can read data without creating those lingering zombie goroutines.
As soon as I can get a simple piece of code to reproduce it, I will post here. I can reproduce it, but only as part of a larger library, which will confuse.
Description
Reading a large blob causes hanging goroutines of type
grpc.newClientStreamin the client.Detail
It appears that each time we call
ReadAt(), it creates a new stream, which is not marked asClientStreams: true. This, in turn, makes it marklastSent: trueimmediately, and never closes the client stream.Each "ReadAt" chunk is 1MB, so you end up with one lingering goroutine per 1MB of blob size. Pull a 1GB blob, get ~1000 goroutines.
Steps to reproduce the issue:
I am working on a narrow reproducibility case, which I will post here as soon as I have code for it.
Output of
containerd --version:Latest client API
Any other relevant information:
I believe the source of it is here
Note that only the "Write" stream has
ClientStreamsset totrue, while the others are not set (default tofalse).When we call
contentClient.Readhere, we create a new stream, passing thedescof the"Read", which, as stated above, is defaulted tofalse:The really interesting part is when the above code calls
SendMsg. SinceClientStreamsis set tofalse, it immediately setssentLast = truein grpc, see here:Because of this, when we call
CloseSend()in our code above, it gets ignored in grpc, see hereThey even have a comment for it above:
TODO: return an error and finish the stream instead, due to API misuse?Unless I misunderstood, we are doing the API misuse they reference?
It is possible I completely misunderstood it, but I see no way how we can read data without creating those lingering zombie goroutines.
As soon as I can get a simple piece of code to reproduce it, I will post here. I can reproduce it, but only as part of a larger library, which will confuse.