Skip to content

zombie goroutines in client due to missing ClientStreams #4617

@deitch

Description

@deitch

Description

Reading a large blob causes hanging goroutines of type grpc.newClientStream in the client.

Detail

It appears that each time we call ReadAt(), it creates a new stream, which is not marked as ClientStreams: true. This, in turn, makes it mark lastSent: true immediately, and never closes the client stream.

Each "ReadAt" chunk is 1MB, so you end up with one lingering goroutine per 1MB of blob size. Pull a 1GB blob, get ~1000 goroutines.

Steps to reproduce the issue:

I am working on a narrow reproducibility case, which I will post here as soon as I have code for it.

Output of containerd --version:

Latest client API

Any other relevant information:

I believe the source of it is here

	Streams: []grpc.StreamDesc{
		{
			StreamName:    "List",
			Handler:       _Content_List_Handler,
			ServerStreams: true,
		},
		{
			StreamName:    "Read",
			Handler:       _Content_Read_Handler,
			ServerStreams: true,
		},
		{
			StreamName:    "Write",
			Handler:       _Content_Write_Handler,
			ServerStreams: true,
			ClientStreams: true,
		},
	},

Note that only the "Write" stream has ClientStreams set to true, while the others are not set (default to false).

When we call contentClient.Read here, we create a new stream, passing the desc of the "Read", which, as stated above, is defaulted to false:

func (c *contentClient) Read(ctx context.Context, in *ReadContentRequest, opts ...grpc.CallOption) (Content_ReadClient, error) {
	stream, err := c.cc.NewStream(ctx, &_Content_serviceDesc.Streams[1], "/containerd.services.content.v1.Content/Read", opts...)
	if err != nil {
		return nil, err
	}
	x := &contentReadClient{stream}
	if err := x.ClientStream.SendMsg(in); err != nil {
		return nil, err
	}
	if err := x.ClientStream.CloseSend(); err != nil {
		return nil, err
	}
	return x, nil
}

The really interesting part is when the above code calls SendMsg. Since ClientStreams is set to false, it immediately sets sentLast = true in grpc, see here:

	if !cs.desc.ClientStreams {
		cs.sentLast = true
	}

Because of this, when we call CloseSend() in our code above, it gets ignored in grpc, see here

func (cs *clientStream) CloseSend() error {
	if cs.sentLast {
		// TODO: return an error and finish the stream instead, due to API misuse?
		return nil
	}

They even have a comment for it above: TODO: return an error and finish the stream instead, due to API misuse?

Unless I misunderstood, we are doing the API misuse they reference?

It is possible I completely misunderstood it, but I see no way how we can read data without creating those lingering zombie goroutines.

As soon as I can get a simple piece of code to reproduce it, I will post here. I can reproduce it, but only as part of a larger library, which will confuse.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions