[storage] "stream error: stream ID x; INTERNAL_ERROR" #784

@JeanMertz

We are reading tens of millions of objects from GCS and seem to be hitting an issue where, after files have been processed for a while, the error "stream error: stream ID 4163; INTERNAL_ERROR" is returned.

It's pretty hard to debug, as it takes several hours before the issue occurs, but we've now hit it twice in a row.

We are using version eaddaf6dd7ee35fd3c2420c8d27478db176b0485 of the storage package.

Here's the pseudo code of what we are doing:

cs, err := cloudstorage.NewClient(ctx)
// err...
defer cs.Close()
b := cs.Bucket(...)
q := &storage.Query{Prefix: ...}
it := b.Objects(ctx, q)
for {
	a, err := it.Next()
	if err == iterator.Done {
		break
	}
	if err != nil {
		// err...
	}

	handleObject(...)
}

We have retry logic built into the handleObject function, but even retrying doesn't help. Also, once the error shows up, it never goes away: every subsequent read, of any line in any file, returns the same error.

We're thinking of building some retry logic around the client itself, closing it and opening a new one, to see if that works. We're still digging deeper, but I wanted to report this nonetheless in case anyone else has also run into it.
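The client-level retry idea could look roughly like the sketch below. It is not taken from our code base and has not been tested against GCS: `withFreshClient`, `isStreamError`, `fakeClient` and `errStreamExample` are hypothetical names, the fake client stands in for the real storage client so the example runs without dependencies, and matching on the error text is an assumption rather than a documented way to detect this failure.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errStreamExample mimics the error text from this report (illustration only).
var errStreamExample = errors.New("stream error: stream ID 4163; INTERNAL_ERROR")

// isStreamError reports whether err looks like the stream error above.
// Matching on the message text is an assumption made to keep the sketch
// dependency-free.
func isStreamError(err error) bool {
	return err != nil &&
		strings.Contains(err.Error(), "stream error") &&
		strings.Contains(err.Error(), "INTERNAL_ERROR")
}

// withFreshClient runs work against a client built by newClient. If work
// fails with a stream error, the client is closed and a new one is created,
// for at most maxAttempts attempts. Any other error is returned right away.
func withFreshClient[C io.Closer](maxAttempts int, newClient func() (C, error), work func(C) error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		c, err := newClient()
		if err != nil {
			return fmt.Errorf("creating client (attempt %d): %w", attempt, err)
		}
		err = work(c)
		closeErr := c.Close() // always release the old client before retrying
		if err == nil {
			return closeErr
		}
		if !isStreamError(err) {
			return err
		}
		lastErr = err
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

// fakeClient is a stand-in for the real storage client.
type fakeClient struct {
	id     int
	closed bool
}

func (f *fakeClient) Close() error { f.closed = true; return nil }

func main() {
	created := 0
	err := withFreshClient(3,
		func() (*fakeClient, error) { created++; return &fakeClient{id: created}, nil },
		func(c *fakeClient) error {
			if c.id == 1 {
				return errStreamExample // the first client "breaks"
			}
			return nil
		})
	fmt.Println("clients created:", created, "err:", err)
}
```

In our case, newClient would wrap the NewClient call from the pseudo code above, and work would contain the listing loop. The loop would also need to remember how far it got, since a new client means a new iterator; that part is left out here.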

Labels: 🚨 This issue needs some love, api: storage, priority: p1, type: bug
