Flaky test: awslogs tests failing with ResourceAlreadyExistsException #39857

@thaJeztah

Not sure whether this is related to similar errors, but some race conditions in this area were fixed in the past (see #22963 and #22985).

Seen this failing a couple of times:

Failure on RS1 (I may have seen this on other platforms as well): https://ci.docker.com/public/blue/rest/organizations/jenkins/pipelines/moby/branches/PR-39855/runs/1/nodes/39/log/?start=0

time="2019-09-03T10:15:19Z" level=info msg="Trying to get region from EC2 Metadata"
time="2019-09-03T10:15:19Z" level=info msg="Log stream already exists" errorCode=ResourceAlreadyExistsException logGroupName= logStreamName= message= origError="<nil>"
--- FAIL: TestLogBlocking (0.02s)
    cloudwatchlogs_test.go:313: Expected to be able to read from stream.messages but was unable to
time="2019-09-03T10:15:19Z" level=error msg=Error
time="2019-09-03T10:15:19Z" level=error msg="Failed to put log events" errorCode=InvalidSequenceTokenException logGroupName=groupName logStreamName=streamName message="use token token" origError="<nil>"
time="2019-09-03T10:15:19Z" level=error msg="Failed to put log events" errorCode=DataAlreadyAcceptedException logGroupName=groupName logStreamName=streamName message="use token token" origError="<nil>"
time="2019-09-03T10:15:19Z" level=info msg="Data already accepted, ignoring error" errorCode=DataAlreadyAcceptedException logGroupName=groupName logStreamName=streamName message="use token token"
FAIL
coverage: 78.2% of statements
FAIL	github.com/docker/docker/daemon/logger/awslogs	0.631s

The interesting bit in this case is that the test marked as failing appears to use a mockClient, so it should not be connecting to AWS at all:

func TestLogBlocking(t *testing.T) {
	mockClient := newMockClient()
	stream := &logStream{
		client:   mockClient,
		messages: make(chan *logger.Message),
	}
	errorCh := make(chan error, 1)
	started := make(chan bool)
	go func() {
		started <- true
		err := stream.Log(&logger.Message{})
		errorCh <- err
	}()
	<-started
	select {
	case err := <-errorCh:
		t.Fatal("Expected stream.Log to block: ", err)
	default:
		break
	}
	select {
	case <-stream.messages:
		break
	default:
		t.Fatal("Expected to be able to read from stream.messages but was unable to")
	}
	select {
	case err := <-errorCh:
		assert.NilError(t, err)
	case <-time.After(30 * time.Second):
		t.Fatal("timed out waiting for read")
	}
}

The other tests seem to reuse the same groupName and streamName constants, so I suspect this is just a race condition (and AWS doesn't like attempts to create duplicates):

const (
	groupName         = "groupName"
	streamName        = "streamName"
	sequenceToken     = "sequenceToken"
	nextSequenceToken = "nextSequenceToken"
	logline           = "this is a log line\r"
	multilineLogline  = "2017-01-01 01:01:44 This is a multiline log entry\r"
)

Should we add a unique suffix to groupName and streamName in the tests, for example by using the test name as the suffix (groupName+t.Name())?
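A minimal sketch of that suggestion, assuming a small helper is acceptable in the test file. uniqueName is a hypothetical name and does not exist in the awslogs package; in a real test the second argument would be t.Name(). Subtest names contain a "/", so the sketch replaces it to keep the result a single flat token.

```go
package main

import (
	"fmt"
	"strings"
)

// uniqueName is a hypothetical helper: it appends the test name to a shared
// base name so that each test works with its own group/stream name.
func uniqueName(base, testName string) string {
	// t.Name() for a subtest looks like "TestParent/child"; flatten the slash.
	return base + "-" + strings.ReplaceAll(testName, "/", "_")
}

func main() {
	// In a test this would be uniqueName(groupName, t.Name()).
	fmt.Println(uniqueName("groupName", "TestLogBlocking"))
	fmt.Println(uniqueName("streamName", "TestCreate/AlreadyExists"))
}
```

This only helps if the shared names are actually the source of the collisions; tests that assert on the literal values "groupName" or "streamName" would need to compare against the derived name instead.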
