daemon/logger/fluentd: deflake TestReadWriteTimeoutsAreEffective #52447

Open
firasmosbehi wants to merge 1 commit into moby:master from firasmosbehi:deflake-fluentd-timeout-test

@firasmosbehi

- What I did

Fixed two sources of flakiness in TestReadWriteTimeoutsAreEffective:

  1. Socket path too long on macOS: t.TempDir() on macOS produces paths that exceed the Unix domain socket path length limit (~104 bytes on macOS, ~108 on Linux). The test now falls back to a shorter path under /tmp when the generated path is too long.

  2. 10-second hangs in read_timeout subtest: noAckConnectionHandler kept the connection open until the test context expired (10s). When the fluentd client's read timeout didn't fire promptly, the test hung for the full duration, occasionally exceeding the overall test timeout. The handler now closes the connection after a 100ms delay instead of waiting for the test context.
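The first fix above can be sketched as a small helper. This is an illustration only, not the code from the PR: `socketDir`, `maxSocketPathLen`, and the `fluentd.sock` file name are hypothetical, and the threshold of 100 is taken from the snippet quoted in the review as a conservative value below both platform limits.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// maxSocketPathLen is an assumed, conservative threshold that stays below
// the ~104-byte (macOS) and ~108-byte (Linux) socket path limits.
const maxSocketPathLen = 100

// socketDir (hypothetical helper) returns preferred when a socket file called
// name inside it fits within maxSocketPathLen. Otherwise it creates a fresh
// directory under fallbackRoot and returns that instead. The returned cleanup
// function removes the fallback directory, if one was created.
func socketDir(preferred, fallbackRoot, name string) (string, func(), error) {
	if len(filepath.Join(preferred, name)) <= maxSocketPathLen {
		return preferred, func() {}, nil
	}
	dir, err := os.MkdirTemp(fallbackRoot, "fluentd-test")
	if err != nil {
		return "", nil, err
	}
	return dir, func() { os.RemoveAll(dir) }, nil
}

func main() {
	// Build a path that is certainly too long to hold a socket file.
	long := "/tmp"
	for len(long) <= maxSocketPathLen {
		long += "/deeply-nested"
	}

	dir, cleanup, err := socketDir(long, "/tmp", "fluentd.sock")
	if err != nil {
		fmt.Println("fallback failed:", err)
		return
	}
	defer cleanup()
	fmt.Println("socket path:", filepath.Join(dir, "fluentd.sock"))
}
```

In a test, `preferred` would be the result of `t.TempDir()`, and the cleanup function could be registered with `t.Cleanup`.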

Also increased the per-subtest timeout from 10s to 30s to provide more headroom under CI load.

- How to verify it

go test ./daemon/logger/fluentd/ -v -run TestReadWriteTimeoutsAreEffective

Fixes #51079

This test was flaky for two reasons:

1. On macOS, t.TempDir() produces very long paths that exceed the
   Unix domain socket path length limit (~104 bytes). Fall back to
   a shorter path under /tmp when needed.

2. The noAckConnectionHandler kept the connection open until the
   10-second test context expired. When the fluentd client's read
   timeout didn't fire promptly, the test hung for the full 10s,
   sometimes causing the overall test to exceed its timeout.
   Close the connection after a short delay instead of waiting for
   the test context.

Also increase the per-subtest timeout from 10s to 30s to provide
more headroom under CI load.

Fixes moby#51079

Signed-off-by: Kimi Code CLI <[email protected]>

@thaJeztah (Member) left a comment

looks like the DCO sign-off is not correct;

Signed-off-by: Kimi Code CLI <[email protected]>

@thaJeztah added the dco/no label on Apr 25, 2026

    // to a shorter path under /tmp.
    if len(socketFile) > 100 {
    	var err error
    	tmpDir, err = os.MkdirTemp("/tmp", "fluentd-test")

Perhaps we should use "" (use default) or explicitly os.TempDir() here, instead of hard-coding /tmp 🤔

Labels

area/daemon (Core Engine), area/logging, dco/no (automatically set by a bot when one of the commits lacks a proper signature)

Development

Successfully merging this pull request may close these issues.

flaky test: daemon/logger/fluentd TestReadWriteTimeoutsAreEffective

2 participants