Skip to content

Added support for write-timeout support to fluentd plugin.#35716

Closed
imumesh18 wants to merge 3 commits intomoby:masterfrom
imumesh18:write-timeout-config
Closed

Added support for write-timeout support to fluentd plugin.#35716
imumesh18 wants to merge 3 commits intomoby:masterfrom
imumesh18:write-timeout-config

Conversation

@imumesh18
Copy link
Contributor

@imumesh18 imumesh18 commented Dec 6, 2017

Signed-off-by: dungeonmaster18 [email protected]

- What I did

  1. I will be adding write-timeout support for fleuntd plugin
  2. I have replaced the dependency in vendor.conf from commit id with vesion tag which are related fluentd plugin.

- A picture of a cute animal (not mandatory but encouraged)
maxresdefault

@imumesh18 imumesh18 changed the title Add support for write-timeout support to fluentd plugin. Added support for write-timeout support to fluentd plugin. Dec 6, 2017
@imumesh18
Copy link
Contributor Author

imumesh18 commented Dec 6, 2017

Hey @thaJeztah and @cpuguy83 . I have added support for write timeout. Please review it whenever you have time. Also I have replaced the commit id with version tag in vendor.conf for all the libraries used by fluentd plugin.

P.S Once you guys think this is good just ping me I will start the work on docs and auto complete command for bash and zsh. Also I will squash the commits.

Thanks

@thaJeztah thaJeztah added area/logging kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/1-design-review and removed status/0-triage labels Dec 6, 2017
wtd, err := time.ParseDuration(info.Config[writeTimeoutKey])
if err != nil {
return nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check for negative durations and produce an error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we need to check what status-code is returned by the API in case of an invalid value (haven't checked yet) (make sure it doesn't return a 5xx error)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also check for extremely long timeout values to prevent daemon waiting on the write.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @anusha-ragunathan how long do you thing it should wait. I was thinking of something in minutes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah Actually I am completely new to these things so I am kind of learning how can I handle server error(5xx). It will take time but I will push the changes by next week. Actually I am bit busy with my college exams 🙈

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dungeonmaster18 I think logger has ring buffer for this case. @cpuguy83

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far I know ring buffer is nothing but the circular buffer with fixed size. Correct me if I am wrong @ripcurld0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dungeonmaster18 : regarding setting an upper limit on the timeout, lets skip that. Users should be aware of what they are setting for the timeout value. It's hard to predict a good max value. Also, I checked other timeouts. Other than disallowing negative values, there's no other check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the status-code; The API uses this code https://github.com/moby/moby/blob/master/api/server/httputils/errors.go#L33-L73 (and code from the errdefs package: https://github.com/moby/moby/tree/master/api/errdefs) to convert errors from the daemon into the correct HTTP status codes. You can try creating an API request with an incorrect value for the timeout, and check what status code is returned.

@thaJeztah
Copy link
Member

ping @anusha-ragunathan PTAL

defaultMaxRetries = math.MaxInt32

// Write() will not time out
defaultWriteTimeout = time.Duration(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still set a default timeout. The main thing we don't want to happen is to allow a dead logging backend lock up the daemon in some way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about 300ns?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do something much higher, we don't want to drop logs.
30s timeout should be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me.

@imumesh18
Copy link
Contributor Author

PTAL @thaJeztah @anusha-ragunathan @cpuguy83.

P.S should I address more 5xx errors. Right now I am only checking for 501 502 503. Would like to know your views on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this error handling. time.ParseDuration shouldn't be making an HTTP request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup it's wrong I realized later 🙈 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am bit confused with the http status code. Where should I be checking it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about InvalidParameter: write-timeout must be a positive integer @cpuguy83

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the error message that matters so much as the error type, which needs to implement this interface: https://github.com/moby/moby/blob/master/api/errdefs/defs.go#L9

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, this can just use errdefs.InvalidParameter(errors.New("some error")) now.

Signed-off-by: dungeonmaster18 <[email protected]>
writeTimeout := defaultWriteTimeout
i, _ := strconv.Atoi(info.Config[writeTimeoutKey])
if info.Config[writeTimeoutKey] != "" && i >= 0 {
wtd, err := time.ParseDuration(info.Config[writeTimeoutKey])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can directly assign to writeTimeout

@thaJeztah
Copy link
Member

ping @dungeonmaster18 could you have a look at the review comments? Looks like the change to use the errdefs #35716 (comment) is still current

@olljanat
Copy link
Contributor

@imumesh18 CI build fails to:

18:38:29 daemon/logger/fluentd/fluentd.go:52:37:warning: unnecessary conversion (unconvert)
18:38:30 Build step 'Execute shell' marked build as failure

And there is reviews above which needs to be addressed

@derek derek bot added the status/failing-ci Indicates that the PR in its current state fails the test suite label Dec 22, 2018
@akerouanton
Copy link
Member

akerouanton commented Apr 8, 2019

Looks like this PR is stall for quite some time now. Would you mind if I resubmit it with review comments fixed? We actually got hit by a bug fixed here: as mentioned by cpuguy83 in this comment, having no default timeout value leads to daemon lock up whenever fluentd times out, thus producing the same symptoms as described in #38092 (non-blocking mode fixes the issue through).

@imumesh18 imumesh18 closed this Apr 25, 2025
@imumesh18 imumesh18 deleted the write-timeout-config branch April 25, 2025 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/logging kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/failing-ci Indicates that the PR in its current state fails the test suite status/1-design-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants