Improve logging of long log lines#22982
Conversation
418dbb3 to
3954c88
Compare
|
ping @unclejack perhaps you can have a look as well? |
|
Revised to subject each call to the log driver's Log() method to a check if the copier's been Close()d, to make TestCopierSlow() pass. |
|
The journald reader was keying off of the "partial" flag incorrectly, fixed. |
7d17215 to
4f506c1
Compare
|
@thaJeztah Thanks! @nalind Can you add a benchmark to benchmark the numbers before and after, please? |
fd9c303 to
d166069
Compare
|
Sure. The larger message sizes are unlikely, so I tweaked the maximum size down to 16k to get better results for smaller sizes. Before:
After:
|
daemon/logger/copier.go
Outdated
There was a problem hiding this comment.
I'm a little not comfortable with passing part of a slice to Log. I'm not sure about current loggers if they can stash messages and send them later, but that would lead to messages corruption.
There was a problem hiding this comment.
We could allocate msg.Line and copy the slice into it every time, at the cost of a hit to speed and memory usage. Some of the log driver internals are complicated enough that I can't say for sure that we don't need to worry about that.
There was a problem hiding this comment.
Okay, dug into it.
- awslogs buffers Message structures on a channel and processes multiple Messages in batches in a separate goroutine. It could be tripped up by using slices.
- etwlogs uses the slice to build a string using fmt.Sprintf, so it'd be fine.
- fluentd copies the byte slice into a string immediately, so it'd be fine.
- gcplogs copies the byte slice into a string immediately, so it'd be fine.
- gelf copies the byte slice into a string immediately, so it'd be fine.
- journald copies the byte slice into a string immediately, so it'd be fine.
- jsonfilelog marshals the byte slice into a JSON blob immediately, so it'd be fine.
- splunk copies the byte slice into a string immediately, so it'd be fine.
- syslog copies the byte slice into a string immediately, so it'd be fine.
I'm tempted to make awslogs copy the data, and let the rest use slices, because otherwise we're copying the log data twice for most log drivers (once in Copier, and again in the log driver).
There was a problem hiding this comment.
I'm ok with this, let's document this also.
There was a problem hiding this comment.
I've added notes around the definition of Message and the new CopyMessage function, and a very short note in the awslog driver where it calls CopyMessage(). @LK4D4 is this what you were thinking, or are there other places this should be called out?
|
I like it very much. thanks @nalind |
Add a benchmark for measuring how the logger.Copier implementation handles logged lines of sizes ranging up from 64 bytes to 256KB. Signed-off-by: Nalin Dahyabhai <[email protected]>
daemon/logger/journald/read.go
Outdated
There was a problem hiding this comment.
I feel strlen("CONTAINER_PARTIAL_MESSAGE=true") is better than the magic number 30
There was a problem hiding this comment.
Okay, changing that.
This change updates how we handle long lines of output from the
container. The previous logic used a bufio reader to read entire lines
of output from the container through an intermediate BytesPipe, and that
allowed the container to cause dockerd to consume an unconstrained
amount of memory as it attempted to collect a whole line of output, by
outputting data without newlines.
To avoid that, we replace the bufio reader with our own buffering scheme
that handles log lines up to 16k in length, breaking up anything longer
than that into multiple chunks. If we can dispense with noting this
detail properly at the end of output, we can switch from using
ReadBytes() to using ReadLine() instead. We add a field ("Partial") to
the log message structure to flag when we pass data to the log driver
that did not end with a newline.
The Line member of Message structures that we pass to log drivers is now
a slice into data which can be overwritten between calls to the log
driver's Log() method, so drivers which batch up Messages before
processing them need to take additional care: we add a function
(logger.CopyMessage()) that can be used to create a deep copy of a
Message structure, and modify the awslogs driver to use it.
We update the jsonfile log driver to append a "\n" to the data that it
logs to disk only when the Partial flag is false (it previously did so
unconditionally), to make its "logs" output correctly reproduce the data
as we received it.
Likewise, we modify the journald log driver to add a data field with
value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is
true, and update its "logs" reader to refrain from appending a "\n" to
the data that it retrieves if it does not see this field/value pair (it
also previously did this unconditionally).
Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind)
|
LGTM ping @LK4D4 @cpuguy83 @unclejack |
|
ping anyone :) |
|
LGTM |
|
Let's merge it after 1.12 to not occasionally bring additional work with cherry-picks merging in release branch. |
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
|
@thaJeztah I take it this did not land in 1.12.x ? :( |
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
|
An easy way to reproduce the issue really fast: |
|
Please consider backporting to all your releases (1.9.x, 1.10.x, 1.11.x, 1.12.x) that are seriously affected by this bug. |
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
Upstream reference: moby#22982 This change updates how we handle long lines of output from the container. The previous logic used a bufio reader to read entire lines of output from the container through an intermediate BytesPipe, and that allowed the container to cause dockerd to consume an unconstrained amount of memory as it attempted to collect a whole line of output, by outputting data without newlines. To avoid that, we replace the bufio reader with our own buffering scheme that handles log lines up to 16k in length, breaking up anything longer than that into multiple chunks. If we can dispense with noting this detail properly at the end of output, we can switch from using ReadBytes() to using ReadLine() instead. We add a field ("Partial") to the log message structure to flag when we pass data to the log driver that did not end with a newline. The Line member of Message structures that we pass to log drivers is now a slice into data which can be overwritten between calls to the log driver's Log() method, so drivers which batch up Messages before processing them need to take additional care: we add a function (logger.CopyMessage()) that can be used to create a deep copy of a Message structure, and modify the awslogs driver to use it. We update the jsonfile log driver to append a "\n" to the data that it logs to disk only when the Partial flag is false (it previously did so unconditionally), to make its "logs" output correctly reproduce the data as we received it. Likewise, we modify the journald log driver to add a data field with value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is true, and update its "logs" reader to refrain from appending a "\n" to the data that it retrieves if it does not see this field/value pair (it also previously did this unconditionally). Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind) Signed-off-by: Antonio Murdaca <[email protected]>
This change updates how we handle long lines of output from the
container. The previous logic used a bufio reader to read entire lines
of output from the container through an intermediate BytesPipe, and that
allowed the container to cause dockerd to consume an unconstrained
amount of memory as it attempted to collect a whole line of output, by
outputting data without newlines.
To avoid that, we replace the bufio reader with our own buffering scheme
that handles log lines up to 16k in length, breaking up anything longer
than that into multiple chunks. If we can dispense with noting this
detail properly at the end of output, we can switch from using
ReadBytes() to using ReadLine() instead. We add a field ("Partial") to
the log message structure to flag when we pass data to the log driver
that did not end with a newline.
The Line member of Message structures that we pass to log drivers is now
a slice into data which can be overwritten between calls to the log
driver's Log() method, so drivers which batch up Messages before
processing them need to take additional care: we add a function
(logger.CopyMessage()) that can be used to create a deep copy of a
Message structure, and modify the awslogs driver to use it.
We update the jsonfile log driver to append a "\n" to the data that it
logs to disk only when the Partial flag is false (it previously did so
unconditionally), to make its "logs" output correctly reproduce the data
as we received it.
Likewise, we modify the journald log driver to add a data field with
value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is
true, and update its "logs" reader to refrain from appending a "\n" to
the data that it retrieves if it does not see this field/value pair (it
also previously did this unconditionally).
fix http://code.huawei.com/docker/docker/issues/324
upsteam issue: moby#18057
fix DTS2017062307255
cherry-pick from: moby#22982
conflicts:
daemon/logger/copier.go
daemon/logger/journald/read.go
daemon/logger/logger.go
Signed-off-by: Nalin Dahyabhai <[email protected]> (github: nalind)
Signed-off-by: Lei Jitang <[email protected]>
Add a benchmark for measuring how the logger.Copier implementation handles logged lines of sizes ranging up from 64 bytes to 256KB. cherry-pick from: moby#22982 conflicts: daemon/logger/copier_test.go Signed-off-by: Nalin Dahyabhai <[email protected]> Signed-off-by: Lei Jitang <[email protected]>
- What I did
Worked up a proposal for addressing the memory usage in #18057.
- How I did it
Switch from using a reader from the bufio package to doing the buffering ourselves, and tracking whether or not we read the end of a line. That information is now passed to log drivers so that they can do the right thing when they're sending entries to clients that call a container's logs handler.
- How to verify it
Run
docker run -d busybox /bin/sh -c 'while true ; do echo -n "aaasd"; done'and observe usage either withtopusing memory sorting, or with DEBUG set,dockerdlistening on a TCP port, andwatch go tool pprof --text /usr/bin/docker-current http://$hostname:$port/debug/pprof/heap- Description for the changelog
Improve logging of long log lines