-
Notifications
You must be signed in to change notification settings - Fork 18.9k
Description
Component Logging Metadata
Nota bene: this is a very long document for what is essentially a very small change.
All code examples in this document are runnable by wrapping them in the main
function:
import (
"context"
"github.com/docker/swarmkit/log"
"github.com/sirupsen/logrus"
)
func main() {
ctx := context.Background()
// begin example block
// ...
// end example block
}Problem
In the Docker daemon, various subcomponents (the engine, swarmkit, and
libnetwork, for example) live in the same process, and all of their log
messages are written to the same stream. This creates a situation where
debugging via logs requires a technical understanding of the components in
order to understand the source of log messages. All logging is done through
logrus, but there is no agreed upon
logging pattern or format.
The result is a hodge-podge of logs, all originating from the same program,
with no easy way to tell them apart.
Proposal
Logrus provides a way to add metadata to logs in the form of "fields", which
are used extensively in swarmkit to annotate log sources, components, etc.
Swarmkit also uses an internal library at github.com/docker/swarmkit/log,
which provides useful helper methods for propagating logging metadata
through a call tree by embedding it into contexts. This could form the basis
of a retrofitting logging metadata.
I propose this process:
- Establish a common logging metadata tags or set of tags that are used to
indicate which sub-component a log message originates from. - Modify logging in all subcomponents to include this metadata in log
messages.
Optionally:
- Break
github.com/docker/swarmkit/logout into its own package and
repository, so that it can easily be used by other components without concern
over vendoring all of swarmkit - Encourage the adoption of the swarmkit pattern of annotating logs and
propagating log metadata by including the logger in the context
Common Logging Metadata
In swarmkit, the module metadata is used to differentiate logging messages
from the various components. The next part of this proposal would be to
establish the module field on the log messages as part of standard metadata
for every component in Moby. This means every component would, at the very
least, include the metadata module=component.
When complete, this means that the daemon log output would look less like this:
[0000]INFO Where does this even come from
[0000]DEBU Starting Manager module=node
[0000]DEBU Initiating bulk sync
And more like this:
[0000]INFO Where does this even come from module=daemon
[0000]DEBU Starting Manager module=swarmkit/node
[0000]DEBU Initiating bulk sync module=libnetwork
This is a very small change to log message output, but makes searching logs,
debugging, and understanding the interaction between the disparate, concurrent
components an easier proposition. This also enables end users to better filter
their logs using third-party distributed logging drivers.
Modifying Subcomponents
For swarmkit, this change is trivial. Our log library can be modified in
essentially 1 line to prefix all modules with swarmkit. Whether or not
subcomponents opt to use the log library is their choice. However, all
current subcomponent use logrus as thier logging library, which means
implementing this simple metadata tag is easy. To do so, the subcomponent,
instead of using the default logrus driver (which is accessed through methods
like logrus.Info, they can set a global variable with a sub-log, like L = logrus.WithField("module", "componentname"), and then find-and-replace
logrus.Function with theirLog.L.Function.
// consider a module. I'm using an anonymous struct here for its identical
// syntax (in this case) to a module with a global variable.
log := struct{
L logrus.Entry
}
// before change:
logrus.Info("Hey, be aware of this thing")
// after change:
log.L.Info("Hey, be aware of this thing")Swarmkit log library
In code, the instead of calling logrus directly, it's access through the
context. This looks something like this:
// log.G(ctx) returns a logrus.Entry and can replace logrus wherever it occurs
// in the code
log.G(ctx).Info("This is a log line!")The swarmkit approach to logging uses the WithValue function of the standard
library context to pass a logrus.Entry, a logrus logger, through a call
tree, preserving metadata. The log library is used to transparently affix this
metadata
// this creates a new subcontext
ctx := log.WithField(ctx, "metadata", "foo")
// will print something like:
// INFO[0000] This log line has metadata! metadata=foo
log.G(ctx).Info("This log line has metadata!")Now, whever this context is passed, the logger goes with it.
// the variable innerctx is used to avoid the readability issues associated with
// shadowing in closures
example := func(innerctx context.Context) {
log.G(innerctx).Info("This log has the same metadata as its parent!")
}
ctx := log.WithField(ctx, "metadata", "foo")
// Prints the following:
// [0000]INFO This log line has metadata metadata=foo
// [0000]INFO This log has the same metadata as its parent! metadata=foo
log.G(ctx).Info("This log line has metadata!")
example(ctx)The other major function the log library does is intelligently manage
"modules" of a call tree. This is useful in cases like the swarmkit manager,
where long-running routines don't necessarily have a traditional "call stack".
The log.WithModule method takes a context and a string, if not identical to
to the existing module field, adds the string, slash separated, to the module.
// logs using this context will have module=top
ctx := log.WithModule(ctx, "first")
log.G(ctx).Info("This line will have module=first")
// logs using this context will have module=first/second
ctx2 := log.WithModule(ctx, "second")
log.G(ctx2).Info("This line will have module=first/second")
// logs using this context will still have module first/second
// NOT module first/second/second
ctx3 := log.WithModule(ctx, "second")
log.G(ctx3).Info("This line will have module=first/second")Breaking this library out is trivial. It has no dependencies to the broader
swarmkit code, and if moved, can be updated with a simple module path rewrite.
Encouraging Adoption
This change wouldn't be expected in one big PR. Rather, it's acceptable for
this change to be a long-running goal. The immediate goal would be to simply
swap out the logger as described in Step 2. As components are modified and
updated, maintainers should encourage contributors to use the log library,
and to propagate logging through contexts.