Description
/kind feature
We assert that a log file on disk (via CRI-O, or docker's json-file logging driver) should have metadata associated with it that describes the UUIDs and names of the container, pod, and namespace that produced the log output, so that k8s API calls can be avoided later. Today it does not appear to include the namespace UUID or name (see `kubernetes/pkg/kubelet/kuberuntime/helpers.go`, line 220 in 6caf343: `return filepath.Join(podLogsRootDirectory, string(podUID))`).
Ideally (and arguably), a log collection system should not be required to query the Kube APIs about the log data in order to find the UUIDs of the k8s resource hierarchy associated with a container. Only optional metadata that decorates or enriches the log data would necessitate API calls; e.g., if one wanted to decorate logs with one or more labels or annotations from a pod or namespace. A log collection system should be able to disambiguate and group logs accurately without additional API calls, which might place undue burden on the API server. Since any metadata that must be fetched via API calls is optional, it can be dropped when inconvenient or added later as a post-processing step.
The "location" of a container in a single k8s cluster is fully described by four pieces of information:
- hostname
- container UUID (and name)
- pod UUID (and name)
- namespace UUID (and name)
With the above information, the origin of one or more log entries in an aggregation of logs can be determined unambiguously with respect to k8s in one cluster. Today, it appears that k8s provides 3 of the 4 pieces of data when constructing the symlink for the log file path: hostname (implicitly), container, and pod.
Can we consider adding namespace UUID and namespace name as well?
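
To make the request concrete, here is a minimal sketch of how a log directory name could carry the namespace and pod identifiers, and how a collector could read them back without an API call. This is not the kubelet's implementation: `LogOrigin`, `BuildPodLogsDirectory`, `ParsePodLogsDirectory`, the `_` separator, the field order, and the `/var/log/pods` value are all assumptions made for illustration. Only the `podLogsRootDirectory` name comes from the snippet referenced above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// podLogsRootDirectory reuses the name from the kubelet snippet referenced above.
// The value is an assumption for this sketch.
const podLogsRootDirectory = "/var/log/pods"

// LogOrigin is a hypothetical record of the identifiers a collector would
// want to recover from the log path alone.
type LogOrigin struct {
	NamespaceName string
	NamespaceUID  string
	PodName       string
	PodUID        string
}

// BuildPodLogsDirectory (hypothetical) encodes namespace and pod identifiers
// into a single directory name. "_" is used as the separator because it cannot
// appear in Kubernetes namespace names, pod names, or UIDs.
func BuildPodLogsDirectory(o LogOrigin) string {
	name := strings.Join(
		[]string{o.NamespaceName, o.NamespaceUID, o.PodName, o.PodUID}, "_")
	return filepath.Join(podLogsRootDirectory, name)
}

// ParsePodLogsDirectory (hypothetical) is the collector-side inverse: it
// recovers the identifiers from the directory name with no API server query.
func ParsePodLogsDirectory(dir string) (LogOrigin, error) {
	base := filepath.Base(dir)
	parts := strings.Split(base, "_")
	if len(parts) != 4 {
		return LogOrigin{}, fmt.Errorf("unexpected log directory name %q", base)
	}
	return LogOrigin{
		NamespaceName: parts[0],
		NamespaceUID:  parts[1],
		PodName:       parts[2],
		PodUID:        parts[3],
	}, nil
}
```

With a layout along these lines, a directory that contains only the pod UID (as today) would fail to parse, while one that carries all four fields round-trips cleanly. The container name and UUID would still come from the per-container log file or symlink name, and the hostname from the node the collector runs on.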