Add container metrics fields from ECS#87
Add container metrics fields from ECS#87ChrsMark wants to merge 2 commits intoopen-telemetry:mainfrom
Conversation
8a10c6f to
3dd54a3
Compare
995ab45 to
1076e36
Compare
Signed-off-by: ChrsMark <[email protected]>
1076e36 to
d781478
Compare
Signed-off-by: ChrsMark <[email protected]>
|
|
||
| ## HTTP Server | ||
|
|
||
| ### Metric: `metric.container.cpu.usage` |
There was a problem hiding this comment.
The name should just be container.cpu.usage
The metric. classifier is for our YAML database, not the metric name. (Sorry for confusion)
Same comment for all other metrics.
| groups: | ||
| - id: metric.container.cpu.usage | ||
| type: metric | ||
| metric_name: container.cpu.usage |
There was a problem hiding this comment.
is this needed separate from system.cpu.*?
(there was some related discussion in open-telemetry/opentelemetry-specification#2388)
There was a problem hiding this comment.
Thank's @trask! That's interesting one and it's quite similar to open-telemetry/opentelemetry-specification#2388 (comment).
I will try to share my view on this :).
So, from infrastructure's observability point of view, container metrics would be collected from outside the containers themselves (as a best practice).
For example by using the Docker runtime's APIs or Kubelet's API's.
So the point here is that it does not make a lot of sense to treat a container as a host and collect its metrics by running a collector inside the container.
So for metrics collected through the runtime/orchastrator APIs I think we need to be specific and have a resource specific namespace like container.cpu, pod.cpu etc.
In many cases the CPU/Memory resources are limited as well:
- https://docs.docker.com/config/containers/resource_constraints/#cpu
- https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
In very specific usecases, if someone wants to treat a container as a host (kind uses containers' as hosts) then the collector should be running on the container/host directly and report metrics under the system.* namespace. In that case the resource is a "host" not a container from the observation/collection point of view.
Also related to #226 (comment).
| on all network interfaces | ||
| by the container since | ||
| the last metric collection. | ||
| instrument: gauge |
There was a problem hiding this comment.
Should be a counter instead (since values are additive)?
| on all network interfaces | ||
| by the container since | ||
| the last metric collection. | ||
| instrument: gauge |
| The total number of bytes written | ||
| successfully (aggregated from all disks) | ||
| since the last metric collection | ||
| instrument: gauge |
There was a problem hiding this comment.
| The total number of bytes read | ||
| successfully (aggregated from all disks) | ||
| since the last metric collection. | ||
| instrument: gauge |
|
Closing in honor of #282. After the project restructuring it was hard to rebase and recover this branch due to some weird permission errors. Starting fresh was faster. |
This PR adds container related metrics fields as part of #72.
cc: @AlexanderWert @kaiyan-sheng @mlunadia