[JMX Insight] Hadoop jmx metics semconv alignment#14411
[JMX Insight] Hadoop jmx metics semconv alignment#14411trask merged 19 commits intoopen-telemetry:mainfrom
Conversation
|
|
||
| | Metric Name | Type | Attributes | Description | | ||
| |---------------------------------|---------------|-------------------------------------|--------------------------------------------------------| | ||
| | hadoop.dfs.capacity | UpDownCounter | hadoop.node.name | Current raw capacity of data nodes. | |
There was a problem hiding this comment.
[for reviewer] Naming prefix of metrics has been change to utilize a metric context (dfs).
Context is described in official docs: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Metrics.html
|
🔧 The result from spotlessApply was committed to the PR branch. |
…ent' into hadoop-jmx-metics-semconv-alignment
…unday/opentelemetry-java-instrumentation into hadoop-jmx-metics-semconv-alignment
|
|
||
| | Metric Name | Type | Attributes | Description | | ||
| |---------------------------------|---------------|-------------------------------------|--------------------------------------------------------| | ||
| | hadoop.dfs.capacity | UpDownCounter | hadoop.node.name | Current raw capacity of data nodes. | |
There was a problem hiding this comment.
The capacity is in bytes, so maybe we should start adding a dedicated column for the units (then we can do the same for other metrics as follow-up).
There was a problem hiding this comment.
also not sure if we should have a metric that is a prefix of another, maybe capacity.raw could work here to replicate the wording in the docs.
There was a problem hiding this comment.
I did not find any official recommendation regarding metric name being prefix of another metric, and in my opinion in this case hadoop.dfs.capacity.used is indeed a part of total hadoop.dfs.capacity.
I'd take it to SIG meeting, if this is confusing for others as well then I'll change hadoop.dfs.capacity to hadoop.dfs.capacity.raw.
There was a problem hiding this comment.
On SIG meeting we talked about that. Indeed we should avoid using metric name as a prefix for another metric.
Also we decided to add .limit suffix, which is aligned with semconv recommendation.
| # hadoop.dfs.data_node.live | ||
| NumLiveDataNodes: | ||
| metric: data_node.live | ||
| type: updowncounter | ||
| unit: "{node}" | ||
| desc: Number of data nodes which are currently live. | ||
| # hadoop.dfs.data_node.dead | ||
| NumDeadDataNodes: | ||
| metric: data_node.dead | ||
| type: updowncounter | ||
| unit: "{node}" | ||
| desc: Number of data nodes which are currently dead. |
There was a problem hiding this comment.
The datanode and namenode wording is quite widespread in the hadoop docs, so I think we could get rid of the _ (but this is probably my personal preference here), also those are metrics that relate to the cluster and the nodes, and not related to the dfs part, even though they are captured on the same MBean object as the others.
So I would suggest to simplify those to:
hadoop.datanode.liveinstead ofhadoop.dfs.data_node.livehadoop.datanode.deadinstead ofhadoop.dfs.data_node.dead
There was a problem hiding this comment.
Hadoop docs are quite inconsistent here. I saw DataNode and NameNode pascal case notation in many places, so I converted them to snake case. This follows this recommendation (see bullet starting with "For each multi-word dot-delimited component") and to match other metric names, like jvm.file_descriptor.count, http.client.open_connections, etc...
I agree that lowercase is easier to read, so I'd gladly change it. The question is what others think about it...
I'll get rid of dfs segment - good observation.
Co-authored-by: SylvainJuge <[email protected]>
…es/hadoop.yaml Co-authored-by: Trask Stalnaker <[email protected]>
Co-authored-by: otelbot <[email protected]> Co-authored-by: SylvainJuge <[email protected]> Co-authored-by: Trask Stalnaker <[email protected]>
Fixes #14274
Includes: