I'd like to propose the following:
- Move to a single representation of "name" for signals.
  - Here group.name means "metric name", "span type", "event name" and "entity type".
  - This name forms a namespace per signal, so the entity with name host means the same thing everywhere.
  - This name is something we'll make sure is transmitted via OTLP and enforceable in live-check.
- Clarify what "id" means for groups.
  - group.id denotes an identifier for understanding a definition in weaver/semconv YAML files.
  - The id forms a namespace per group definition.
  - We expect code and documentation generation to interact at this level in weaver.
    - e.g. a java.apache.http.client.duration group id would refer to a metric of name http.client.duration. This may be a specialized instance of http.client.duration used to codegen for the Java Apache HTTP client library, and can be referenced when generating documentation for that library.
Background
The difference between id and actual identity of a signal has been a sore spot in semantic conventions for some time.
For example, today all semantic convention compatibility policies are enforced at the name level. This means there is no enforcement for spans today: an attribute may be dropped with no automated check. Additionally, we have issues in weaver with resolving registries and enforcing "extension" group ids.
We actually already have this issue with attributes. Today, only registries that follow the semantic conventions' usage of registry.* attribute groups can use weaver diff.
What's the meaning of id vs. name
We should view name and id the same way we see a type and an instance (or term) in a programming language. For example, in String x we would have String be the name and x be the id. The id denotes a string in a better-understood context and may have more limitations than the general String.
Within weaver, today, we do not define "type" and "instance" separately. Instead, e.g. with attributes, we've promoted a special usage in Semantic Conventions where we define "registry" groups and use ref to refine an attribute within a specific context. Additionally, weaver allows extends on a group, either to refine a definition or to re-use a set of attributes.
Refinement
A key aspect of this proposal is that we understand when a group or attribute is the "root source" of the definition vs. when it is a refinement. I propose adding refinement tracking in weaver with the following rules:
- An attribute that is a ref is a refinement of what it refers to.
- A group that extends another group with the same type is a refinement of what it extends.
  - e.g. An attribute group shared.attributes that is extended by my.span group would NOT be the source of truth for my.span, but a java.apache.http.client.duration metric group that extends metric.http.client.duration metric group WOULD be a refinement.
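The group rule above can be sketched as follows. This is a minimal illustration, not weaver's implementation: the Group class, its field names, and the is_refinement and root_source helpers are all hypothetical, and the sketch assumes extends chains contain no cycles.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Group:
    """Hypothetical, simplified stand-in for a semconv group definition."""
    id: str                         # unique per group definition
    type: str                       # e.g. "metric", "span", "attribute_group"
    name: Optional[str] = None      # signal name; None for plain attribute groups
    extends: Optional[str] = None   # id of the extended group, if any


def is_refinement(group: Group, registry: Dict[str, Group]) -> bool:
    """A group refines its parent only when it extends a group of the same type."""
    if group.extends is None:
        return False
    return registry[group.extends].type == group.type


def root_source(group: Group, registry: Dict[str, Group]) -> Group:
    """Follow same-type extends links up to the root definition."""
    current = group
    while is_refinement(current, registry):
        current = registry[current.extends]
    return current


groups = [
    Group("shared.attributes", "attribute_group"),
    Group("my.span", "span", name="my.span", extends="shared.attributes"),
    Group("metric.http.client.duration", "metric", name="http.client.duration"),
    Group("java.apache.http.client.duration", "metric",
          name="http.client.duration", extends="metric.http.client.duration"),
]
registry = {g.id: g for g in groups}

my_span = registry["my.span"]
java_metric = registry["java.apache.http.client.duration"]
```

Here my.span only re-uses attributes from shared.attributes, so it stays its own root source, while the Java metric group resolves back to metric.http.client.duration.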
Implications for weaver resolve
- We can create signal registries for metric, events, etc. if desired.
- We should allow a shared name between groups IFF one group refines the other.
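To make the shared-name rule concrete, here is a rough sketch of the check a resolver could run. The Group class, refines, and name_conflicts are hypothetical names for illustration, the other.http.client.duration group is invented for the example, and the sketch assumes names are namespaced per signal type and that extends chains have no cycles.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Group:
    """Hypothetical, simplified stand-in for a signal group definition."""
    id: str
    type: str
    name: str
    extends: Optional[str] = None


def refines(child: Group, ancestor: Group, by_id: Dict[str, Group]) -> bool:
    """True if child reaches ancestor through same-type extends links."""
    current = child
    while current.extends is not None:
        parent = by_id.get(current.extends)
        if parent is None or parent.type != current.type:
            return False
        if parent.id == ancestor.id:
            return True
        current = parent
    return False


def name_conflicts(groups: List[Group]) -> List[Tuple[str, str]]:
    """Pairs of same-type groups that share a name without a refinement link."""
    by_id = {g.id: g for g in groups}
    conflicts = []
    for a, b in combinations(groups, 2):
        if a.type == b.type and a.name == b.name:
            if not (refines(a, b, by_id) or refines(b, a, by_id)):
                conflicts.append((a.id, b.id))
    return conflicts


root = Group("metric.http.client.duration", "metric", "http.client.duration")
java = Group("java.apache.http.client.duration", "metric",
             "http.client.duration", extends=root.id)
# Hypothetical group that re-declares the same metric name without extending the root.
other = Group("other.http.client.duration", "metric", "http.client.duration")

conflicts = name_conflicts([root, java, other])
```

Read literally, the rule also flags two sibling refinements of the same root, since neither refines the other. Whether siblings should be allowed is a detail the check would need to settle.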
Implications for weaver diff
- Weaver diff MUST operate at three levels:
  - "global" Attribute differences (which already use 'name')
  - "global" Signal differences (which will use 'name' going forward)
  - Group differences (A new output, to be designed).
- We will use refinement in diff.
  - Attribute differences only work on "source", not refinements.
  - Signal differences only work on "source", not refinements.
  - We will likely need to provide diff for refinements between versions. This is akin to the apply_to_metrics config in existing Telemetry Schema. I think this should be follow-on work.
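As a rough illustration of signal differences that only consider "source" definitions, the sketch below keys each version of a registry by (type, name) and skips refinements. Everything here is hypothetical (the Group class, sources, signal_diff, the output shape, and the span group names); it is not the weaver diff format, and it does not model attribute inheritance through extends.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Group:
    """Hypothetical, simplified stand-in for a signal group definition."""
    id: str
    type: str
    name: str
    attributes: frozenset = frozenset()
    extends: Optional[str] = None


def sources(groups: List[Group]) -> Dict[Tuple[str, str], Group]:
    """Keep only root definitions, keyed by (type, name); skip refinements."""
    by_id = {g.id: g for g in groups}
    result = {}
    for g in groups:
        parent = by_id.get(g.extends) if g.extends else None
        if parent is not None and parent.type == g.type:
            continue  # same-type extends: a refinement, not a source
        result[(g.type, g.name)] = g
    return result


def signal_diff(old: List[Group], new: List[Group]) -> dict:
    """Compare two registry versions at the signal (name) level."""
    old_src, new_src = sources(old), sources(new)
    diff = {
        "added": sorted(new_src.keys() - old_src.keys()),
        "removed": sorted(old_src.keys() - new_src.keys()),
        "attributes_removed": {},
    }
    for key in old_src.keys() & new_src.keys():
        dropped = old_src[key].attributes - new_src[key].attributes
        if dropped:
            diff["attributes_removed"][key] = sorted(dropped)
    return diff


v1 = [
    Group("span.http.client", "span", "http.client",
          frozenset({"http.request.method", "server.address"})),
]
v2 = [
    Group("span.http.client", "span", "http.client",
          frozenset({"http.request.method"})),
    # A refinement added in v2; it is ignored by the signal-level diff.
    Group("java.apache.span.http.client", "span", "http.client",
          frozenset(), extends="span.http.client"),
]

result = signal_diff(v1, v2)
```

With a name on span groups, the dropped server.address attribute shows up in result, which is the kind of change that goes unchecked for spans today.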
Implications for live-check
- Live check can only enforce at the signal (name) level.
- It will need the same capability to understand the "source" of a signal vs. a specialized instance (e.g. the raw rules of http.client.duration vs. the Java Apache HTTP client specific http.client.duration).
Implications for emit
Emit should continue to emit all groups independently. The specialized instances are important to demonstrate downstream.