Area(s)
area:new
Is your change request related to a problem? Please describe.
Some attributes may be very large, such as:
- "http.request.body.content"
- "http.response.body.content"
- "gen_ai.prompt"
- "gen_ai.completion"
- "exception.stacktrace"
In order to provide a very tight SLO and consistent performance, an operations backend might impose strict limits on the size of payloads/content/attributes. Even so, it may still be desirable to index most of the properties/attributes while cross-linking to an alternative storage solution (with a looser SLO) for the full, large data.
In the OTel Collector Contrib repository, there is an open issue for a new connector component ("New component: Blob Attribute Uploader Connector") which attempts to resolve the above issue by proposing a connector that will, in the instrumentation/collector, do the following:
- Upload the full attribute data to a blob storage system (such as Google Cloud Storage, Amazon S3, Azure Blob Storage, or -- in the future -- any other blob storage backend supported by the CDK or that chooses to contribute to the connector).
- Remove the original attribute that would otherwise be too large.
- Inject attribute(s) that contain references to where the data was uploaded.
It is with respect to this last step that we need an established standard / data model for how to represent the uploaded data. The "SetInMap" function in the "foreign attribute" internal library of the draft connector shows the solution currently being implemented/pursued. However, I'd like more community feedback/input to ensure that there is agreement on the approach.
Describe the solution you'd like
Summary:
| Before uploading | After uploading |
| --- | --- |
| somekey | somekey.ref.uri |
| | somekey.ref.content_type [OPTIONAL] |
Formally:
A backend system that sees an attribute matching the pattern "${prefix}.ref.uri" should assume that "${prefix}.ref.uri" contains a URI reporting the location of the original, full value of "${prefix}". If a key named "${prefix}" is also present, it likely contains a truncated and/or redacted copy of the original value, with "${prefix}.ref.uri" pointing to the location where the original, unadulterated, full version of the value has been recorded. If "${prefix}.ref.uri" exists, then "${prefix}.ref.content_type" may optionally also exist, containing a MIME type describing the data stored at the URI in "${prefix}.ref.uri".
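The recognition rule above can be sketched as follows. This is an illustrative sketch of how a backend might scan a flat attribute map for the proposed pattern; "find_references" and the returned dictionary shape are made-up names, not part of the proposal:

```python
# Hedged sketch: detect the proposed "${prefix}.ref.uri" pattern in a flat
# attribute map and gather the associated metadata for each prefix.

REF_URI_SUFFIX = ".ref.uri"
REF_CONTENT_TYPE_SUFFIX = ".ref.content_type"

def find_references(attributes: dict) -> dict:
    """Map each referenced prefix to its URI, optional MIME type, and any
    truncated/redacted inline copy that remains under the bare prefix."""
    refs = {}
    for key, value in attributes.items():
        if key.endswith(REF_URI_SUFFIX):
            prefix = key[: -len(REF_URI_SUFFIX)]
            refs[prefix] = {
                "uri": value,
                "content_type": attributes.get(prefix + REF_CONTENT_TYPE_SUFFIX),
                "inline_copy": attributes.get(prefix),  # possibly truncated/redacted
            }
    return refs
```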
Prerequisites / Interactions:
If this course of action were to be taken, we would need two changes to the "Attribute Naming" specification in OTel:
- Reserve ".ref." so that it can only be used for this purpose (".ref." to be used only for information about reference-typed values and their metadata).
- Exempt this usage from the rule that "Names SHOULD NOT coincide with namespaces".
With respect to the latter exemption, our understanding is that this rule exists to allow coalescing the flat attributes into a structured object. We believe that ".ref." fits the spirit of this rule even while violating its letter, because ".ref." does not introduce a conceptually new attribute; rather, it replaces the existing attribute with a different representation. Thus, if a backend were to implement a coalescing system to make attributes non-flat, it could combine all of the ".ref." attributes into a single "ReferenceValue" type serving as the value of the top-level attribute whose name contains no ".ref." component.
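As a sketch of that coalescing idea, here is one way the folding could work. The "ReferenceValue" shape below (a plain dictionary with "uri", "content_type", and "inline_copy" fields) is purely hypothetical:

```python
# Illustrative sketch: fold flat ".ref." attributes into a single structured
# value per top-level key, keeping any truncated inline copy alongside.

def coalesce_refs(attributes: dict) -> dict:
    out = {}
    for key, value in attributes.items():
        if ".ref." in key:
            prefix, _, field = key.partition(".ref.")
            ref = out.setdefault(prefix, {})
            if not isinstance(ref, dict):
                # A truncated inline copy was seen first; keep it alongside.
                ref = out[prefix] = {"inline_copy": ref}
            ref[field] = value
        else:
            existing = out.get(key)
            if isinstance(existing, dict):
                existing["inline_copy"] = value  # reference fields came first
            else:
                out[key] = value
    return out
```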
Describe alternatives you've considered
Use a compound data type
For example, we could use a "KeyValueList kvlist_value" to represent the reference not as several separate attributes but as a single complex attribute value. However, this would run contrary to the attribute specification, and existing OTel libraries -- following that restriction -- do not provide direct access to the underlying OTLP data model.
Introduce a new data type
We could attempt to introduce into OTLP (and into other parts of the specification) a new "ReferenceValue" data type. However, this would require significant effort across multiple languages, libraries, backend systems, etc., and thus seems like a non-starter.
Just move the large attributes to be events/logs
This assumes that the event/log backend can itself accept arbitrarily large data, which may not hold true for every backend system. There is an inherent tradeoff between reliability/latency and the amount of data that a backend can allow: to provide a very tight latency SLO that caps 99th-percentile latency, it is important to guarantee low variance in the size of requests, which in turn may limit the amount of data that an event/log backend can accept (while still providing such a tight time bound). When using such a system to index the metadata about the events and make them quickly available for searching, it would still be useful to have a client-side mechanism to route larger content that is not critical for indexing to a blob storage system better suited to large objects, and to make it easy to interlink and navigate between backend systems by referencing the storage location.
Replace/upload just the entire body of events/logs
While this might serve the purpose of "gen_ai.prompt" / "gen_ai.completion" if/when it moves from a span event to a more general event, it would still be necessary to provide a data model for representing that the event body had been uploaded/replaced with a reference. In addition, this would not cover many other cases (e.g. "http.request.body.content"), and -- beyond this -- it is desirable to upload a much more targeted/limited portion of the data in order to allow indexing / searching the other content that does fit within backend subsystem limits.
Additional context
Examples
Example 1: http.response.body.content [Span Attribute]
Before
# TracesData [Before]
resource_spans: {
resource: { … }
scope_spans: {
scope: { … }
spans: {
trace_id: …
span_id: …
…
attributes: {
key: "http.response.body.content"
value: { string_value: "{ \"results\": [ … ] }" } # very long API JSON response
}
}
…
}
}
}
After
# TracesData [After]
resource_spans: {
resource: { … }
scope_spans: {
scope: { … }
spans: {
trace_id: …
span_id: …
…
attributes: {
key: "http.response.body.content.ref.uri"
value: { string_value: "gs://some-bucket/traceAttachments/12345/7890/response.json" }
}
attributes: {
key: "http.response.body.content.ref.content_type"
value: { string_value: "application/json" }
}
}
…
}
}
}
Example 2: gen_ai.prompt / gen_ai.completion [Span Event Attribute]
Before
# TracesData [Before]
resource_spans: {
resource: { … }
scope_spans: {
scope: { … }
spans: {
trace_id: …
span_id: …
…
events: {
# Note that this precise event name is mandatory per:
# https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
name: "gen_ai.content.prompt"
# …
attributes: {
key: "gen_ai.prompt"
value: { string_value: "Imagine that there is a very long text prompt here…." }
}
}
}
events: {
# Note that this precise event name is mandatory per:
# https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
name: "gen_ai.content.completion"
attributes: {
key: "gen_ai.completion"
value: { string_value: "{ \"completions\": [...]}" } # very long completion JSON
}
}
}
…
}
}
}
After
# TracesData [After]
resource_spans: {
resource: { … }
scope_spans: {
scope: { … }
spans: {
trace_id: …
span_id: …
…
events: {
# Note that this precise event name is mandatory per:
# https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
name: "gen_ai.content.prompt"
# …
attributes: {
key: "gen_ai.prompt.ref.uri"
value: { string_value: "s3://bucket/some/path/to/the/prompt.txt" }
}
attributes: {
key: "gen_ai.prompt.ref.content_type"
value: { string_value: "text/plain" }
}
}
}
events: {
# Note that this precise event name is mandatory per:
# https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
name: "gen_ai.content.completion"
attributes: {
key: "gen_ai.completion.ref.uri"
value: { string_value: "azblob://account/container/path/to/response.json" }
}
attributes: {
key: "gen_ai.completion.ref.content_type"
value: { string_value: "application/json" }
}
}
}
…
}
}
}