It seems that in #258, adding links after Span creation was removed because of complexities around samplers. While I appreciate the complexity, I want to highlight a scenario where this feature is needed and suggest how it could be accommodated while partially addressing the sampling decision issue.
Example Scenario
Consider a system that waits for several events and processes them as a batch, for example for a subscription in a real-time system. The events are stored in a Queue, and each event has its own injected SpanContext. They are then consumed in a single flush() operation.
First event arrives
Queue=[E1(SpanContext1)] and a flush() is scheduled for the Subscription
Several events arrive in the meantime
Queue=[E1(SpanContext1),E2(SpanContext2),E3(SpanContext3)]
flush() is invoked
A Span gets created for the flushing of the events. In this case, I consider that the first event triggered the flush, so it makes sense for it to be the parent. So I might do something like:
message = queue.pop();
span = tracer.CreateSpan(name="flush", parent=message.span_context);
do {
    do_something(message);
    span.AddLink(message.span_context); // This might not be needed for the first message, since it is also the parent
    if ( /* early escape condition */ ) { // Let's say downstream is full
        if (!queue.empty()) {
            enqueueNextFlush(); // still messages to flush
        }
        break;
    }
} while ( message = queue.pop() );
Let's say the escape condition was triggered after the second message... I expect the following Spans:
Span1(name="flush", parent="SpanContext1", links=["SpanContext1", "SpanContext2"])
Span2(name="flush", parent="SpanContext3", links=["SpanContext3"])
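The scenario above can be simulated end to end with a small sketch. Everything here is hypothetical and only for illustration: the `Span` class, its `add_link` method, the `flush` function and the `downstream_capacity` parameter (standing in for the early escape condition) are assumptions, not an existing tracing API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for a tracer span; not a real tracing API.
    name: str
    parent: str
    links: list = field(default_factory=list)

    def add_link(self, span_context: str) -> None:
        # Late link: attached after the span already exists.
        self.links.append(span_context)


def flush(queue: deque, downstream_capacity: int) -> Span:
    """Process queued (event, span_context) pairs until downstream is full."""
    event, ctx = queue.popleft()
    span = Span(name="flush", parent=ctx)  # the first event is the parent
    processed = 0
    while True:
        span.add_link(ctx)  # link every event that was actually processed
        processed += 1
        if processed >= downstream_capacity or not queue:
            break  # early escape; leftovers wait for the next flush
        event, ctx = queue.popleft()
    return span


queue = deque([
    ("E1", "SpanContext1"),
    ("E2", "SpanContext2"),
    ("E3", "SpanContext3"),
])
spans = []
while queue:  # each iteration models one scheduled flush()
    spans.append(flush(queue, downstream_capacity=2))
```

With a capacity of two, the first flush links SpanContext1 and SpanContext2 and the second flush picks up SpanContext3, matching the spans listed above. The links can only be known while the loop runs, which is why they cannot all be supplied at span creation.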
Additional Notes:
In this case, let's say that our events (E1, E2) are "compressed/conflated" and it no longer makes sense to track them independently.
Also, it is not desirable for the queue to be consumed ahead of processing (and, for that matter, the escape condition may not be known until processing).
Possible Solutions for the sampling decision problem
Consider only the parent as guaranteed at creation.
The parent is already necessary for traceId propagation, and inheriting the default sampling decision purely from the parent would normally make sense. We can recommend, by convention, that sampling decisions consider only the parent, although nothing will prevent implementations from using links if they are available in the API.
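As a rough illustration of that convention, here is a minimal sketch of a sampler that accepts links but deliberately ignores them. The function name `should_sample`, its parameters and the ratio-based fallback for root spans are all assumptions made for this example, not an existing sampler interface.

```python
from typing import Optional, Sequence


def should_sample(parent_sampled: Optional[bool], trace_id: int,
                  links: Sequence[str] = (), ratio: float = 0.5) -> bool:
    """Parent-only sampling: links are accepted but deliberately ignored."""
    if parent_sampled is not None:
        return parent_sampled  # inherit the parent's decision
    # Root span (no parent): deterministic ratio decision from the trace id.
    return (trace_id % 10_000) < ratio * 10_000
```

Because the result never depends on `links`, links added after creation cannot invalidate a decision that has already been made.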
Allow late links with some limitations.
The limitations can be clearly documented, such as the inability to consider them for sampling decisions and any future use case that is affected by this behaviour.
Consider a mutable sampling decision
This suggestion goes well beyond the scope of this feature, but it's worth considering its impact on this requirement.
Consider giving tracers the ability to "reconsider" the sampling decision in response to Span events (in this case, any change to the span, e.g. when operation_name, attributes, events, or links are changed). This ultimately creates further complexities with propagation, which may force constraints such as changes to sampling properties only taking effect from that point onwards. I'm happy to explore this further in a separate issue.
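To make the "from that point onwards" constraint concrete, here is a minimal sketch under stated assumptions. The `Span` class, its `inject` and `add_link` methods, and the `sample_if_linked_to_debug` policy are hypothetical names invented for this example; none of them exist in a real API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Span:
    # Hypothetical span with a mutable sampling flag; not a real tracing API.
    name: str
    sampled: bool
    resample: Callable[["Span"], bool]
    links: List[str] = field(default_factory=list)
    propagated: List[bool] = field(default_factory=list)

    def inject(self) -> bool:
        # Record the decision handed to a downstream call at this moment.
        self.propagated.append(self.sampled)
        return self.sampled

    def add_link(self, span_context: str) -> None:
        self.links.append(span_context)
        # Any change to the span lets the tracer reconsider the decision.
        self.sampled = self.resample(self)


def sample_if_linked_to_debug(span: Span) -> bool:
    # Hypothetical policy: keep the span once it links to a "debug:" context.
    return span.sampled or any(ctx.startswith("debug:") for ctx in span.links)


span = Span(name="flush", sampled=False, resample=sample_if_linked_to_debug)
first = span.inject()   # propagated before the link was added
span.add_link("debug:SpanContext2")
second = span.inject()  # propagated after the decision was reconsidered
```

The first downstream call has already seen the old decision and is not revisited. Only calls made after the link was added observe the new one, which is exactly the propagation complexity mentioned above.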