Please (re)-allow recording links after Span creation time #454

@eyjohn

It seems that in #258, adding links after Span creation was removed because of complexities around samplers. While I appreciate the complexity, I want to highlight a scenario where this feature is needed and suggest how it could be accommodated while partially addressing the sampling decision issue.

Example Scenario

Consider a system that waits for several events and processes them as a batch, for example a subscription in a real-time system. The events are stored in a Queue and each event has its OWN injected SpanContext. They are then consumed in a single flush() operation.

First event arrives

Queue=[E1(SpanContext1)] and a flush() is scheduled for the Subscription

Several events arrive in the meantime

Queue=[E1(SpanContext1),E2(SpanContext2),E3(SpanContext3)]

flush() is invoked

A Span gets created for the flushing of the events. In this case, I consider the first event to have triggered the flush, so it makes sense for it to be the parent. So I might do something like:

message = queue.pop();
// The first event triggered the flush, so its context becomes the parent.
span = tracer.CreateSpan(name="flush", parent=message.span_context)
do {
  do_something(message);
  span.AddLink(message.span_context); // May be redundant for the first message, since it is also the parent
  if ( /* early escape condition */ ) { // Let's say downstream is full
    if (!queue.empty()) {
      enqueueNextFlush(); // still messages to flush
    }
    break;
  }
} while ( message = queue.pop() );

Let's say the escape condition was triggered after the second message... I expect the following Spans:
Span1(name="flush", parent="SpanContext1", links=["SpanContext1", "SpanContext2"])
Span2(name="flush", parent="SpanContext3", links=["SpanContext3"])
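
The scenario above can be simulated with a small, self-contained Python sketch. Everything here is hypothetical and only illustrates the shape of the problem: `Span`, `add_link`, `flush`, and `downstream_capacity` are stand-ins invented for this example and are not part of any real tracing API. The early escape condition is modelled, by assumption, as "downstream accepts at most two events per flush".

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical stand-in for a tracer's Span; not a real tracing API.
    name: str
    parent: str
    links: list = field(default_factory=list)

    def add_link(self, span_context):
        # The "late link" operation this issue asks to restore.
        self.links.append(span_context)


def flush(queue, downstream_capacity):
    """Process queued (event, span_context) pairs until downstream is full."""
    event, ctx = queue.popleft()
    span = Span(name="flush", parent=ctx)  # the first event is the parent
    processed = []
    while True:
        processed.append(event)  # stands in for do_something(message)
        span.add_link(ctx)       # link only known once the event is processed
        if len(processed) >= downstream_capacity or not queue:
            break                # early escape, or nothing left to flush
        event, ctx = queue.popleft()
    return span


queue = deque([
    ("E1", "SpanContext1"),
    ("E2", "SpanContext2"),
    ("E3", "SpanContext3"),
])

spans = []
while queue:  # stands in for enqueueNextFlush()
    spans.append(flush(queue, downstream_capacity=2))
```

With a capacity of two, `spans` holds one span parented on `SpanContext1` with links to `SpanContext1` and `SpanContext2`, and a second span parented on and linked to `SpanContext3`. The queue is never read ahead of processing, so the links of each span are only complete when its loop exits.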

Additional Notes:

In this case, let's say that our events (E1, E2) are "compressed/conflated" and it no longer makes sense to track them independently.

Also, the queue should not be consumed ahead of processing (and, for that matter, the escape condition may not be known until processing happens), so the full set of links cannot be known at Span creation time.

Possible Solutions for the sampling decision problem

Consider only the parent as guaranteed at creation.

The parent is already necessary for traceId propagation, and inheriting the default sampling decision purely from the parent would normally make sense. We can, by convention, recommend that sampling decisions consider only the parent, although nothing will prevent implementations from using links if they are available in the API.
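
A minimal sketch of what a parent-only sampling convention could look like, in Python. The `SpanContext` type, the `parent_only_should_sample` function, and the `root_default` parameter are assumptions made up for this illustration; they do not correspond to an existing sampler interface.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SpanContext:
    # Hypothetical minimal context: an id plus the propagated sampled flag.
    trace_id: str
    sampled: bool


def parent_only_should_sample(
    parent: Optional[SpanContext],
    links: Sequence[SpanContext] = (),
    root_default: bool = True,
) -> bool:
    """Decide from the parent alone; links are accepted but deliberately ignored."""
    if parent is None:
        return root_default  # root span: fall back to an assumed default
    return parent.sampled    # otherwise inherit the parent's decision


decision = parent_only_should_sample(
    parent=SpanContext("trace-1", sampled=True),
    links=[SpanContext("trace-2", sampled=False)],
)
```

Because the links argument never influences the result, the decision is the same whether links arrive at creation time or later.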

Allow late links with some limitations.

The limitations can be clearly documented, such as the inability to consider late links in sampling decisions, along with any future use case that is affected by this behaviour.
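
One way to picture this limitation is the Python sketch below, where the sampler runs once at creation and late links are recorded without being shown to it. The `Tracer`, `Span`, `create_span`, `add_link`, and `late_links` names are hypothetical and chosen for this example only.

```python
class Span:
    """Hypothetical span whose sampling decision is frozen at creation."""

    def __init__(self, name, parent, links, sampled):
        self.name = name
        self.parent = parent
        self.links = list(links)  # links known at creation time
        self.late_links = []      # links added afterwards
        self.sampled = sampled    # never revisited

    def add_link(self, span_context):
        # Documented limitation: recorded on the span, invisible to the sampler.
        self.late_links.append(span_context)


class Tracer:
    def __init__(self, sampler):
        self._sampler = sampler

    def create_span(self, name, parent, links=()):
        # The sampler runs exactly once, with creation-time inputs only.
        sampled = self._sampler(parent, tuple(links))
        return Span(name, parent, links, sampled)


seen_by_sampler = []


def recording_sampler(parent, links):
    # Assumed sampler for illustration: remembers what it was shown.
    seen_by_sampler.append(links)
    return True


tracer = Tracer(recording_sampler)
span = tracer.create_span("flush", parent="SpanContext1")
span.add_link("SpanContext2")
```

Here the sampler is called a single time with an empty set of links, and `SpanContext2` ends up in `late_links` without changing `span.sampled`. Keeping late links in a separate list is a design choice of this sketch, made so the limitation is visible in the data.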

Consider mutable sampling decision

This suggestion goes way beyond the scope of this feature, but it's worth considering its impact on this requirement.

Consider the ability for tracers to "reconsider" the sampling decision in response to Span changes (in this case, any change to the span, e.g. when operation_name, attributes, events, or links are changed). This ultimately creates further complexities with propagation, which may impose constraints such as changes to sampling properties only taking effect from that point onwards. I'm happy to explore this further in a separate issue.
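
A rough Python sketch of a mutable sampling decision, under stated assumptions: the `Span` class, its `inject` method, and the `sample_if_any_link_sampled` policy are all invented for this illustration, and propagation is reduced to taking a snapshot dictionary.

```python
class Span:
    """Hypothetical span that re-runs its sampler whenever it is mutated."""

    def __init__(self, name, sampler):
        self.name = name
        self.attributes = {}
        self.links = []  # (span_context_id, sampled) pairs
        self._sampler = sampler
        self.sampled = sampler(self)

    def _reconsider(self):
        self.sampled = self._sampler(self)

    def set_attribute(self, key, value):
        self.attributes[key] = value
        self._reconsider()

    def add_link(self, context_id, sampled):
        self.links.append((context_id, sampled))
        self._reconsider()

    def inject(self):
        # Snapshot for propagation: carries the decision as of this moment.
        return {"span": self.name, "sampled": self.sampled}


def sample_if_any_link_sampled(span):
    # Assumed policy, for illustration only.
    return any(sampled for _, sampled in span.links)


span = Span("flush", sample_if_any_link_sampled)
before = span.inject()               # propagated before the late link
span.add_link("SpanContext2", True)  # mutation triggers a new decision
after = span.inject()                # propagated after the late link
```

In this sketch `before` carries `sampled` as `False` while `after` carries `True`: anything propagated before the late link keeps the old decision, which is the "only takes effect from that point onwards" constraint described above.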

Labels

area:api (Cross language API specification issue)
area:sampling (Related to trace sampling)
area:span-relationships (Related to span relationships)
release:after-ga (Not required before GA release, and not going to work on before GA)
spec:trace (Related to the specification/trace directory)

Projects

Status

V1 - Stable Semantics
