[gen-ai] Clarify invoke_agent span creation responsibility in distributed scenarios

  The current documentation at
  https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/
  provides guidance on span kind selection (CLIENT vs INTERNAL) but does not say
  who is responsible for creating the invoke_agent span when both the client and
  the server are instrumented in a distributed scenario.

  Scenario

  Consider a LangGraph agent that:
  1. Runs locally and emits invoke_agent spans with INTERNAL kind (as per the
  spec)
  2. Is now exposed over HTTP as a remote service

  When a remote client calls this agent service:

```mermaid
      sequenceDiagram
          participant Client as Remote Client
          participant Service as Agent Service (LangGraph)
      
          Client->>Service: HTTP request
          activate Service
          Note right of Service: invoke_agent (INTERNAL)
          Note right of Service: └── chat spans
          Note right of Service: └── execute_tool spans
          Service-->>Client: HTTP response
          deactivate Service
```

  Questions

  1. Should the remote client also create an invoke_agent span with CLIENT kind?
  This would result in two invoke_agent spans for the same logical operation.
  2. If yes, how should these spans relate? Should trace context propagation
  make the server-side span a child of the client-side span?
  3. If no, who is the canonical owner of the invoke_agent span? The client-side
  SDK or the server-side agent framework?
  4. Should there be a SERVER span kind for invoke_agent? The discussion on PR
  open-telemetry/semantic-conventions#2881 mentioned a SERVER kind for
  "instrumenting the agent service itself", but this doesn't appear in the
  current spec.
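
  To make questions 1 and 2 concrete, here is a minimal sketch of the two-span
  shape they describe. It is a toy model using only the Python standard library,
  not the OpenTelemetry SDK: the Span dataclass and the helpers start_span,
  to_traceparent and from_traceparent are hypothetical names introduced for
  illustration. It assumes the client propagates context in a W3C
  traceparent-style header, leaves out the HTTP client/server spans that real
  HTTP instrumentation would normally add in between, and abbreviates span names
  to the operation name.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Toy span record (hypothetical; NOT the OpenTelemetry SDK type)."""
    name: str
    kind: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str] = None


def start_span(name: str, kind: str, parent: Optional[Span] = None) -> Span:
    # A child reuses the parent's trace id; a root span starts a new trace.
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return Span(name, kind, trace_id, secrets.token_hex(8), parent_id)


def to_traceparent(span: Span) -> str:
    # version - trace id - parent span id - flags (01 = sampled)
    return f"00-{span.trace_id}-{span.span_id}-01"


def from_traceparent(header: str) -> Span:
    _version, trace_id, span_id, _flags = header.split("-")
    return Span("<remote parent>", "REMOTE", trace_id, span_id)


# Remote client: creates its own invoke_agent span (question 1).
client_span = start_span("invoke_agent", "CLIENT")
headers = {"traceparent": to_traceparent(client_span)}

# Agent service: keeps emitting the INTERNAL span it emitted when running
# locally, now parented on the extracted remote context (question 2).
remote_parent = from_traceparent(headers["traceparent"])
server_span = start_span("invoke_agent", "INTERNAL", parent=remote_parent)

for span in (client_span, server_span):
    print(span.kind, span.name, "parent:", span.parent_span_id)
```

  In this shape a single trace contains two invoke_agent spans for one logical
  operation, which is exactly the duplication the questions ask about. The
  sketch only illustrates the option; it does not suggest that it is the
  intended answer.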

  Current Spec Gaps

  The spec states:
  "Span kind SHOULD be CLIENT and MAY be set to INTERNAL on spans representing
  invocation of agents running in the same process."

  "It's RECOMMENDED to use CLIENT kind when the agent being instrumented usually
  runs in a different process than its caller or when the agent invocation
  happens over instrumented protocol such as HTTP."

  This guidance helps with span kind selection but doesn't address:
  - Coordination between client-side and server-side instrumentation
  - Whether duplicate spans are expected/acceptable
  - Which party is responsible for creating the span in distributed deployments
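
  For reference, the quoted guidance can be read as the small decision rule
  below. This is one possible reading, not spec text: the function name
  choose_invoke_agent_kind and its two boolean parameters are hypothetical, and
  since INTERNAL is only a MAY, returning CLIENT for the in-process case would
  also conform.

```python
from enum import Enum


class SpanKind(Enum):
    """Local stand-in for the two kinds the quoted text mentions."""
    CLIENT = "CLIENT"
    INTERNAL = "INTERNAL"


def choose_invoke_agent_kind(same_process: bool,
                             instrumented_protocol: bool) -> SpanKind:
    # MAY be INTERNAL: the agent runs in the caller's process and the call
    # does not go over an instrumented protocol such as HTTP.
    if same_process and not instrumented_protocol:
        return SpanKind.INTERNAL
    # SHOULD be CLIENT in every other case.
    return SpanKind.CLIENT


print(choose_invoke_agent_kind(same_process=True, instrumented_protocol=False))
print(choose_invoke_agent_kind(same_process=False, instrumented_protocol=True))
```

  Both inputs describe the situation from a single instrumentation point. The
  rule has no way to express whether the other side of the call has already
  emitted an invoke_agent span, which is the coordination gap listed above.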

  Related Issues

  - open-telemetry/semantic-conventions#1315 - Allow INTERNAL GenAI/db spans instead of requiring the kind to be
  CLIENT
  - open-telemetry/semantic-conventions#2881 - Expand invoke_agent span documentation beyond remote agents
  - open-telemetry/semantic-conventions-genai#35 - Semantic Conventions for Generative AI Agentic Systems


[gen-ai] Clarify invoke_agent span creation responsibility in distributed scenarios #3334
