feat(telemetry): add OpenTelemetry instrumentation with Aspire Dashboard support #6629
Draft
Hona wants to merge 227 commits into anomalyco:dev
…andard attribute names
…r gRPC trace export
Change experimental.openTelemetry config from boolean to union type supporting both boolean and object with enabled/endpoint fields. This allows users to configure custom OTLP endpoints for Aspire Dashboard integration while maintaining backward compatibility with boolean config.
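The boolean-or-object union described in this commit could be normalized roughly as below. This is a minimal sketch, not the PR's actual `resolveConfig()`: the names `OpenTelemetryOption`, `ResolvedTelemetryConfig`, and `resolveTelemetryConfig` are hypothetical, and three behaviours are assumptions: the env var takes precedence over a configured endpoint, an object without `enabled` counts as enabled, and a set env var enables telemetry on its own. The default endpoint is the one named in the PR description.

```typescript
// Hypothetical shape of experimental.openTelemetry: a plain boolean or an object.
type OpenTelemetryOption = boolean | { enabled?: boolean; endpoint?: string };

interface ResolvedTelemetryConfig {
  enabled: boolean;
  endpoint: string;
}

// Default OTLP endpoint stated in the PR description.
const DEFAULT_ENDPOINT = "http://localhost:4317";

function resolveTelemetryConfig(
  option: OpenTelemetryOption | undefined,
  env: Record<string, string | undefined> = {},
): ResolvedTelemetryConfig {
  // Treat an empty env var the same as an unset one.
  const envEndpoint = env["OTEL_EXPORTER_OTLP_ENDPOINT"] || undefined;

  // Fold the boolean form into the object form so one code path handles both.
  const fromConfig: { enabled?: boolean; endpoint?: string } =
    typeof option === "object" ? option : { enabled: option === true };

  return {
    // Assumption: an object without `enabled` opts in, and a set env var
    // enables telemetry even when the config does not.
    enabled: (fromConfig.enabled ?? true) || envEndpoint !== undefined,
    // Assumption: env var wins over the configured endpoint.
    endpoint: envEndpoint ?? fromConfig.endpoint ?? DEFAULT_ENDPOINT,
  };
}
```

With this shape, an existing `"openTelemetry": true` config keeps working unchanged, and only users who need a custom endpoint switch to the object form.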
…tion

Add telemetry module with:
- Config interface and resolveConfig() for endpoint resolution
- init() function with NodeSDK, LoggerProvider, trace/log exporters
- shutdown() for graceful cleanup
- withSpan() helper for span creation with error handling
- isEnabled(), getTracer(), getLogger() utility functions
- SeverityMap for log level mapping
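A `withSpan()` helper with error handling typically follows the pattern below. This is a dependency-free sketch rather than the PR's code: `SpanLike` and `TracerLike` are hypothetical stand-ins covering only the span methods used here (`recordException`, `setStatus`, `end`), and the signature is an assumption. Note that `startSpan` on its own does not make the span active for context propagation.

```typescript
// Hypothetical structural stand-ins for the span and tracer methods used below.
interface SpanLike {
  recordException(error: Error): void;
  setStatus(status: { code: number; message?: string }): void;
  end(): void;
}

interface TracerLike {
  startSpan(name: string): SpanLike;
}

// Numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const STATUS_ERROR = 2;

async function withSpan<T>(
  tracer: TracerLike,
  name: string,
  fn: (span: SpanLike) => Promise<T> | T,
): Promise<T> {
  const span = tracer.startSpan(name);
  try {
    return await fn(span);
  } catch (err) {
    // Record the failure on the span, then rethrow so callers still see it.
    const error = err instanceof Error ? err : new Error(String(err));
    span.recordException(error);
    span.setStatus({ code: STATUS_ERROR, message: error.message });
    throw err;
  } finally {
    // End the span on both the success and the failure path.
    span.end();
  }
}
```

Ending the span in `finally` keeps the success and failure paths symmetric, so a thrown error cannot leave a span open.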
Integrate OpenTelemetry log emission into the Log module. When telemetry is enabled, all log messages (debug/info/warn/error) are emitted to the OTLP endpoint alongside file-based logging.
- Lazy-load telemetry module to avoid circular dependency
- Guard against recursive calls during module initialization
- Emit logs with proper severity levels using Telemetry.SeverityMap
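The lazy-load and recursion guard described in this commit can be sketched as follows. `createForwarder`, `LogSink`, and the loader callback are hypothetical names, not the PR's API. The severity numbers are the base values for DEBUG, INFO, WARN, and ERROR in the OpenTelemetry log data model; whether `Telemetry.SeverityMap` uses exactly this shape is an assumption.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Base severity numbers from the OpenTelemetry log data model.
const SEVERITY: Record<LogLevel, { severityNumber: number; severityText: string }> = {
  debug: { severityNumber: 5, severityText: "DEBUG" },
  info: { severityNumber: 9, severityText: "INFO" },
  warn: { severityNumber: 13, severityText: "WARN" },
  error: { severityNumber: 17, severityText: "ERROR" },
};

// Hypothetical sink interface; stands in for an OTLP logger.
interface LogSink {
  emit(record: { severityNumber: number; severityText: string; body: string }): void;
}

function createForwarder(loadSink: () => LogSink | undefined) {
  let sink: LogSink | undefined;
  let busy = false;

  // Returns true when the record was handed to the sink.
  return function forward(level: LogLevel, message: string): boolean {
    // Re-entrant call (for example, the telemetry module logging while it
    // initializes): drop it instead of recursing.
    if (busy) return false;
    busy = true;
    try {
      // Lazy-load on first use so the Log module never imports telemetry eagerly.
      if (!sink) sink = loadSink();
      if (!sink) return false;
      sink.emit({ ...SEVERITY[level], body: message });
      return true;
    } finally {
      busy = false;
    }
  };
}
```

The `busy` flag makes a nested call a no-op, so a sink that logs while it is loading or emitting cannot trigger unbounded recursion.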
- Initialize telemetry in yargs middleware after Log.init()
- Check OTEL_EXPORTER_OTLP_ENDPOINT env var or config.experimental.openTelemetry
- Register SIGTERM and SIGINT handlers for graceful shutdown
- Call Telemetry.shutdown() in finally block before process.exit()
…re Dashboard GenAI support

Add 150+ spans across the entire distributed architecture:

GenAI Features:
- gen_ai.system, gen_ai.operation.name, gen_ai.request.model attributes
- gen_ai.usage.input_tokens/output_tokens for token tracking
- gen_ai.tool.definitions for tool visualization
- gen_ai.response.id and gen_ai.response.model
- GenAI events: gen_ai.system.message, gen_ai.user.message, gen_ai.assistant.message

Distributed Architecture:
- Worker lifecycle spans (init, startup, shutdown)
- Resource naming: opencode-cli, opencode-worker, opencode-lsp-server
- Span linking across thread boundaries with trace context propagation
- Messaging spans: messaging.send, messaging.receive, messaging.process
- IPC/Named pipe spans for inter-process communication

Database Layer:
- SQLite operation spans: db.session.insert, db.session.select, db.session.update
- Migration spans with statistics
- db.operation wrapper for all queries

HTTP & Network:
- http.request spans for API routes
- http.download spans for LSP server downloads (GitHub releases)
- archive.extract spans for zip/tar extraction
- External dependency visibility

Security:
- OAuth flow spans: oauth.flow.start, oauth.callback.receive, oauth.token.store
- MCP authentication spans
- Token validation and refresh tracking

Background Operations:
- scheduler.task.execute spans
- queue.work spans with depth tracking
- file.watcher.event spans
- server.heartbeat spans

LSP Integration:
- lsp.request.* spans for all JSON-RPC operations
- lsp.server.spawn spans with process details
- lsp.notification spans
- code.intelligence spans

Tool Execution:
- tool.websearch.execute with search parameters
- tool.webfetch.execute with retry tracking
- tool.read.file with binary detection
- tool.write.execute with diff statistics
- tool.edit.execute with change metrics
- tool.list.execute with file counts
- tool.apply_patch.execute with hunk statistics
- tool.bash.spawn, tool.grep.spawn, tool.git.execute

Event System:
- event.publish spans
- event.handle spans with subscriber counts
- bus.subscribe spans
- ipc.emit/receive spans

TUI Operations:
- tui.render spans
- tui.input spans
- command.execute spans
- action.execute spans

All spans follow OpenTelemetry semantic conventions and enable full Aspire Dashboard 13.2 GenAI visualization including:
- Resource list showing distributed components
- GenAI visualizer with message timeline
- Tools tab with definitions and parameters
- Token usage tracking
- Complete trace waterfall across threads
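For illustration, the GenAI attributes listed in this commit could be assembled for one LLM call roughly as follows. The attribute keys are the ones named above. The `LlmCall` type, the `genAiAttributes` function, and the choice to omit attributes whose values are unknown are assumptions of this sketch, not the PR's implementation.

```typescript
// Hypothetical description of one LLM call; field names are illustrative.
interface LlmCall {
  system: string;
  operation: string;
  requestModel: string;
  responseModel?: string;
  responseId?: string;
  inputTokens?: number;
  outputTokens?: number;
}

function genAiAttributes(call: LlmCall): Record<string, string | number> {
  // Request-side attributes are known before the call starts.
  const attrs: Record<string, string | number> = {
    "gen_ai.system": call.system,
    "gen_ai.operation.name": call.operation,
    "gen_ai.request.model": call.requestModel,
  };

  // Response-side attributes are only set once the provider has answered,
  // so unknown values are left off rather than reported as empty or zero.
  if (call.responseModel !== undefined) attrs["gen_ai.response.model"] = call.responseModel;
  if (call.responseId !== undefined) attrs["gen_ai.response.id"] = call.responseId;
  if (call.inputTokens !== undefined) attrs["gen_ai.usage.input_tokens"] = call.inputTokens;
  if (call.outputTokens !== undefined) attrs["gen_ai.usage.output_tokens"] = call.outputTokens;

  return attrs;
}
```

Leaving unknown token counts off keeps "not reported" distinguishable from a genuine zero when reading the trace later.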
# Conflicts:
#   .opencode/opencode.jsonc
#   bun.lock
#   packages/opencode/package.json
#   packages/opencode/src/agent/agent.ts
#   packages/opencode/src/bus/index.ts
#   packages/opencode/src/cli/cmd/tui/app.tsx
#   packages/opencode/src/cli/cmd/tui/thread.ts
#   packages/opencode/src/cli/cmd/tui/worker.ts
#   packages/opencode/src/file/watcher.ts
#   packages/opencode/src/flag/flag.ts
#   packages/opencode/src/index.ts
#   packages/opencode/src/lsp/client.ts
#   packages/opencode/src/lsp/index.ts
#   packages/opencode/src/lsp/server.ts
#   packages/opencode/src/mcp/auth.ts
#   packages/opencode/src/plugin/index.ts
#   packages/opencode/src/scheduler/index.ts
#   packages/opencode/src/server/server.ts
#   packages/opencode/src/session/compaction.ts
#   packages/opencode/src/session/index.ts
#   packages/opencode/src/session/llm.ts
#   packages/opencode/src/session/processor.ts
#   packages/opencode/src/session/prompt.ts
#   packages/opencode/src/session/summary.ts
#   packages/opencode/src/shell/shell.ts
#   packages/opencode/src/snapshot/index.ts
#   packages/opencode/src/storage/db.ts
#   packages/opencode/src/storage/json-migration.ts
#   packages/opencode/src/tool/bash.ts
#   packages/opencode/src/tool/edit.ts
#   packages/opencode/src/tool/grep.ts
#   packages/opencode/src/tool/read.ts
#   packages/opencode/src/tool/tool.ts
#   packages/opencode/src/tool/webfetch.ts
#   packages/opencode/src/tool/write.ts
#   packages/opencode/src/util/git.ts
…m/dev

Re-apply all OpenTelemetry instrumentation on top of upstream dev (1205 commits merged).

Key instrumentation:
- GenAI attributes on LLM spans (gen_ai.system, gen_ai.operation.name, etc.)
- Token usage tracking (gen_ai.usage.input_tokens/output_tokens)
- Database spans with db.system: sqlite for Aspire Database filter
- HTTP spans with METHOD /route naming
- Tool execution wrapper spans (tool.execute)
- Individual tool spans (read, write, edit, bash, grep, etc.)
- MCP, LSP, OAuth, plugin, snapshot spans
- Context propagation fix for proper span nesting
- Aspire Dashboard launch scripts and MCP config
…in.trigger chat.params, etc.)
…at span via onFinish, include tool parameters
…uild vs gen_ai.chat title)
…pire tool definitions
… noise, fix test type
… icon, GenAI icon)
…already emitted in onFinish)
Contributor
@Hona Saw your presentation at aspire conf yesterday! Was quite keen to try out the Aspire Dashboard for aspire. Would this PR be needed to get this to work? And would this work with the final built artefact as opposed to just with the dev environment for opencode? We would mainly be interested in it to help us track down issues with our installations of opencode
I saw the talk too and really want this 😅 are there plans for this to be merged?
Contributor
Closed
Adds experimental OpenTelemetry support for debugging and observability.
What
- bun run dev:otel
- opencode-cli vs opencode-server
- key=value context + exception stack traces

Enabling OpenTelemetry

~/.config/opencode/opencode.json:

    { "experimental": { "openTelemetry": true } }

    cd packages/opencode
    bun run dev:otel

The OTEL_EXPORTER_OTLP_ENDPOINT env var controls the endpoint (defaults to http://localhost:4317).