feat: Trust-system Arc 4 (first wave) — rocky trace + feature-gated OTLP metrics#173
Merged
hugocorreia90 merged 1 commit into main on Apr 20, 2026
…TLP metrics export

- `rocky trace <run_id|latest> [--model <name>]`: new CLI command that renders a completed run as a Gantt-style timeline from the state store's `RunRecord`. Per-model entries carry `start_offset_ms` relative to run start plus a greedy first-fit `lane` for concurrent models. `TraceOutput` ships with Pydantic + TypeScript bindings so Dagster and custom dashboards can draw the timeline without re-deriving the base timestamp. Sibling to `rocky replay`: replay is the reproducibility artefact (SQL hashes, row counts); trace is the observability artefact (offsets, lanes, durations).
- Feature-gated `otel` OTLP metrics export. `rocky-cli` and `rocky` grow an `otel` feature that cascades to the existing (but previously unwired) `rocky_observe::otel::OtelExporter`. New `OtelGuard` RAII wrapper auto-initialises when `OTEL_EXPORTER_OTLP_ENDPOINT` is set and flushes + shuts down on drop, so every `rocky run` exit path (happy, interrupted, error) flushes to the collector without an explicit cleanup call. Off by default: default builds drop the OTel dependency graph entirely.

OTel span coverage, freshness SLO enforcement, and the Dagster UI timeline hook are deferred to later Arc 4 waves.
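The greedy first-fit lane assignment mentioned above could look roughly like the sketch below. It is not the code from this PR: `ModelSpan`, `duration_ms`, and `assign_lanes` are hypothetical names chosen for illustration, and only `start_offset_ms` and the notion of a `lane` index come from the description. The sketch assumes a lane becomes reusable as soon as the previous span on it has ended.

```rust
/// Hypothetical per-model timing record. Only `start_offset_ms` is named in
/// the PR description; the other fields are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct ModelSpan {
    name: String,
    start_offset_ms: u64,
    duration_ms: u64,
}

impl ModelSpan {
    fn new(name: &str, start_offset_ms: u64, duration_ms: u64) -> Self {
        Self { name: name.to_string(), start_offset_ms, duration_ms }
    }
}

/// Greedy first-fit: visit spans in start order and give each one the
/// lowest-numbered lane whose previous span has already ended.
/// The returned lane indices are aligned with the order of `spans`.
fn assign_lanes(spans: &[ModelSpan]) -> Vec<usize> {
    // Sort indices by start offset, keeping input order for ties.
    let mut order: Vec<usize> = (0..spans.len()).collect();
    order.sort_by_key(|&i| (spans[i].start_offset_ms, i));

    // For each lane, the end offset of the last span placed on it.
    let mut lane_free_at: Vec<u64> = Vec::new();
    let mut lanes = vec![0usize; spans.len()];

    for i in order {
        let span = &spans[i];
        let end = span.start_offset_ms + span.duration_ms;
        match lane_free_at.iter().position(|&free| free <= span.start_offset_ms) {
            // Reuse the first lane that is free at this span's start.
            Some(lane) => {
                lane_free_at[lane] = end;
                lanes[i] = lane;
            }
            // Every lane is busy: open a new one.
            None => {
                lane_free_at.push(end);
                lanes[i] = lane_free_at.len() - 1;
            }
        }
    }
    lanes
}
```

With this approach, sequential models all land on lane 0 and a new lane is opened only while an earlier model is still running, which keeps the rendered timeline as short as the run's peak concurrency.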
3 tasks
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
Closes the first wave of every trust-system arc (Arcs 1-7) plus the two wave-2 follow-ups landed the same day. Nine feature PRs since v1.10.0.

- Arc 1 (#170): rocky lineage --downstream, rocky branch, rocky run --branch, rocky replay
- Arc 2 (#171): per-run cost attribution, [budget] block, budget_breach hook
- Arc 3 (#172): three-state CircuitBreaker, adapter consolidation
- Arc 4 (#173): rocky trace Gantt + feature-gated OTLP metrics export
- Arc 5 (#174): schema-grounded rocky ai prompt + project-aware validator
- Arc 6 wave 1 (#184): --target-dialect P001 portability lint (12 constructs)
- Arc 7 wave 1 (#185): blast-radius P002 SELECT * lint (semantic-graph aware)
- Arc 6 wave 2 (#186): [portability] config block + per-model rocky-allow pragma
- Arc 7 wave 2 wave-1 (#187): --with-seed source-schema inference

Plus #169 fix: install scripts pick latest engine version by semver.

Version bump: 20 Cargo.toml files (all workspace members except rocky-bigquery, which tracks its own version).

Wave 2/3 work for every arc remains in the deferred backlog; see the changelog Deferred section for the full carry-forward.
Summary
First wave of Arc 4 from the trust-system direction. Lead primitive is user-facing (`rocky trace`); OTel metrics export is the SRE-side complement.

- `rocky trace <run_id|latest> [--model <name>]` renders a completed run as a Gantt-style timeline from the state store's `RunRecord`. Per-model entries carry `start_offset_ms` relative to run start plus a greedy first-fit `lane` index so concurrent materializations display on separate rows. JSON output is a new `TraceOutput` schema; human output is an ASCII timeline with `[####....]` duration bars. Sibling to `rocky replay` (same inputs, different lens): replay is the reproducibility artefact (SQL hashes, row counts, config hash); trace is the observability artefact (offsets, lanes, durations).
- Feature-gated `otel` OTLP metrics export. `rocky-cli` and `rocky` grow an `otel` feature that cascades to the existing (but previously unwired) `rocky_observe::otel::OtelExporter`. New `OtelGuard` RAII wrapper auto-initialises when `OTEL_EXPORTER_OTLP_ENDPOINT` is set and flushes + shuts down on drop, which covers every exit path (happy, interrupted, error) without explicit cleanup. Off by default; default builds don't pull the OTLP dependency graph.

Deferred to later Arc 4 waves

- OTel span coverage (… `HookEvent::Before/AfterMaterialize` sites with a tracer provider so every model execution becomes a trace span).
- Freshness SLO enforcement (… `[freshness.sla]` block in `rocky.toml` → `sla_breach` PipelineEvent + `HookEvent::SlaBreach`).
- Dagster UI timeline hook (… `TraceOutput` on the dagster-rocky side to render per-asset Gantt).

Test plan

- `cargo test -p rocky-cli commands::trace`: 7 new tests covering lane assignment (sequential + overlapping), "latest" resolution, empty store, missing run, row rendering, truncation.
- `cargo test --workspace`: 976 rocky-core + 168 rocky-cli + adapter suites green. Default build (no `otel` feature).
- `cargo build -p rocky --features otel`: verifies the cross-crate feature cascade builds clean. The `otel` path compiles but is not exercised in CI; operators opt in by setting `OTEL_EXPORTER_OTLP_ENDPOINT` at runtime.
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
- `cargo fmt --check` clean.
- `just codegen`: new `trace.schema.json` + Pydantic + TypeScript regenerated.
- `just regen-fixtures`: RunOutput unchanged, fixture corpus stable.
- `uv run pytest` in `integrations/dagster/`: 307 tests pass.

🤖 Generated with Claude Code
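For readers unfamiliar with the RAII pattern the Summary attributes to `OtelGuard`, here is a minimal, dependency-free sketch of the idea. It does not use the real `OtelGuard` or `rocky_observe::otel::OtelExporter` API: the `MetricsExporter` trait, `ExportGuard`, and `Recording` are hypothetical stand-ins, and the only detail taken from the PR is that initialisation is keyed on `OTEL_EXPORTER_OTLP_ENDPOINT` and that drop flushes and then shuts down.

```rust
use std::cell::RefCell;
use std::env;
use std::rc::Rc;

/// Hypothetical stand-in for an OTLP metrics exporter.
trait MetricsExporter {
    fn flush(&mut self);
    fn shutdown(&mut self);
}

/// RAII guard: owns an exporter only when export is enabled.
struct ExportGuard<E: MetricsExporter> {
    exporter: Option<E>,
}

impl<E: MetricsExporter> ExportGuard<E> {
    /// Build the exporter only when an endpoint is configured.
    fn init(endpoint: Option<String>, build: impl FnOnce(&str) -> E) -> Self {
        Self { exporter: endpoint.map(|e| build(&e)) }
    }

    /// Read the endpoint from the environment variable named in the PR.
    #[allow(dead_code)]
    fn from_env(build: impl FnOnce(&str) -> E) -> Self {
        Self::init(env::var("OTEL_EXPORTER_OTLP_ENDPOINT").ok(), build)
    }

    fn is_active(&self) -> bool {
        self.exporter.is_some()
    }
}

impl<E: MetricsExporter> Drop for ExportGuard<E> {
    /// Flush, then shut down, whenever the guard goes out of scope.
    fn drop(&mut self) {
        if let Some(mut exporter) = self.exporter.take() {
            exporter.flush();
            exporter.shutdown();
        }
    }
}

/// Test double that records the calls it receives.
struct Recording {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl MetricsExporter for Recording {
    fn flush(&mut self) {
        self.log.borrow_mut().push("flush");
    }
    fn shutdown(&mut self) {
        self.log.borrow_mut().push("shutdown");
    }
}
```

Binding such a guard near the top of the run command means early returns via `?` and unwinding panics both pass through `Drop`. As a general Rust caveat, destructors do not run when the process calls `std::process::exit` or aborts, so a design like this depends on exit paths returning through the scope that owns the guard.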