
feat(engine): FR-006 RunRecord audit trail + rocky history --audit #240

Merged
hugocorreia90 merged 3 commits into main from
feat/governance-audit-trail
Apr 23, 2026

@hugocorreia90
Contributor

Summary

Adds an 8-field governance audit trail to `RunRecord`, captured at `rocky run` claim time and surfaced via `rocky history --audit`.

Audit fields:

  • triggering_identity (best-effort $USER / $USERNAME)
  • session_source (Cli / Dagster / Lsp / HttpApi — auto-detected via DAGSTER_PIPES_CONTEXT + ROCKY_SESSION_SOURCE)
  • git_commit / git_branch (shells out to system git; git_branch is None on a detached HEAD, and both are None outside a git checkout)
  • idempotency_key (threads existing --idempotency-key from FR-004)
  • target_catalog (resolved adapter catalog)
  • hostname (via hostname crate)
  • rocky_version (env!("CARGO_PKG_VERSION"))
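
The detection order for `session_source` can be sketched as a small pure function. This is a minimal illustration, not the code from this PR. The function names, the case-insensitive match, the accepted spellings of the override value, and the fallback to `Cli` for an unrecognised override are all assumptions.

```rust
/// Where a run was started from, mirroring the four variants listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionSource {
    Cli,
    Dagster,
    Lsp,
    HttpApi,
}

/// Pure precedence logic; the caller supplies the two environment values.
/// Assumption: an unrecognised override falls back to `Cli`.
pub fn detect_session_source(
    dagster_pipes_context: Option<&str>,
    session_source_override: Option<&str>,
) -> SessionSource {
    // 1. The presence of DAGSTER_PIPES_CONTEXT wins over everything else.
    if dagster_pipes_context.is_some() {
        return SessionSource::Dagster;
    }
    // 2. Otherwise honour an explicit ROCKY_SESSION_SOURCE override.
    match session_source_override
        .map(|s| s.trim().to_ascii_lowercase())
        .as_deref()
    {
        Some("dagster") => SessionSource::Dagster,
        Some("lsp") => SessionSource::Lsp,
        // Assumption: both spellings are accepted for the HTTP API variant.
        Some("httpapi") | Some("http_api") => SessionSource::HttpApi,
        // 3. "cli", an unknown value, or no override at all.
        _ => SessionSource::Cli,
    }
}

/// Thin wrapper that reads the real process environment.
pub fn detect_from_env() -> SessionSource {
    let pipes = std::env::var("DAGSTER_PIPES_CONTEXT").ok();
    let overridden = std::env::var("ROCKY_SESSION_SOURCE").ok();
    detect_session_source(pipes.as_deref(), overridden.as_deref())
}
```

Keeping the precedence logic separate from the environment reads lets the order be tested without mutating process state.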

Schema migration: redb v5 → v6 with forward-deserialize. v5 rows deserialize as v6 via `#[serde(default)]` on each new field. Defaults for required fields: `hostname="unknown"`, `rocky_version="<pre-audit>"`, `session_source=Cli`. No in-place blob rewrite.
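
To make the defaulting behaviour concrete, here is a std-only sketch that models what a v5 row looks like once read as v6. It does not use serde. A plain key-value map stands in for the decoded row, and `AuditFields` and `audit_from_row` are hypothetical names. `session_source` is kept as a lowercase string here for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical flat view of the eight audit fields.
#[derive(Debug, PartialEq)]
pub struct AuditFields {
    pub triggering_identity: Option<String>,
    pub session_source: String,
    pub git_commit: Option<String>,
    pub git_branch: Option<String>,
    pub idempotency_key: Option<String>,
    pub target_catalog: Option<String>,
    pub hostname: String,
    pub rocky_version: String,
}

/// Build the audit fields from a decoded row, defaulting whatever is absent.
/// Optional fields become `None`; required fields take the defaults named above.
pub fn audit_from_row(row: &HashMap<String, String>) -> AuditFields {
    let opt = |key: &str| row.get(key).cloned();
    let req = |key: &str, default: &str| opt(key).unwrap_or_else(|| default.to_string());
    AuditFields {
        triggering_identity: opt("triggering_identity"),
        session_source: req("session_source", "cli"),
        git_commit: opt("git_commit"),
        git_branch: opt("git_branch"),
        idempotency_key: opt("idempotency_key"),
        target_catalog: opt("target_catalog"),
        hostname: req("hostname", "unknown"),
        rocky_version: req("rocky_version", "<pre-audit>"),
    }
}
```

A row written before the migration carries none of these keys, so it comes back with `hostname == "unknown"` and `rocky_version == "<pre-audit>"`, and the stored bytes are never touched.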

`rocky history --audit`: new flag expands the 8 fields in the default text table + JSON output. Default `rocky history` output is unchanged (backward compatible; tests cover this).
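
The opt-in behaviour of the flag can be illustrated with a toy row renderer. The column layout, the `RunRow` struct, and `render_row` are hypothetical and show only a subset of the audit fields; the actual table format is not given in this description.

```rust
/// Hypothetical subset of a history row, enough to show the flag's effect.
pub struct RunRow {
    pub run_id: String,
    pub status: String,
    pub hostname: String,
    pub rocky_version: String,
    pub git_commit: Option<String>,
}

/// Render one text-table row. Without `audit` the output is the same
/// as before the audit fields existed; with it, extra columns are appended.
pub fn render_row(row: &RunRow, audit: bool) -> String {
    let mut line = format!("{}  {}", row.run_id, row.status);
    if audit {
        // Missing optional values are shown as "-" (an assumption).
        let commit = row.git_commit.as_deref().unwrap_or("-");
        line.push_str(&format!(
            "  {}  {}  {}",
            row.hostname, row.rocky_version, commit
        ));
    }
    line
}
```

Appending columns only behind the flag is what keeps existing consumers of the default text output unaffected.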

Codegen cascade regenerated: schemas/history.schema.json, integrations/dagster/src/dagster_rocky/types_generated/history_schema.py, editors/vscode/src/types/generated/history.ts.

Test plan

  • cargo test --workspace (green locally)
  • cargo clippy --workspace --all-targets -- -D warnings (green locally)
  • Run rocky run against examples/playground/pocs/00-foundations/00-playground-default, confirm rocky history --audit populates fields
  • Verify rocky history without --audit unchanged
  • Dagster fixture normaliser sentinels hide non-determinism (git commit, hostname, rocky version)

…ize migration

Wave A Agent 2 — governance audit trail per §1.4. RunRecord gains eight
audit fields captured at rocky run claim time:

  triggering_identity: Option<String>
  session_source:      SessionSource  (Cli | Dagster | Lsp | HttpApi)
  git_commit:          Option<String>
  git_branch:          Option<String>
  idempotency_key:     Option<String>
  target_catalog:      Option<String>
  hostname:            String
  rocky_version:       String

Schema version bumps v5 → v6. Existing v5 rows forward-deserialize as
v6 via #[serde(default)] on each new field — no in-place blob rewrite.
Defaults for required fields:
  hostname       = "unknown"
  rocky_version  = "<pre-audit>"
  session_source = SessionSource::Cli

SessionSource variants are serialized as lowercase strings matching the
discriminators a Dagster / LSP caller would stamp via
ROCKY_SESSION_SOURCE. Docs on the type enumerate the env-var contract
the CLI uses for auto-detection (DAGSTER_PIPES_CONTEXT wins, then
ROCKY_SESSION_SOURCE override, else Cli fallback).

Tests: e2e.rs extends an existing RunRecord round-trip with the eight
new fields and verifies a v5-shaped blob round-trips cleanly via the
default-backed deserializer.
…y history --audit

Populates the eight RunRecord audit fields at rocky run claim time and
surfaces them through rocky history. All run-emitting commands (run,
cost, replay, trace) thread a shared AuditContext detected once per
invocation.

rocky-cli::commands::run_audit (new):
  AuditContext with best-effort detectors — none is permitted to fail
  the pipeline. Git info shells out to the system git binary (no gix
  workspace dep; two fixed commands — rev-parse HEAD, symbolic-ref
  --short HEAD). SessionSource auto-detects via DAGSTER_PIPES_CONTEXT,
  then ROCKY_SESSION_SOURCE, then falls back to Cli. Identity reads
  $USER / $USERNAME. Hostname via the hostname crate; rocky_version
  via env!("CARGO_PKG_VERSION").
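
A minimal sketch of the best-effort detectors described above, using only the standard library. The function names are hypothetical and the real `run_audit` module may be structured differently. The point is that every failure path collapses to `None` instead of an error.

```rust
use std::process::Command;

/// Turn raw stdout into `Some(trimmed)`, or `None` when it is empty.
pub fn stdout_to_option(stdout: &[u8]) -> Option<String> {
    let text = String::from_utf8_lossy(stdout).trim().to_string();
    if text.is_empty() {
        None
    } else {
        Some(text)
    }
}

/// Run `git <args>`. A missing binary or a non-zero exit yields `None`,
/// so the caller can never fail the pipeline through this path.
pub fn git_output(args: &[&str]) -> Option<String> {
    let output = Command::new("git").args(args).output().ok()?;
    if !output.status.success() {
        return None;
    }
    stdout_to_option(&output.stdout)
}

/// Current commit SHA, or `None` outside a git checkout.
pub fn git_commit() -> Option<String> {
    git_output(&["rev-parse", "HEAD"])
}

/// Current branch name. `symbolic-ref` exits non-zero on a detached
/// HEAD, which maps to `None` here.
pub fn git_branch() -> Option<String> {
    git_output(&["symbolic-ref", "--short", "HEAD"])
}

/// Identity from $USER, falling back to $USERNAME.
pub fn triggering_identity() -> Option<String> {
    std::env::var("USER")
        .or_else(|_| std::env::var("USERNAME"))
        .ok()
}
```

Returning `Option<String>` from every detector matches the `Option<String>` shape of the corresponding `RunRecord` fields.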

rocky-cli::commands::{run,cost,replay,trace}: each constructs an
AuditContext at claim time (threading through the existing
--idempotency-key plumbing from FR-004) and stamps it on the emitted
RunRecord.

rocky-cli::commands::history: new --audit flag expands the eight
fields in the default text table and detail view. JSON output always
includes them (consumers opt in by reading the field). Default text
output is unchanged when --audit is not passed — backward compat.

rocky-cli::output: HistoryOutput / ModelHistoryOutput / RunOutput
gain an audit: Option<AuditFieldsOutput> sub-struct. Codegen-driven
schema.

Codegen cascade:
  schemas/history.schema.json           — JSON schema
  types_generated/history_schema.py     — Pydantic v2 models
  editors/vscode/.../history.ts         — TypeScript interfaces

Fixture normaliser (scripts/_normalize_fixture.py): new sentinels for
non-deterministic audit fields — hostname → "sentinel-host",
git_commit → 40-zero SHA, git_branch → "sentinel-branch",
triggering_identity → "sentinel-user", rocky_version → "0.0.0-test".
target_catalog and session_source pass through.
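
The normaliser itself is a Python script whose source is not shown here. The Rust sketch below only illustrates the field-to-sentinel mapping listed above, over a flat string map. `sentinel_for` and `normalize` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Return the sentinel for a non-deterministic audit field, or `None`
/// when the field should pass through unchanged.
pub fn sentinel_for(field: &str) -> Option<String> {
    match field {
        "hostname" => Some("sentinel-host".to_string()),
        // A 40-character all-zero SHA.
        "git_commit" => Some("0".repeat(40)),
        "git_branch" => Some("sentinel-branch".to_string()),
        "triggering_identity" => Some("sentinel-user".to_string()),
        "rocky_version" => Some("0.0.0-test".to_string()),
        // target_catalog, session_source and everything else pass through.
        _ => None,
    }
}

/// Replace every sentineled field in place, leaving other keys untouched.
pub fn normalize(fields: &mut BTreeMap<String, String>) {
    for (key, value) in fields.iter_mut() {
        if let Some(sentinel) = sentinel_for(key) {
            *value = sentinel;
        }
    }
}
```

Replacing values instead of dropping keys keeps the fixture's shape identical to real output, so schema checks still exercise every field.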

Dependencies: rocky-cli gains hostname = "0.4" (workspace dep) for
portable hostname detection.
- cargo fmt with rustfmt 1.95.0 to absorb the drift flagged by the CI
  rustfmt --check step (history.rs, run_audit.rs, output.rs, state.rs).
- `#[allow(clippy::too_many_arguments)]` on `RunOutput::to_run_record`
  which now takes 8 arguments after the audit bundle landed. Matches
  the existing allow on `execute_models` for the same reason: the
  fields are a natural bundle and shrinking them would hurt
  readability.
hugocorreia90 merged commit 3d57db2 into main Apr 23, 2026
16 checks passed
hugocorreia90 deleted the feat/governance-audit-trail branch April 23, 2026 22:04
hugocorreia90 added a commit that referenced this pull request Apr 23, 2026
Fixture drift flagged by CI (`codegen-drift.yml`). Fixtures are captured
from the live engine binary — the version-string bump to 1.16.0 ripples
through every `version` field, and the Wave A audit-trail work (#240)
adds the 8 `RunRecord` fields to `rocky history` output, which the
playground POC now emits.

Regenerated via `just regen-fixtures` against
`examples/playground/pocs/00-foundations/00-playground-default`.
hugocorreia90 added a commit that referenced this pull request Apr 23, 2026
* chore: release engine-v1.16.0 + dagster-v1.12.0 + vscode-v1.8.0

Bundles the governance waveplan — five merged PRs (#240 audit trail,
#241 classification + masking, #242 rocky compliance, #243 role-graph,
#244 retention) on top of three FR-004 / state-path follow-ups
(#237 error-path idempotency, #238 state-path unification,
#239 success-path idempotency finalize).

Version bumps: engine 1.15.0 → 1.16.0, dagster-rocky 1.11.0 → 1.12.0,
vscode extension 1.7.0 → 1.8.0.

CHANGELOGs updated for all three artifacts.

* chore(dagster): regen test fixtures for 1.16.0

Fixture drift flagged by CI (`codegen-drift.yml`). Fixtures are captured
from the live engine binary — the version-string bump to 1.16.0 ripples
through every `version` field, and the Wave A audit-trail work (#240)
adds the 8 `RunRecord` fields to `rocky history` output, which the
playground POC now emits.

Regenerated via `just regen-fixtures` against
`examples/playground/pocs/00-foundations/00-playground-default`.

* chore(scripts): sentinel top-level version field in fixture normaliser

Every CLI output's top-level `version` is `env!("CARGO_PKG_VERSION")`
at emit time, so every engine version bump rippled through all 38
captured fixtures — every release PR fought `codegen-drift.yml` until
`just regen-fixtures` was re-run.

Extend the existing `AUDIT_FIELD_SENTINELS` set (Wave A already
sentineled the audit-trail `rocky_version` field + hostname / git
commit / etc.) with the top-level `version` key → `"0.0.0-SENTINEL"`.
After this, version bumps only touch Cargo.toml / pyproject.toml /
package.json / CHANGELOGs — never fixtures.

Regen captured all 38 fixtures; top-level `version` now uniformly
renders as `"0.0.0-SENTINEL"`.