
feat: Trust-system Arc 2 (first wave) — per-run cost attribution + [budget]#171

Merged: hugocorreia90 merged 1 commit into main from feat/arc-2-cost-attribution on Apr 20, 2026



@hugocorreia90
Contributor

Summary

First wave of Arc 2 from the trust-system direction — cost attribution + budgets + breach events. Mirrors Arc 1's shape: one bundled PR, run-level only (per-model + per-column deferred).

  • Per-adapter cost helper in rocky-core::cost: compute_observed_cost_usd(warehouse, bytes_scanned, duration_ms, dbu_per_hour, cost_per_dbu) picks the right billing formula: Databricks/Snowflake use duration × DBU × $/DBU, BigQuery uses bytes × $6.25/TB, and DuckDB is zero. WarehouseType::from_adapter_type maps adapter type strings; unbilled sources (fivetran, airbyte) are skipped cleanly.
  • RunOutput.cost_summary: per-model + run totals (total_cost_usd, total_duration_ms, per_model, adapter_type). MaterializationOutput.cost_usd is populated inline so per-asset consumers don't need to re-derive it. The summary is populated on every exit path (happy, interrupted, model-only), but the budget is enforced only on the happy and model-only paths.
  • [budget] block in rocky.toml: max_usd, max_duration_ms, on_breach = "warn"|"error". Breaches surface on RunOutput.budget_breaches, emit a budget_breach PipelineEvent on the event bus, and fire HookEvent::BudgetBreach so [hook.on_budget_breach] shell hooks / webhooks see it live. on_breach = "error" makes rocky run exit non-zero after printing the final JSON.
  • Pre-execution rocky estimate cost unchanged — this PR covers observed cost only; the existing EstimateOutput.estimated_cost_usd path is untouched.
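
For readers who want the billing formulas in concrete form, here is a minimal sketch of how a helper like the one described above could look. It is not the code from this PR: the function and type names follow the description, but the bodies, the lowercase adapter strings, the `Option` return type, and the decimal-terabyte divisor are all assumptions made for illustration.

```rust
// Hypothetical sketch: names mirror the PR description, bodies are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WarehouseType {
    Databricks,
    Snowflake,
    BigQuery,
    DuckDb,
}

impl WarehouseType {
    /// Maps an adapter type string to a billed warehouse.
    /// The exact strings are assumptions; anything unrecognised
    /// (e.g. "fivetran", "airbyte") is treated as unbilled and skipped.
    pub fn from_adapter_type(adapter: &str) -> Option<Self> {
        match adapter {
            "databricks" => Some(Self::Databricks),
            "snowflake" => Some(Self::Snowflake),
            "bigquery" => Some(Self::BigQuery),
            "duckdb" => Some(Self::DuckDb),
            _ => None,
        }
    }
}

const BIGQUERY_USD_PER_TB: f64 = 6.25;
// Assumption: decimal terabytes. The real helper may use a different divisor.
const BYTES_PER_TB: f64 = 1e12;
const MS_PER_HOUR: f64 = 3_600_000.0;

/// Returns the observed cost in USD, or None when the inputs needed by the
/// warehouse's billing formula are missing (assumed behaviour).
pub fn compute_observed_cost_usd(
    warehouse: WarehouseType,
    bytes_scanned: Option<u64>,
    duration_ms: u64,
    dbu_per_hour: f64,
    cost_per_dbu: f64,
) -> Option<f64> {
    match warehouse {
        // duration × DBU × $/DBU
        WarehouseType::Databricks | WarehouseType::Snowflake => {
            let hours = duration_ms as f64 / MS_PER_HOUR;
            Some(hours * dbu_per_hour * cost_per_dbu)
        }
        // bytes × $6.25/TB; no bytes reported means no cost figure
        WarehouseType::BigQuery => {
            bytes_scanned.map(|b| b as f64 / BYTES_PER_TB * BIGQUERY_USD_PER_TB)
        }
        // Local execution is free
        WarehouseType::DuckDb => Some(0.0),
    }
}
```

Under these assumptions, one hour on a 2 DBU/hour warehouse at $0.50 per DBU comes to $1.00, and a BigQuery model that reports no `bytes_scanned` yields no cost figure at all rather than a misleading zero.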

Deferred to later Arc 2 waves: per-model budgets (need model-sidecar UX decision), adapter-reported bytes_scanned plumbing (Databricks manifest / Snowflake statistics / BigQuery totalBytesProcessed), PR cost-projection GitHub Action, rocky cost historical command.

Test plan

  • cargo test -p rocky-core — 967 tests pass (9 new cost:: + 7 new config::tests::budget* + 17 hook presets unaffected)
  • cargo test -p rocky-cli — 161 tests pass (6 new output::cost_finalize_tests covering populate + check + error-mode)
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • cargo fmt --check clean
  • just codegen — schemas / Pydantic / TypeScript regenerated and committed
  • just regen-fixtures — all 37 fixtures pick up the new cost_summary field
  • uv run pytest in integrations/dagster/ — 307 tests pass
  • npx tsc --noEmit in editors/vscode/ clean

🤖 Generated with Claude Code

…udget] + budget_breach hook

- `RunOutput.cost_summary` + per-materialization `cost_usd`, computed
  post-run via per-adapter formulas (Databricks/Snowflake duration × DBU,
  BigQuery bytes × $/TB, DuckDB zero). Unbilled adapters (fivetran,
  airbyte) are skipped cleanly.
- Declarative `[budget]` block (`max_usd`, `max_duration_ms`,
  `on_breach = "warn"|"error"`) enforced at end of run. Breaches surface
  on `RunOutput.budget_breaches`, emit `budget_breach` on the event bus,
  and fire the new `HookEvent::BudgetBreach` so `[hook.on_budget_breach]`
  shell hooks / webhooks see the event live.
- `rocky run` exits non-zero when `on_breach = "error"` and a limit is
  crossed, after printing the final JSON output.

Per-model budgets, adapter-reported `bytes_scanned` plumbing, and the
PR cost-projection GitHub Action are deferred to later Arc 2 waves.
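
The end-of-run budget check described in this commit message can be sketched roughly as follows. This is an illustration only: `BudgetConfig`, `BudgetBreach`, `check_budget`, and `exit_code` are hypothetical names, the strict greater-than comparison is an assumption about what "crossed" means, and the exit status `1` stands in for the unspecified non-zero code.

```rust
// Hypothetical sketch of [budget] enforcement; not the PR's actual types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OnBreach {
    Warn,
    Error,
}

#[derive(Debug, Clone)]
pub struct BudgetConfig {
    pub max_usd: Option<f64>,
    pub max_duration_ms: Option<u64>,
    pub on_breach: OnBreach,
}

#[derive(Debug, Clone, PartialEq)]
pub struct BudgetBreach {
    /// Which limit was crossed: "max_usd" or "max_duration_ms".
    pub limit: &'static str,
    pub limit_value: f64,
    pub actual: f64,
}

/// Compares run totals against the configured limits.
/// A limit that is not configured, or a cost that is unknown, never breaches.
pub fn check_budget(
    budget: &BudgetConfig,
    total_cost_usd: Option<f64>,
    total_duration_ms: u64,
) -> Vec<BudgetBreach> {
    let mut breaches = Vec::new();
    if let (Some(max), Some(cost)) = (budget.max_usd, total_cost_usd) {
        if cost > max {
            breaches.push(BudgetBreach { limit: "max_usd", limit_value: max, actual: cost });
        }
    }
    if let Some(max) = budget.max_duration_ms {
        if total_duration_ms > max {
            breaches.push(BudgetBreach {
                limit: "max_duration_ms",
                limit_value: max as f64,
                actual: total_duration_ms as f64,
            });
        }
    }
    breaches
}

/// Process exit status after the final JSON has been printed.
/// Only `on_breach = "error"` with at least one breach fails the run.
pub fn exit_code(budget: &BudgetConfig, breaches: &[BudgetBreach]) -> i32 {
    if budget.on_breach == OnBreach::Error && !breaches.is_empty() {
        1
    } else {
        0
    }
}
```

Keeping the breach list separate from the exit decision reflects the ordering stated above: breaches are recorded and reported first, and the non-zero exit happens only after the output has been emitted.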
@hugocorreia90 hugocorreia90 merged commit a691038 into main Apr 20, 2026
15 checks passed
@hugocorreia90 hugocorreia90 deleted the feat/arc-2-cost-attribution branch April 20, 2026 09:49
@hugocorreia90 hugocorreia90 mentioned this pull request Apr 20, 2026
3 tasks
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
Closes the first wave of every trust-system arc (Arcs 1-7) plus the two
wave-2 follow-ups landed the same day. Nine feature PRs since v1.10.0.

- Arc 1 (#170): rocky lineage --downstream, rocky branch, rocky run --branch, rocky replay
- Arc 2 (#171): per-run cost attribution, [budget] block, budget_breach hook
- Arc 3 (#172): three-state CircuitBreaker, adapter consolidation
- Arc 4 (#173): rocky trace Gantt + feature-gated OTLP metrics export
- Arc 5 (#174): schema-grounded rocky ai prompt + project-aware validator
- Arc 6 wave 1 (#184): --target-dialect P001 portability lint (12 constructs)
- Arc 7 wave 1 (#185): blast-radius P002 SELECT * lint (semantic-graph aware)
- Arc 6 wave 2 (#186): [portability] config block + per-model rocky-allow pragma
- Arc 7 wave 2 wave-1 (#187): --with-seed source-schema inference

Plus #169 fix: install scripts pick latest engine version by semver.

Version bump: 20 Cargo.toml files (all workspace members except
rocky-bigquery, which tracks its own version).

Wave 2/3 work for every arc remains in the deferred backlog — see
the changelog Deferred section for the full carry-forward.
hugocorreia90 added a commit that referenced this pull request Apr 21, 2026
…202)

Adds a new `rocky cost <target>` CLI verb that reads `RunRecord` from
the embedded state store and rolls per-model cost attribution up from
persisted `bytes_scanned`, `bytes_written`, and `duration_ms` values.
Re-uses `rocky_core::cost::compute_observed_cost_usd` — the same
formula `RunOutput::populate_cost_summary` applies at the end of a
live run — so the historical surface stays consistent with the
first-wave per-run summary (PR #171).

The command loads `rocky.toml` to resolve the billed-warehouse type;
when the config can't be read the output degrades gracefully to
`adapter_type: null` / `cost_usd: null`, and durations/bytes are still
emitted from the stored record. BigQuery is a quiet upgrade: because
`ModelExecution.bytes_scanned` is persisted, the historical command
returns a real cost figure for BQ runs even though the live run path
still reports `null` for BQ (adapter bytes-scanned plumbing is a
later wave).
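
The rollup and its graceful degradation can be pictured with a small sketch. Everything here is assumed for illustration: the struct shapes are simplified stand-ins for the stored record and the output types, `roll_up` is a hypothetical name, and the `pricer` callback stands in for a call to `compute_observed_cost_usd` with the resolved warehouse. Passing `None` models the case where `rocky.toml` cannot be read.

```rust
// Hypothetical sketch of the historical rollup; field sets are simplified.
#[derive(Debug, Clone)]
pub struct ModelExecution {
    pub model_name: String,
    pub bytes_scanned: Option<u64>,
    pub duration_ms: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PerModelCost {
    pub model_name: String,
    pub duration_ms: u64,
    pub cost_usd: Option<f64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CostRollup {
    pub per_model: Vec<PerModelCost>,
    pub total_cost_usd: Option<f64>,
    pub total_duration_ms: u64,
}

/// Rolls stored executions up into per-model and total figures.
/// With no pricer, every cost is None but durations are still reported.
pub fn roll_up(
    executions: &[ModelExecution],
    pricer: Option<&dyn Fn(&ModelExecution) -> Option<f64>>,
) -> CostRollup {
    let per_model: Vec<PerModelCost> = executions
        .iter()
        .map(|e| PerModelCost {
            model_name: e.model_name.clone(),
            duration_ms: e.duration_ms,
            cost_usd: pricer.and_then(|p| p(e)),
        })
        .collect();

    let total_duration_ms: u64 = per_model.iter().map(|m| m.duration_ms).sum();
    // Assumption: the total sums only the models whose cost is known.
    let total_cost_usd = pricer
        .map(|_| per_model.iter().filter_map(|m| m.cost_usd).sum::<f64>());

    CostRollup { per_model, total_cost_usd, total_duration_ms }
}
```

Whether a partially priced run should report a partial total or no total at all is a design choice this sketch does not settle; it simply sums what is known.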

- New `CostOutput` + `PerModelCostHistorical` in `output.rs` (deriving
  `JsonSchema`), registered in `export_schemas.rs::schemas()`.
- New `commands/cost.rs` with `run_cost` entry point and 12 unit tests
  (rollup math, empty-adapter degrade, BQ bytes path, model filter,
  by-id and latest resolution, adapter-default precedence).
- Clap subcommand `Cost { target, model }` in `rocky/src/main.rs`.
- Regenerated `schemas/cost.schema.json`, dagster Pydantic
  (`types_generated/cost_schema.py`), and vscode TypeScript
  (`types/generated/cost.ts`) via `just codegen`.

Scope notes:
- Does NOT extend `ReplayOutput`, `RunOutput`, or any existing Output
  struct — the new surface is isolated to `CostOutput`.
- Does NOT introduce per-model `[budget]` blocks or the PR
  cost-projection Action — those are later waves.
- The write side (`StateStore::record_run` being called from the live
  `rocky run` path) is a pre-existing gap shared with `rocky replay`,
  `rocky trace`, and `rocky history` — out of scope for this PR.
- Dagster resource-method wiring is deferred; the Pydantic type is
  reachable today via `types_generated.cost_schema`.