Skip to content

feat: rocky preview diff + cost — Phase 2/3 substance#280

Merged
hugocorreia90 merged 3 commits intomainfrom
feat/rocky-preview-substance
Apr 29, 2026
Merged

feat: rocky preview diff + cost — Phase 2/3 substance#280
hugocorreia90 merged 3 commits intomainfrom
feat/rocky-preview-substance

Conversation

@hugocorreia90
Copy link
Copy Markdown
Contributor

@hugocorreia90 hugocorreia90 commented Apr 29, 2026

Summary

Lifts rocky preview diff and rocky preview cost from wire-contract stubs (PR #279) to working substance. Pure RunRecord-pair builders that the future Phase 4 GitHub Action composes into a single PR comment.

What's in this PR

Phase 2 substance — rocky preview diff:

  • build_preview_diff(branch, base) — pure pairing of two RunRecords by model name. Computes rows_added / rows_removed via signed delta of rows_affected.
  • Each per-model entry carries sampling_window with coverage = "not_yet_sampled" and coverage_warning = true — honest hedge that row-content changes that don't move row counts won't surface here. Phase 2.5 lifts to checksum-bisection exhaustive sampling (datafold-style) over the rocky_core::compare shadow kernel.
  • render_preview_diff_markdown — deterministic GitHub-flavored Markdown table; the empty-runs path emits a setup hint, not silence.

Phase 3 substance — rocky preview cost:

  • build_preview_cost_delta(branch, base, params) — pure pairing over the union of model names with three cases:
    • Paired (model in both runs) → full bytes/duration/USD delta.
    • Branch-only (new in PR) → delta_usd == branch_cost_usd (base is zero).
    • Base-onlyskipped_via_copy = true, branch_* zero/None, base_cost_usd rolls into summary.savings_from_copy_usd (the cost the PR avoided by copying).
  • Reuses rocky_core::cost::compute_observed_cost_usd — the same formula rocky cost latest uses — so the two surfaces stay consistent.
  • Best-effort cost-param resolution: missing config → durations + bytes ship without USD; delta_usd becomes None rather than wrong.
  • render_preview_cost_markdown — deterministic Markdown with the headline Δ, branch/base totals, copy-savings, and per-model rows.

Branch-vs-base pairing. Scans the most recent 50 RunRecords from the state store; newest with git_branch == branch_name is the branch run, newest with a different git_branch is the base run. Tighter partitioning (state-store branch records carrying a paired base-run pointer) is a follow-up.

Cleanup. Stripped two leftover doc-comment references to a private design-doc path from crates/rocky-cli/src/{output.rs,commands/preview.rs} — those paths aren't accessible to anyone reading the code on GitHub.

What's NOT in this PR (deliberate)

  • Sampled row-content diff (Phase 2.5 lift). Checksum-bisection over rocky_core::compare's shadow kernel; needs warehouse QueryResult plumbing the RunRecord doesn't carry today. The wire field is already populated honestly with coverage_warning = true.
  • Phase 1.5 — copy-step execution (DuckDB CTAS + Phase 5 adapter clones). Independent follow-up.
  • Phase 4 — GitHub Action + auto PR comment. Markdown rendering is now real, so the action becomes a thin orchestrator. Independent follow-up.

Test plan

  • 10 new unit tests on top of existing 14 (24 total in preview::tests):
    • diff_row_delta_signs_correctly — branch > base → +rows; branch < base → −rows.
    • diff_no_change_still_warns_about_coverage — coverage_warning fires even on identical runs.
    • cost_delta_databricks_paired_models — duration-driven delta on paired models.
    • cost_delta_skipped_via_copy_rolls_into_savings — base-only model rolls into savings_from_copy_usd, not total_base_cost_usd.
    • cost_delta_branch_only_model_delta_eq_branch_cost — new-in-PR model: delta == branch cost.
    • cost_delta_no_params_emits_durations_only — missing config → cost None, durations populated.
    • cost_delta_duckdb_always_zero — DuckDB delta is 0.0, not None.
    • diff_markdown_renders_table — Markdown contains headline metrics + per-model rows + Phase 2.5 hedge.
    • diff_markdown_empty_path_explains_setup — no-paired-runs path emits setup hint.
    • cost_markdown_renders_table — Markdown headline + per-model + copy-skip count.
  • cargo test -p rocky-cli --lib — 294 tests pass.
  • cargo clippy -p rocky-cli --all-targets -- -D warnings clean.
  • No schema changes — no codegen-drift risk.

Lifts the rocky preview diff/cost commands from wire-contract stubs to
working substance. Pure RunRecord-pair builders that the Phase 4 GitHub
Action will compose into a single PR comment.

**preview diff (Phase 2 substance):**
- build_preview_diff pairs branch and base RunRecords by model name and
  reports rows_added / rows_removed via signed delta of rows_affected.
- Each per-model entry carries sampling_window with coverage =
  "not_yet_sampled" + coverage_warning = true — honest hedge that row
  content changes that don't move row counts won't surface here.
  Phase 2.5 lifts to checksum-bisection exhaustive sampling
  (datafold-style).
- render_preview_diff_markdown: deterministic GitHub-flavored Markdown.

**preview cost (Phase 3 substance):**
- build_preview_cost_delta pairs branch and base over the union of model
  names. Three cases: paired (full delta), branch-only (delta == branch
  cost — new in PR), base-only (skipped_via_copy = true; base cost rolls
  into savings_from_copy_usd).
- Reuses compute_observed_cost_usd from rocky_core::cost — same formula
  rocky cost latest uses, so the surfaces stay consistent.
- Best-effort cost-param resolution: missing config → durations + bytes
  ship without USD.

Branch-vs-base run pairing: scans the most recent 50 RunRecords from
state store, takes the newest RunRecord with git_branch == branch_name
as branch, newest with a different git_branch as base. Tighter
partitioning lands when state-store branch records carry a paired
base-run pointer.

10 new unit tests covering: signed row-count deltas, copy-savings
roll-up, branch-only model semantics, missing-config path, DuckDB
zero-cost path, Markdown rendering for both surfaces.

Total preview unit tests: 24 (was 14).
Two leftover doc-comment references to a local design-doc path
(`~/Developer/...`) that aren't accessible to anyone reading the
code on GitHub. Replaced the rocky-cli/src/output.rs reference with
a self-contained explanation; removed the rocky-cli/src/commands/
preview.rs reference entirely (the module-level doc already names
the three subcommands).
Re-runs codegen to propagate the path-stripped DiscoverOutput
schemas_cached doc comment from output.rs into the JSON schema +
Pydantic + TypeScript bindings. No behavior change.
@hugocorreia90 hugocorreia90 merged commit cf6fd80 into main Apr 29, 2026
16 checks passed
@hugocorreia90 hugocorreia90 deleted the feat/rocky-preview-substance branch April 29, 2026 11:18
hugocorreia90 added a commit that referenced this pull request Apr 29, 2026
Engine 1.18.0 ships the rocky preview workflow end-to-end (#279, #280,
#281, #282), the [budget].max_bytes_scanned threshold (#288), the
audit-sweep closeout (#283, #285#287, #290#293), and the rocky-server
auth + CORS gate (#291).

Dagster 1.15.0 picks up the regenerated Pydantic models for the rocky
preview surface and ships the P1 cluster (#289) + FR-014 follow-on
(#284).

VS Code 1.10.0 regenerates TypeScript bindings for rocky preview and
RunCostSummary.total_bytes_scanned.

See per-artifact CHANGELOG entries for the full breakdown.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant