feat(engine): rocky compliance governance rollup (Wave B)#242

Merged
hugocorreia90 merged 3 commits into main from feat/governance-compliance on Apr 23, 2026

@hugocorreia90
Contributor

Summary

Wave B of the governance waveplan — a new top-level rocky compliance verb that reports on column classification + masking coverage across a project. Thin rollup over what Wave A (#240/#241) shipped.

Answers the question: Are all classified columns masked wherever policy says they should be?

CLI surface

rocky compliance                       # full report (text or --output json)
rocky compliance --env prod            # scope to one env
rocky compliance --exceptions-only     # only emit the unmasked-where-expected list
rocky compliance --fail-on exception   # CI gate: exit 1 if any exception
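
As a rough illustration of how these flags compose in a CI script, the following Python sketch builds the argument list for one invocation. Only the flag names come from this PR; the helper name `compliance_argv`, the `rocky` binary being on `PATH`, and the position of `--output json` after the subcommand are assumptions.

```python
from typing import List, Optional


def compliance_argv(
    env: Optional[str] = None,
    exceptions_only: bool = False,
    fail_on_exception: bool = False,
    json_output: bool = False,
) -> List[str]:
    """Build an argv list for `rocky compliance` (hypothetical helper)."""
    argv = ["rocky", "compliance"]
    if env is not None:
        # Scope the report to a single environment.
        argv += ["--env", env]
    if exceptions_only:
        argv.append("--exceptions-only")
    if fail_on_exception:
        # CI gate: the CLI exits 1 if any exception is reported.
        argv += ["--fail-on", "exception"]
    if json_output:
        # Assumed placement; the flag may be accepted elsewhere on the line.
        argv += ["--output", "json"]
    return argv
```

In a CI job the list could be passed to `subprocess.run(...)` and the resulting `returncode` propagated, so that a non-zero exit from `--fail-on exception` fails the build.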

Output (typed + codegen-backed)

ComplianceOutput with summary / per_column / exceptions. EnvMaskingStatus reports strategy + enforced per (model, column, env). Derives JsonSchema; Pydantic + TypeScript bindings regenerated and re-exported.
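
To make the report shape more concrete, here is a minimal Python sketch that reads a JSON report and lists the (model, column, env) triples that are not enforced. The top-level keys `summary` / `per_column` / `exceptions` and the fields `enforced` / `masking_strategy` are taken from this PR; the nesting (`model`, `column`, `envs`, `env`) and the sample values are assumptions, and the sample is hand-written rather than captured from the CLI.

```python
import json
from typing import Any, Dict, List, Tuple

# Hand-written sample, NOT real CLI output. The nesting and the "hash"
# strategy name are assumed for illustration only.
SAMPLE = """
{
  "summary": {"total_masked": 1, "total_exceptions": 1},
  "per_column": [
    {"model": "customers", "column": "email", "envs": [
      {"env": "default", "masking_strategy": "hash", "enforced": true},
      {"env": "prod", "masking_strategy": "unresolved", "enforced": false}
    ]}
  ],
  "exceptions": [{"model": "customers", "column": "email", "env": "prod"}]
}
"""


def unenforced_triples(report: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return every (model, column, env) triple whose masking is not enforced."""
    return [
        (row["model"], row["column"], status["env"])
        for row in report["per_column"]
        for status in row["envs"]
        if not status["enforced"]
    ]


report = json.loads(SAMPLE)
triples = unenforced_triples(report)
```

Consumers that need type safety would use the generated Pydantic or TypeScript bindings instead of raw dictionaries.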

Resolver semantics (design decisions from the PR body)

  • total_masked is counted over (model, column, env) triples — same granularity as total_exceptions.
  • MaskStrategy::None counts as masked (explicit-identity is a policy decision, not a gap).
  • [classifications.allow_unmasked] suppresses ComplianceException emission; per_column rows still report the truth (enforced: false, masking_strategy: "unresolved").
  • Env enumeration: without --env, iterates None (labeled "default") ∪ every [mask.<env>] override key. With --env X, runs a single Some(X) pass, falling back to the defaults if [mask.X] is absent.
  • --exceptions-only filters per_column to columns that triggered an exception; the exceptions list itself is unfiltered.
  • --fail-on exception exits 1 if !exceptions.is_empty(), else 0. A default run never fails on report content.
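
The rules above can be sketched in Python as follows. This is an illustration of the stated semantics, not the Rust implementation in `compliance.rs`: the data shapes (classification tag to strategy name tables, an allow-list keyed by tag, per-tag fallback from an env override to the defaults) and all function names except `build_report` are assumptions.

```python
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple

MaskTable = Dict[str, str]  # classification tag -> strategy name (assumed shape)


def envs_to_check(overrides: Dict[str, MaskTable], env: Optional[str]) -> List[Optional[str]]:
    """Without --env: None (the defaults) plus every override key. With --env X: just X."""
    if env is not None:
        return [env]
    return [None, *sorted(overrides)]


def resolve_strategy(
    tag: str, env: Optional[str], defaults: MaskTable, overrides: Dict[str, MaskTable]
) -> Optional[str]:
    """Env override wins; otherwise fall back to the defaults. None means unresolved."""
    if env is not None and tag in overrides.get(env, {}):
        return overrides[env][tag]
    return defaults.get(tag)


def build_report(
    classified: Iterable[Tuple[str, str, str]],  # (model, column, tag)
    defaults: MaskTable,
    overrides: Dict[str, MaskTable],
    allow_unmasked: Set[str],
    env: Optional[str] = None,
    exceptions_only: bool = False,
) -> Dict[str, Any]:
    per_column: List[Dict[str, Any]] = []
    exceptions: List[Dict[str, str]] = []
    total_masked = 0
    for model, column, tag in classified:
        statuses = []
        for e in envs_to_check(overrides, env):
            label = "default" if e is None else e
            strategy = resolve_strategy(tag, e, defaults, overrides)
            # Any resolved strategy counts as masked, including an explicit "none".
            enforced = strategy is not None
            if enforced:
                total_masked += 1  # counted per (model, column, env) triple
            elif tag not in allow_unmasked:
                # The allow-list suppresses the exception, not the per_column row.
                exceptions.append({"model": model, "column": column, "env": label})
            statuses.append({
                "env": label,
                "masking_strategy": strategy if enforced else "unresolved",
                "enforced": enforced,
            })
        per_column.append({"model": model, "column": column, "envs": statuses})
    if exceptions_only:
        # Filter per_column only; the exceptions list stays complete.
        flagged = {(x["model"], x["column"]) for x in exceptions}
        per_column = [r for r in per_column if (r["model"], r["column"]) in flagged]
    return {
        "summary": {"total_masked": total_masked, "total_exceptions": len(exceptions)},
        "per_column": per_column,
        "exceptions": exceptions,
    }


def exit_code(report: Dict[str, Any], fail_on_exception: bool) -> int:
    """1 only when the gate is enabled and at least one exception exists."""
    return 1 if fail_on_exception and report["exceptions"] else 0
```

Keeping the resolver a pure function of configuration, with the exit code derived separately from the finished report, mirrors the PR's note that a default run never fails on report content.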

Files touched

  • engine/crates/rocky-cli/src/commands/compliance.rs (new, ~491 LOC incl. 10 unit tests)
  • engine/crates/rocky-cli/src/output.rs — 5 new output types
  • engine/crates/rocky-cli/src/commands/{mod.rs,export_schemas.rs}
  • engine/rocky/src/main.rs — clap Compliance variant + ComplianceFailOn enum
  • schemas/compliance.schema.json (codegen)
  • integrations/dagster/src/dagster_rocky/types_generated/compliance_schema.py (codegen)
  • integrations/dagster/src/dagster_rocky/types_generated/__init__.py + types.py — re-exports + parse_rocky_output route
  • editors/vscode/src/types/generated/{compliance.ts,index.ts} (codegen)

Test plan

  • cargo test --workspace — 2481 passed, 0 failed
  • cargo clippy --workspace --all-targets -- -D warnings — green
  • cargo fmt --check — green
  • uv run pytest tests/test_types.py (dagster) — 23 passed
  • npm run test:unit (vscode) — 20 passed
  • just codegen idempotent (no drift on re-run)
  • Manual CLI smoke tests: all four flag combinations, both output modes, --fail-on exception exit-1 semantics

Adds a top-level CLI verb that answers "are all classified columns
masked wherever policy says they should be?" Pure resolver over
`rocky.toml` + model sidecars — no warehouse I/O.

- New `commands/compliance.rs` with `run_compliance` + `build_report`
  pure resolver + 10 unit tests covering the all-masked / some-exceptions
  / all-allow-listed / --env scoping / --exceptions-only branches.
- New `ComplianceOutput` + supporting structs in `output.rs`. Derives
  `Serialize + Deserialize + JsonSchema + Debug + Clone` per Wave B
  spec (the other *Output structs only derive Serialize + JsonSchema;
  spec is authoritative here).
- `ComplianceFailOn` clap enum for `--fail-on exception` CI gate.
- Registered `ComplianceOutput` in `export_schemas.rs` so `just codegen`
  emits `schemas/compliance.schema.json`.

CLI surface:
  rocky compliance
  rocky compliance --env prod
  rocky compliance --exceptions-only
  rocky compliance --fail-on exception      # exits 1 on any exception

Output of `just codegen` for the new `rocky compliance` JSON schema:
- `schemas/compliance.schema.json` (the source of truth for this command)
- `integrations/dagster/.../compliance_schema.py` (Pydantic v2 model)
- `editors/vscode/src/types/generated/compliance.ts` (TypeScript interface)

Also updates the two curated barrel files that `just codegen` restores
via `git checkout HEAD --` so the new types stay re-exported after
future codegen runs:
- `integrations/dagster/.../types_generated/__init__.py`
- `editors/vscode/src/types/generated/index.ts`

Wire `rocky compliance` JSON output into the dagster-rocky package:
- Re-export `ComplianceOutput` + its five supporting types from
  `types.py` (the round-9 soft-swap re-export pattern).
- Extend `RockyOutput` union with `ComplianceOutput`.
- Route `command == "compliance"` to `ComplianceOutput` in
  `parse_rocky_output()`.

A `RockyResource.compliance(...)` helper can follow once a consumer
asks for it — the parser side is enough to unblock ad-hoc
`parse_rocky_output(...)` callers today.
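
The routing described above can be pictured with a small Python sketch that dispatches on the `command` field. It is only a stand-in: the dataclass below replaces the generated Pydantic model, and `parse_output_sketch` and `_ROUTES` are hypothetical names, so the real `parse_rocky_output()` may be structured differently.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ComplianceOutput:
    """Stand-in for the generated model; only the three top-level keys named in the PR."""
    summary: Dict[str, Any]
    per_column: List[Any]
    exceptions: List[Any]


# command name -> model class (hypothetical registry, one entry for brevity).
_ROUTES: Dict[str, Callable[..., Any]] = {"compliance": ComplianceOutput}


def parse_output_sketch(payload: Dict[str, Any]) -> Any:
    """Pick a model from the payload's `command` field and construct it."""
    command = payload.get("command")
    try:
        model = _ROUTES[command]
    except KeyError:
        raise ValueError(f"unsupported command: {command!r}") from None
    # Drop keys the stand-in does not declare (e.g. `command`, `version`).
    fields = {k: v for k, v in payload.items() if k in model.__dataclass_fields__}
    return model(**fields)
```

A caller holding the JSON from `rocky compliance --output json` would decode it with `json.loads` and hand the dictionary to a parser of this shape.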

hugocorreia90 merged commit a01a4ad into main on Apr 23, 2026
15 checks passed
hugocorreia90 deleted the feat/governance-compliance branch on April 23, 2026 at 22:47
hugocorreia90 added a commit that referenced this pull request on Apr 23, 2026:

* chore: release engine-v1.16.0 + dagster-v1.12.0 + vscode-v1.8.0

Bundles the governance waveplan — five merged PRs (#240 audit trail,
#241 classification + masking, #242 rocky compliance, #243 role-graph,
#244 retention) on top of three FR-004 / state-path follow-ups
(#237 error-path idempotency, #238 state-path unification,
#239 success-path idempotency finalize).

Version bumps: engine 1.15.0 → 1.16.0, dagster-rocky 1.11.0 → 1.12.0,
vscode extension 1.7.0 → 1.8.0.

CHANGELOGs updated for all three artifacts.

* chore(dagster): regen test fixtures for 1.16.0

Fixture drift flagged by CI (`codegen-drift.yml`). Fixtures are captured
from the live engine binary — the version-string bump to 1.16.0 ripples
through every `version` field, and the Wave A audit-trail work (#240)
adds the 8 `RunRecord` fields to `rocky history` output, which the
playground POC now emits.

Regenerated via `just regen-fixtures` against
`examples/playground/pocs/00-foundations/00-playground-default`.

* chore(scripts): sentinel top-level version field in fixture normaliser

Every CLI output's top-level `version` is `env!("CARGO_PKG_VERSION")`
at emit time, so every engine version bump rippled through all 38
captured fixtures — every release PR fought `codegen-drift.yml` until
`just regen-fixtures` was re-run.

Extend the existing `AUDIT_FIELD_SENTINELS` set (Wave A already
sentineled the audit-trail `rocky_version` field + hostname / git
commit / etc.) with the top-level `version` key → `"0.0.0-SENTINEL"`.
After this, version bumps only touch Cargo.toml / pyproject.toml /
package.json / CHANGELOGs — never fixtures.

Regen captured all 38 fixtures; top-level `version` now uniformly
renders as `"0.0.0-SENTINEL"`.
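
The top-level sentinel step can be sketched in Python as below. The sentinel value comes from the commit message; the function name and the choice to return a copy are assumptions, and the real normaliser also handles the `AUDIT_FIELD_SENTINELS` fields, which this sketch leaves out.

```python
import copy
from typing import Any, Dict

VERSION_SENTINEL = "0.0.0-SENTINEL"


def normalise_top_level_version(fixture: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with only the top-level `version` replaced by the sentinel."""
    normalised = copy.deepcopy(fixture)
    if "version" in normalised:
        normalised["version"] = VERSION_SENTINEL
    # Nested `version` keys are deliberately left untouched.
    return normalised
```

With this in place, two fixtures captured from different engine versions compare equal on the `version` field, so a release bump alone no longer shows up as drift.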