feat(engine): --env flag + plan preview of classification/mask/retention actions #251

Merged
hugocorreia90 merged 1 commit into main from feat/env-flag-and-plan-governance-preview
Apr 24, 2026

@hugocorreia90
Contributor

Summary

Closes two deferred follow-ups on the Wave A / C-1 / C-2 governance surfaces shipped in engine-v1.16.0 (#241, #243, #244):

  1. --env <name> on rocky run and rocky plan — the masking resolver RockyConfig::resolve_mask_for_env(Option<&str>) already accepted an env, but both callers hard-coded None. rocky run --env prod now resolves [mask.prod] overrides over the workspace [mask] defaults during the post-DAG reconcile; rocky plan --env prod previews the same resolution. Matches the flag shape rocky compliance --env already uses.
  2. rocky plan preview of classification / mask / retention actions. PlanOutput gains three additive action-row collections that dry-run what the rocky run post-DAG governance reconcile would apply. They sit parallel to the existing statements array: statements are warehouse SQL, while the new *_actions are control-plane operations (apply_column_tags, apply_masking_policy, apply_retention_policy).
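
The override order described in item 1 can be sketched as follows. This is a minimal illustration, not the actual RockyConfig code: the MaskConfig struct, its field names, and the example_config helper are hypothetical, and only the resolve_mask_for_env(Option<&str>) shape comes from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the mask section of rocky.toml.
struct MaskConfig {
    /// Workspace-wide `[mask]` defaults: tag -> strategy wire name.
    defaults: HashMap<String, String>,
    /// Per-env `[mask.<env>]` overrides: env -> (tag -> strategy wire name).
    per_env: HashMap<String, HashMap<String, String>>,
}

impl MaskConfig {
    /// Start from the defaults, then let the env table (if one exists for
    /// the requested env) overwrite matching tags.
    fn resolve_mask_for_env(&self, env: Option<&str>) -> HashMap<String, String> {
        let mut resolved = self.defaults.clone();
        if let Some(overrides) = env.and_then(|name| self.per_env.get(name)) {
            for (tag, strategy) in overrides {
                resolved.insert(tag.clone(), strategy.clone());
            }
        }
        resolved
    }
}

/// Mirrors the `[mask]` / `[mask.prod]` config from the usage example.
fn example_config() -> MaskConfig {
    let defaults = HashMap::from([("pii".to_string(), "hash".to_string())]);
    let prod = HashMap::from([("pii".to_string(), "redact".to_string())]);
    MaskConfig {
        defaults,
        per_env: HashMap::from([("prod".to_string(), prod)]),
    }
}

fn main() {
    let cfg = example_config();
    println!("default: {:?}", cfg.resolve_mask_for_env(None).get("pii"));
    println!("prod:    {:?}", cfg.resolve_mask_for_env(Some("prod")).get("pii"));
}
```

In this sketch an env with no override table simply falls back to the workspace defaults; whether the real resolver treats an unknown env as an error is not stated in this PR.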

New action-row types

Living alongside PlannedStatement in output.rs:

  • ClassificationAction { model, column, tag }: one row per (model, column, tag) triple from a sidecar [classification] block.
  • MaskAction { model, column, tag, resolved_strategy }: one row per (model, column, tag) where the tag resolves to a strategy under the active env. resolved_strategy is the wire name "hash" / "redact" / "partial" / "none". Unresolved tags are intentionally omitted; they are a rocky compliance diagnostic, not a preview row.
  • RetentionAction { model, duration_days, warehouse_preview }: one row per model whose sidecar declares retention = "<N>[dy]". warehouse_preview is the warehouse-native rendering Rocky would issue: Databricks emits the paired delta.logRetentionDuration + delta.deletedFileRetentionDuration TBLPROPERTIES; Snowflake emits DATA_RETENTION_TIME_IN_DAYS; BigQuery / DuckDB / other adapters emit null.
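
A rough sketch of how a RetentionAction's duration_days and warehouse_preview could be derived. The Adapter enum and both function names are hypothetical. The exact SQL text Rocky emits is not shown in this PR, so the strings below are plausible Databricks / Snowflake statements rather than the adapter's byte-exact output, and the y = 365 days conversion is an assumption.

```rust
/// Hypothetical adapter discriminator for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Adapter {
    Databricks,
    Snowflake,
    BigQuery,
    DuckDb,
}

/// Parse a `"<N>[dy]"` retention string into days.
/// Assumption: `y` means 365 days; the source does not state the conversion.
fn parse_retention_days(raw: &str) -> Option<u32> {
    let raw = raw.trim();
    let unit = raw.chars().last()?;
    let digits = &raw[..raw.len() - unit.len_utf8()];
    let n: u32 = digits.parse().ok()?;
    match unit {
        'd' => Some(n),
        'y' => n.checked_mul(365),
        _ => None,
    }
}

/// Render the warehouse-native statement, or `None` where the adapter has no
/// retention primitive (serialized as `null` in the plan output).
fn warehouse_preview(adapter: Adapter, table: &str, days: u32) -> Option<String> {
    match adapter {
        // Assumed rendering of the paired Delta table properties.
        Adapter::Databricks => Some(format!(
            "ALTER TABLE {table} SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval {days} days', 'delta.deletedFileRetentionDuration' = 'interval {days} days')"
        )),
        Adapter::Snowflake => Some(format!(
            "ALTER TABLE {table} SET DATA_RETENTION_TIME_IN_DAYS = {days}"
        )),
        Adapter::BigQuery | Adapter::DuckDb => None,
    }
}

fn main() {
    let days = parse_retention_days("90d").expect("valid retention string");
    for adapter in [Adapter::Databricks, Adapter::Snowflake, Adapter::DuckDb] {
        println!("{adapter:?}: {:?}", warehouse_preview(adapter, "users", days));
    }
}
```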

All three fields — plus PlanOutput.env — use #[serde(default, skip_serializing_if = ...)]. Projects without governance config produce byte-stable JSON relative to the pre-1.16 shape; live-binary fixture plan.json from 00-playground-default is unchanged.
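
To illustrate why the output stays byte-stable, here is a dependency-free sketch of the "omit when empty" rule. It hand-rolls the JSON instead of using serde, so it is a stand-in for the skip_serializing_if attribute rather than the real serializer; the trimmed PlanOutput struct and the to_json helper are hypothetical, and the action rows are reduced to plain strings to keep the example short.

```rust
/// Hypothetical, trimmed-down PlanOutput: only the fields this sketch needs.
struct PlanOutput {
    statements: Vec<String>,
    env: Option<String>,
    /// Real rows are objects; plain strings keep the sketch short.
    mask_actions: Vec<String>,
}

/// Render a list of simple ASCII strings as a JSON array.
/// Rust's Debug quoting is close enough to JSON for this illustration only.
fn json_string_array(items: &[String]) -> String {
    let quoted: Vec<String> = items.iter().map(|s| format!("{s:?}")).collect();
    format!("[{}]", quoted.join(","))
}

/// Emit `env` and `mask_actions` only when they carry data, so projects
/// without governance config serialize exactly as before.
fn to_json(plan: &PlanOutput) -> String {
    let mut fields = vec![format!(
        "\"statements\":{}",
        json_string_array(&plan.statements)
    )];
    if let Some(env) = &plan.env {
        fields.push(format!("\"env\":{env:?}"));
    }
    if !plan.mask_actions.is_empty() {
        fields.push(format!(
            "\"mask_actions\":{}",
            json_string_array(&plan.mask_actions)
        ));
    }
    format!("{{{}}}", fields.join(","))
}

fn main() {
    let bare = PlanOutput {
        statements: vec!["SELECT 1".to_string()],
        env: None,
        mask_actions: Vec::new(),
    };
    println!("{}", to_json(&bare));
}
```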

Role-graph env-scoping decision

Env-invariant — --env does NOT flow into reconcile_role_graph.

Reasoning: rocky.toml has no [role.<env>] override shape (contrast [mask.<env>]). The Wave C-1 config surface is a single top-level [role.*] map by design — roles represent deployment-wide permission groups whose sensitivity doesn't vary per env, while masks do because dev/prod data sensitivity differs. RockyConfig::role_graph(&self) takes no env either. Classification tagging and retention policies are also env-invariant by the same reasoning (no [classification.<env>] / [retention.<env>] shapes) — they preview regardless.

--env usage example

Config:

[mask]
pii = "hash"

[mask.prod]
pii = "redact"

Sidecar models/users.toml:

[classification]
email = "pii"

Default preview resolves pii to hash:

$ rocky plan --output json | jq '.mask_actions'
[{"model": "users", "column": "email", "tag": "pii", "resolved_strategy": "hash"}]

With --env prod, the [mask.prod] override wins:

$ rocky plan --env prod --output json | jq '.mask_actions'
[{"model": "users", "column": "email", "tag": "pii", "resolved_strategy": "redact"}]

rocky run --env prod applies the flipped strategy via apply_masking_policy during the post-DAG reconcile.

Scope notes

  • rocky plan keeps its existing resolve_replication_pipeline(...) gate at the top. Transformation-only projects still error before the governance preview branch runs — matches pre-existing behavior; lifting that gate is out of scope here.
  • The post-DAG governance reconcile at rocky run fires only when a GovernanceAdapter is live (Databricks, Snowflake). On duckdb the governance branch is a no-op regardless of --env.
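
The adapter gate in the second note can be pictured like this. The trait below is a hypothetical one-method stand-in for GovernanceAdapter, and RecordingAdapter and reconcile are illustrative names only; the real trait surface and reconcile signature are not shown in this PR.

```rust
/// Hypothetical one-method stand-in for Rocky's GovernanceAdapter.
trait GovernanceAdapter {
    fn apply_masking_policy(&mut self, model: &str, column: &str, strategy: &str);
}

/// Records calls instead of talking to a warehouse.
#[derive(Default)]
struct RecordingAdapter {
    calls: Vec<String>,
}

impl GovernanceAdapter for RecordingAdapter {
    fn apply_masking_policy(&mut self, model: &str, column: &str, strategy: &str) {
        self.calls.push(format!("{model}.{column} -> {strategy}"));
    }
}

/// Apply every (model, column, strategy) action when an adapter is live.
/// With no adapter (for example on duckdb) this is a no-op and returns 0.
fn reconcile(
    adapter: Option<&mut dyn GovernanceAdapter>,
    actions: &[(&str, &str, &str)],
) -> usize {
    let Some(adapter) = adapter else { return 0 };
    for (model, column, strategy) in actions {
        adapter.apply_masking_policy(model, column, strategy);
    }
    actions.len()
}

fn main() {
    let actions = [("users", "email", "redact")];
    let mut live = RecordingAdapter::default();
    println!("applied with adapter: {}", reconcile(Some(&mut live), &actions));
    println!("applied without adapter: {}", reconcile(None, &actions));
}
```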

Tests added

  • rocky-cli (commands/plan.rs::tests):
    • retention_preview_databricks_matches_adapter_sql — asserts the preview string is byte-identical to rocky_core::catalog::generate_set_delta_retention_sql.
    • retention_preview_snowflake_matches_adapter_sql — asserts the ALTER TABLE ... SET DATA_RETENTION_TIME_IN_DAYS = <N> formula.
    • retention_preview_unsupported_adapters_are_none — DuckDB / BigQuery / unknown adapters return None.
    • preview_populates_all_three_action_arrays — tempdir fixture project; asserts classification + mask + retention rows populate, and that --env prod flips resolved_strategy on mask_actions where [mask.prod] overrides the default.
    • preview_skips_mask_row_when_tag_unresolved — confirms unresolved tags omit mask_actions but still populate classification_actions.
    • mask_strategy_wire_names_match_adapter — guards against MaskStrategy::as_str drift.
  • dagster (tests/test_types.py): new test_parse_plan_with_governance parse-guard against the PLAN_WITH_GOVERNANCE scenario, plus the existing test_parse_plan extended to assert that the three new fields default to [] on the minimal shape.
  • vscode: TypeScript types are 100% generated; npm run compile passes against the regenerated bindings.
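
The behavior pinned by preview_skips_mask_row_when_tag_unresolved can be sketched roughly as below. The struct fields follow the action-row shapes in this PR, but preview_rows and its parameters are hypothetical; the real preview code in commands/plan.rs is not reproduced here.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct ClassificationAction {
    model: String,
    column: String,
    tag: String,
}

#[derive(Debug, PartialEq)]
struct MaskAction {
    model: String,
    column: String,
    tag: String,
    resolved_strategy: String,
}

/// Build both row sets for one model from its `[classification]` entries
/// (column, tag) and an already-resolved mask table (tag -> strategy).
/// Every entry yields a classification row; only tags that resolve to a
/// strategy yield a mask row.
fn preview_rows(
    model: &str,
    classification: &[(&str, &str)],
    resolved_masks: &HashMap<&str, &str>,
) -> (Vec<ClassificationAction>, Vec<MaskAction>) {
    let mut classifications = Vec::new();
    let mut masks = Vec::new();
    for &(column, tag) in classification {
        classifications.push(ClassificationAction {
            model: model.to_string(),
            column: column.to_string(),
            tag: tag.to_string(),
        });
        if let Some(strategy) = resolved_masks.get(tag) {
            masks.push(MaskAction {
                model: model.to_string(),
                column: column.to_string(),
                tag: tag.to_string(),
                resolved_strategy: strategy.to_string(),
            });
        }
    }
    (classifications, masks)
}

fn main() {
    let resolved = HashMap::from([("pii", "hash")]);
    let sidecar = [("email", "pii"), ("notes", "internal")];
    let (classifications, masks) = preview_rows("users", &sidecar, &resolved);
    println!("{} classification rows, {} mask rows", classifications.len(), masks.len());
}
```

Here the unresolved "internal" tag still produces a classification row but no mask row, matching the split between preview rows and the rocky compliance diagnostic.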

Fixtures updated

just regen-fixtures against 00-playground-default produced zero diff: the POC has no [classification] / [mask] / retention config, so skip_serializing_if = "Vec::is_empty" keeps plan.json byte-stable. The populated case is covered by the rocky-cli integration test above.

Test plan

  • cd engine && cargo test — 263 rocky-cli lib tests + workspace tests all pass
  • cd engine && cargo clippy -- -D warnings — clean
  • cd engine && cargo fmt --check — clean
  • cd integrations/dagster && uv run pytest — 427 tests pass
  • cd integrations/dagster && uv run ruff check src/ tests/ — clean
  • cd editors/vscode && npm run compile — clean against regenerated types
  • just codegen leaves a clean git status — no drift
  • just regen-fixtures on the default POC produces zero diff — byte-stable

…ion actions

Closes the `--env <name>` plumbing gap left over from the 1.16.0 governance
waveplan: `RockyConfig::resolve_mask_for_env(Option<&str>)` already accepted
an env, but `rocky run` / `rocky plan` hard-coded `None`. This wires the flag
through on both commands so `[mask.<env>]` overrides resolve over the
workspace `[mask]` defaults, matching the `--env` shape `rocky compliance`
already uses.

`PlanOutput` gains three additive action-row collections — a dry-run view of
the control-plane governance work the post-DAG reconcile pass in `rocky run`
would do:

- `classification_actions`: `(model, column, tag)` triples from
  `[classification]` sidecars.
- `mask_actions`: `(model, column, tag, resolved_strategy)` where the tag
  resolves under the active env; unresolved tags are a `rocky compliance`
  diagnostic, not a preview row.
- `retention_actions`: models with `retention = "<N>[dy]"` sidecar, carrying
  the parsed `duration_days` + a warehouse-native `warehouse_preview`
  (Databricks renders the Delta TBLPROPERTIES pair; Snowflake renders
  `DATA_RETENTION_TIME_IN_DAYS`; other adapters emit `null`).

All three fields use `skip_serializing_if = "Vec::is_empty"` so existing JSON
consumers on projects without governance config are byte-stable. `PlanOutput.env`
carries the active `--env` under the same treatment.

Role-graph reconcile stays env-invariant. `rocky.toml` has no `[role.<env>]`
override shape (contrast `[mask.<env>]`); roles represent deployment-wide
permission groups while masks vary per env. `--env` therefore does NOT flow
into `reconcile_role_graph`. Classification tagging and retention policies
are also env-invariant by the same reasoning.

Regenerated bindings via `just codegen`:
- `schemas/plan.schema.json`
- `integrations/dagster/src/dagster_rocky/types_generated/plan_schema.py`
- `editors/vscode/src/types/generated/plan.ts`

Dagster `PlanResult` hand-written model picks up the four new fields
(`env`, `classification_actions`, `mask_actions`, `retention_actions`) and
re-exports `ClassificationAction` / `MaskAction` / `RetentionAction` from
the package barrel. New `PLAN_WITH_GOVERNANCE` scenario + `plan_with_governance_json`
fixture + `test_parse_plan_with_governance` parse-guard.

Follow-up of the governance waveplan shipped in engine-v1.16.0 (#241, #243, #244).
@hugocorreia90 hugocorreia90 merged commit b131a9b into main Apr 24, 2026
15 checks passed
@hugocorreia90 hugocorreia90 deleted the feat/env-flag-and-plan-governance-preview branch April 24, 2026 15:27
hugocorreia90 added a commit that referenced this pull request Apr 24, 2026
Governance-waveplan polish wave on top of v1.16.0/v1.12.0/v1.8.0.

Engine 1.17.0:
- FR-009 BREAKING: reject empty workspace_ids without opt-in (#250)
- --env flag on rocky run / rocky plan + plan preview of classification /
  mask / retention actions (#251)
- Wiremock coverage for apply_column_tags + apply_masking_policy (#252)
- W004 warning for unresolved classification tags (#253)
- SCIM client + per-catalog GRANT for reconcile_role_graph (#254)
- rocky retention-status --drift warehouse probe (#255)

Dagster 1.13.0 (tracks engine 1.17.0):
- Pluggable per-call kwarg resolvers on RockyResource (#248)
- Auto-surface compliance + retention-status on RockyComponent (#249)
- Pre-flight governance_override validator (#250)
- Regenerated PlanResult with env + action preview fields (#251)

VS Code 1.9.0 (tracks engine 1.17.0):
- Regenerated plan.ts with env + 3 governance-action interfaces (#251)

Also refreshes transitive dependencies across all three artifacts:
- Cargo.lock: 14 transitive bumps (rustls v0.23.39, tokio v1.52.1, uuid v1.23.1,
  webpki-roots v1.0.7, compression-codecs v0.4.38, and 9 others)
- uv.lock: 10 transitive bumps (pydantic v2.13.3, dagster-pipes/shared v1.13.2,
  datamodel-code-generator v0.56.1, ruff v0.15.11, and 5 others)
- package-lock.json: transitive-only via npm update; direct deps unchanged so
  the engines.vscode / @types/vscode / test-electron triangle stays in lockstep