feat(dagster): auto-surface compliance + retention-status on RockyComponent #249

Merged
hugocorreia90 merged 1 commit into main from feat/dagster-component-governance-surface
Apr 24, 2026

hugocorreia90 commented Apr 24, 2026

Summary

Closes the last governance-waveplan observability gap on RockyComponent. After engine-v1.16.0 shipped the Wave B (rocky compliance) and Wave C-2 (rocky retention-status) surfaces and dagster-v1.12.0 shipped the matching Pydantic types, every RockyComponent adopter who wanted compliance/retention events on the asset graph had to hand-roll a wrapper asset that shelled out, parsed JSON, and yielded AssetCheckResult / AssetObservation — duplicating work the component already does for drift + anomalies.

This PR adds two opt-in YAML attributes, surface_compliance and surface_retention_status. Both default to False, so deployments that don't flip them see zero behaviour change. When an attribute is on, the component auto-wires the corresponding surface through the existing bridge pattern.

Adopters flip both on with two lines in defs.yaml:

type: dagster_rocky.RockyComponent
attributes:
  binary_path: rocky
  config_path: rocky/rocky.toml
  surface_compliance: true
  surface_retention_status: true

What got added

RockyResource methods (mirror state_health() shape):

  • compliance(env=None) -> ComplianceOutput
  • retention_status(env=None) -> RetentionStatusOutput
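As a rough illustration of the "argv shape with/without --env" behaviour these methods are tested for, here is a minimal sketch. `build_rocky_argv` is a hypothetical helper, not the resource's actual code, and it assumes the subcommands are invoked as `rocky compliance` / `rocky retention-status` with no other flags (any config or output-format flags the real resource passes are omitted).

```python
from __future__ import annotations


def build_rocky_argv(binary_path: str, subcommand: str, env: str | None = None) -> list[str]:
    """Assemble the argv for a governance subcommand (hypothetical helper)."""
    argv = [binary_path, subcommand]
    if env is not None:
        # Only append --env when the caller supplied one.
        argv += ["--env", env]
    return argv


# Without env: just the binary and the subcommand.
compliance_argv = build_rocky_argv("rocky", "compliance")
# With env: the flag and its value are appended.
retention_argv = build_rocky_argv("rocky", "retention-status", env="prod")
```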

observability.py helpers + constants (mirror drift_observations() / anomaly_check_results()):

  • compliance_check_results(output, *, key_resolver) -> Iterator[AssetCheckResult]
  • retention_observations(output, *, key_resolver) -> Iterator[AssetObservation]
  • COMPLIANCE_CHECK_NAME = "compliance_exception"
  • COMPLIANCE_FALLBACK_ASSET_KEY = AssetKey(["_compliance"])
  • RETENTION_OBSERVATION_NAME = "retention_drift"
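The per-asset aggregation and sentinel fallback of the compliance helper can be sketched with plain dataclasses. This is a stdlib-only approximation under stated assumptions: `ExceptionRow` and `CheckResult` are stand-ins for `ComplianceException` and `AssetCheckResult`, asset keys are modelled as tuples, the `exceptions` metadata key and the sample rows are made up, and the real helper takes a `ComplianceOutput` rather than a bare list.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator

COMPLIANCE_CHECK_NAME = "compliance_exception"
FALLBACK_KEY = ("_compliance",)  # stand-in for AssetKey(["_compliance"])


@dataclass(frozen=True)
class ExceptionRow:  # stand-in for ComplianceException
    model: str
    column: str
    env: str
    reason: str


@dataclass
class CheckResult:  # stand-in for dagster.AssetCheckResult
    asset_key: tuple[str, ...]
    check_name: str
    passed: bool
    severity: str
    metadata: dict = field(default_factory=dict)


def compliance_check_results(
    rows: Iterable[ExceptionRow],
    *,
    key_resolver: Callable[[str], tuple[str, ...] | None],
) -> Iterator[CheckResult]:
    """Fold exceptions into one WARN result per resolved asset key."""
    grouped: dict[tuple[str, ...], list[ExceptionRow]] = defaultdict(list)
    for row in rows:
        key = key_resolver(row.model)
        if key is None:
            key = FALLBACK_KEY  # resolver miss: park the row on the sentinel key
        grouped[key].append(row)
    for key, group in grouped.items():
        yield CheckResult(
            asset_key=key,
            check_name=COMPLIANCE_CHECK_NAME,
            passed=False,
            severity="WARN",
            # "exceptions" is a hypothetical metadata key for this sketch.
            metadata={"exceptions": [(r.model, r.column, r.env) for r in group]},
        )


rows = [
    ExceptionRow("orders", "email", "dev", "example reason"),
    ExceptionRow("orders", "email", "prod", "example reason"),
    ExceptionRow("ghost", "ssn", "prod", "example reason"),
]
known = {"orders": ("warehouse", "orders")}
results = list(compliance_check_results(rows, key_resolver=known.get))
```

Here the two `orders` rows collapse into a single result, while the unresolved `ghost` row lands on the fallback key.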

RockyComponent wiring:

  • Two Pydantic fields: surface_compliance: bool = False, surface_retention_status: bool = False.
  • Pre-declared compliance_exception check spec per asset when opt-in is on (matches the anomaly pattern).
  • _emit_governance_events helper invoked in both execution modes (streaming + pipes).
  • Graceful error swallowing on binary failure (warning log, materialization continues).
  • Invariant guard that drops sentinel-keyed compliance results with a warning rather than crashing Dagster when exceptions target a model outside the component's selection.
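The failure tolerance and the invariant guard described in the last two bullets might look roughly like the sketch below. It is not the component's `_emit_governance_events`: `emit_governance_events`, its `fetch` callable, the tuple-shaped asset keys, and the broad `except Exception` are all simplifying assumptions made for illustration.

```python
from __future__ import annotations

import logging
from typing import Callable, Iterable, Iterator

log = logging.getLogger("rocky_sketch")
SENTINEL_KEY = ("_compliance",)  # stand-in for COMPLIANCE_FALLBACK_ASSET_KEY


def emit_governance_events(
    fetch: Callable[[], Iterable[tuple]],
    declared_keys: set,
) -> Iterator[tuple]:
    """Yield (asset_key, payload) pairs; never raise on a binary failure."""
    try:
        events = list(fetch())
    except Exception as exc:  # the real component may catch something narrower
        log.warning("governance surface failed, continuing: %s", exc)
        return
    for asset_key, payload in events:
        # Drop results whose key was never declared, including the sentinel,
        # so the orchestrator is not handed a check it has no spec for.
        if asset_key == SENTINEL_KEY or asset_key not in declared_keys:
            log.warning("dropping compliance result for undeclared asset %s", asset_key)
            continue
        yield asset_key, payload


def failing_fetch():
    raise RuntimeError("simulated binary failure")


declared = {("warehouse", "orders")}
ok_events = list(
    emit_governance_events(
        lambda: [(("warehouse", "orders"), "warn"), (SENTINEL_KEY, "warn")],
        declared,
    )
)
failed_events = list(emit_governance_events(failing_fetch, declared))
```

In both cases the caller keeps iterating: a failed fetch yields nothing, and a sentinel-keyed result is logged and skipped rather than emitted.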

Public API re-exports via dagster_rocky/__init__.py for compliance_check_results, retention_observations, the three new constants, ComplianceOutput, RetentionStatusOutput, ComplianceException, ComplianceSummary, ModelRetentionStatus.

Tests (24 new, 417 total after rebase on #248, all green):

  • Parse-guard tests for both output types.
  • Helper-level tests: aggregation per asset, sentinel fallback, empty case, resolver-miss path, forward-compat for warehouse_days.
  • Resource tests: argv shape with/without --env.
  • Component integration tests: pre-declared specs, opt-in gating, end-to-end events alongside drift+anomaly, failure tolerance, unknown-model filtering.

Test plan

  • cd integrations/dagster && uv run pytest — 417 passed
  • uv run ruff check src/ tests/ — clean
  • uv run ruff format --check src/ tests/ — clean
  • Verify in a downstream consumer that flipping surface_compliance: true on their RockyComponent YAML surfaces compliance exceptions on the asset graph without any wrapper code

Deviations from the original design sketch

Several field names and call shapes in the original design sketch diverged from what the shipped schemas actually expose. Deliberate deviations:

  • ComplianceException has no severity field — real fields are column / env / model / reason. The sketch's "passed=True when severity=info/acknowledged, passed=False when severity=warn/error" is unimplementable. Every exception yields passed=False, severity=WARN.
  • ModelRetentionStatus fields differ from the sketch — real fields are configured_days / in_sync / model / warehouse_days. Metadata keys are named accordingly: rocky/retention_configured_days, rocky/retention_warehouse_days, rocky/retention_in_sync. warehouse_days is always None in v1 (schema doc) so the metadata field is omitted conditionally.
  • Helpers take key_resolver: Callable[[str], AssetKey | None], not translator: RockyDagsterTranslator — matches the existing drift_observations() / anomaly_check_results() contract. Simpler signature, and translator.get_asset_key(source, table) doesn't fit a compliance exception (which has model + column, no source/TableInfo).
  • Compliance aggregates per asset, not per exception — Dagster rejects duplicate (asset_key, check_name) pairs per materialization. Multiple exceptions for the same model (typically one per env) fold into one aggregated WARN result; metadata lists every (model, column, env) triple. This is the only deviation from a "one event per input row" helper shape; retention still yields one observation per row.
  • No live-binary fixtures. The integration's fixture convention has moved past JSON files: a comment in conftest.py spells out that the legacy tests/fixtures/*.json files were removed, and scenarios now live as Python dicts in scenarios.py and flow through conftest.py via json.dumps. The playground POC also has no governance config, so regen-fixtures couldn't produce a live capture without also changing the POC. Added COMPLIANCE + RETENTION_STATUS dicts to scenarios.py and compliance_json / retention_status_json fixtures to conftest.py; the parse-guard still fires via the new tests.
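To make the retention metadata naming and the conditional `warehouse_days` omission concrete, here is a small stdlib-only sketch. `RetentionRow` is a stand-in for `ModelRetentionStatus` with assumed field types, observations are modelled as `(key, metadata)` tuples instead of `AssetObservation`, and skipping models the resolver cannot map is an assumption rather than confirmed behaviour.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class RetentionRow:  # stand-in for ModelRetentionStatus; field types assumed
    model: str
    configured_days: int
    in_sync: bool
    warehouse_days: Optional[int] = None


def retention_metadata(row: RetentionRow) -> dict:
    """Build the metadata dict, leaving out warehouse_days when it is None."""
    metadata = {
        "rocky/retention_configured_days": row.configured_days,
        "rocky/retention_in_sync": row.in_sync,
    }
    if row.warehouse_days is not None:
        metadata["rocky/retention_warehouse_days"] = row.warehouse_days
    return metadata


def retention_observations(
    rows: Iterable[RetentionRow],
    *,
    key_resolver: Callable[[str], Optional[tuple]],
) -> Iterator[tuple]:
    """Yield one (asset_key, metadata) pair per resolvable model row."""
    for row in rows:
        key = key_resolver(row.model)
        if key is None:
            continue  # assumption: unresolved models are skipped
        yield key, retention_metadata(row)


keys = {"orders": ("warehouse", "orders"), "events": ("warehouse", "events")}
sample_rows = [
    RetentionRow("orders", configured_days=30, in_sync=True),
    # A populated warehouse_days exercises the forward-compat path.
    RetentionRow("events", configured_days=90, in_sync=False, warehouse_days=30),
    RetentionRow("unknown", configured_days=7, in_sync=True),
]
observations = list(retention_observations(sample_rows, key_resolver=keys.get))
```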

…ponent

Adds two opt-in YAML attributes (`surface_compliance`, `surface_retention_status`,
both default `False`) that auto-emit governance-waveplan observability on the
Dagster asset graph — closing the glue-code gap every `RockyComponent` adopter
hits after dagster-v1.12.0 shipped the types but not the wiring.

- `RockyResource.compliance()` / `retention_status()` — first-class methods
  mirroring `state_health()`. Both accept an optional `env` kwarg.
- `compliance_check_results()` + `retention_observations()` helpers in
  `observability.py` — pure-function builders matching the
  `drift_observations()` / `anomaly_check_results()` shape. Compliance
  aggregates per asset (Dagster rejects duplicate `(key, check_name)` pairs
  per materialization); retention yields one observation per model row.
- `RockyComponent` pre-declares `compliance_exception` check specs when the
  opt-in is on, invokes the new resource methods once per materialization
  batch, and folds results through the helpers. Binary failures are logged
  and swallowed (same tolerance as drift/anomaly). Sentinel-keyed compliance
  results (for models outside the component's selection) are filtered with a
  warning to preserve Dagster's declared-spec invariant.
- New scenarios + tests cover parse-guard, per-asset aggregation, sentinel
  fallback, undeclared-model filtering, failure tolerance, opt-in gating.

Full test suite: 393 passed (was 369, +24 new).
hugocorreia90 force-pushed the feat/dagster-component-governance-surface branch from a88be13 to a5e7c1b April 24, 2026 11:36
hugocorreia90 merged commit 41c2a58 into main Apr 24, 2026
8 checks passed
hugocorreia90 deleted the feat/dagster-component-governance-surface branch April 24, 2026 11:42
hugocorreia90 added a commit that referenced this pull request Apr 24, 2026
Governance-waveplan polish wave on top of v1.16.0/v1.12.0/v1.8.0.

Engine 1.17.0:
- FR-009 BREAKING: reject empty workspace_ids without opt-in (#250)
- --env flag on rocky run / rocky plan + plan preview of classification /
  mask / retention actions (#251)
- Wiremock coverage for apply_column_tags + apply_masking_policy (#252)
- W004 warning for unresolved classification tags (#253)
- SCIM client + per-catalog GRANT for reconcile_role_graph (#254)
- rocky retention-status --drift warehouse probe (#255)

Dagster 1.13.0 (tracks engine 1.17.0):
- Pluggable per-call kwarg resolvers on RockyResource (#248)
- Auto-surface compliance + retention-status on RockyComponent (#249)
- Pre-flight governance_override validator (#250)
- Regenerated PlanResult with env + action preview fields (#251)

VS Code 1.9.0 (tracks engine 1.17.0):
- Regenerated plan.ts with env + 3 governance-action interfaces (#251)
hugocorreia90 added a commit that referenced this pull request Apr 24, 2026
Governance-waveplan polish wave on top of v1.16.0/v1.12.0/v1.8.0.

Engine 1.17.0:
- FR-009 BREAKING: reject empty workspace_ids without opt-in (#250)
- --env flag on rocky run / rocky plan + plan preview of classification /
  mask / retention actions (#251)
- Wiremock coverage for apply_column_tags + apply_masking_policy (#252)
- W004 warning for unresolved classification tags (#253)
- SCIM client + per-catalog GRANT for reconcile_role_graph (#254)
- rocky retention-status --drift warehouse probe (#255)

Dagster 1.13.0 (tracks engine 1.17.0):
- Pluggable per-call kwarg resolvers on RockyResource (#248)
- Auto-surface compliance + retention-status on RockyComponent (#249)
- Pre-flight governance_override validator (#250)
- Regenerated PlanResult with env + action preview fields (#251)

VS Code 1.9.0 (tracks engine 1.17.0):
- Regenerated plan.ts with env + 3 governance-action interfaces (#251)

Also refreshes transitive dependencies across all three artifacts:
- Cargo.lock: 14 transitive bumps (rustls v0.23.39, tokio v1.52.1, uuid v1.23.1,
  webpki-roots v1.0.7, compression-codecs v0.4.38, and 9 others)
- uv.lock: 10 transitive bumps (pydantic v2.13.3, dagster-pipes/shared v1.13.2,
  datamodel-code-generator v0.56.1, ruff v0.15.11, and 5 others)
- package-lock.json: transitive-only via npm update; direct deps unchanged so
  the engines.vscode / @types/vscode / test-electron triangle stays in lockstep