feat(engine): retention-status --drift warehouse probe (Databricks + Snowflake) #255

Merged
hugocorreia90 merged 1 commit into main from feat/retention-status-drift-probe on Apr 24, 2026
@hugocorreia90
Contributor

Summary

Completes the rocky retention-status surface shipped in 1.16.0 (#244) where warehouse_days on ModelRetentionStatus was always None and --drift printed a "deferred to v2" stderr note without touching the warehouse.

  • Before: rocky retention-status --drift was a no-op stub. warehouse_days always None. in_sync collapsed to None == None == true regardless of what the warehouse held. A stderr note warned "probe is deferred to v2."
  • After: --drift actually probes the warehouse and populates warehouse_days on Databricks + Snowflake. in_sync compares the probed value against the sidecar declaration.

SQL

| Adapter | Probe SQL |
| --- | --- |
| Databricks | `SHOW TBLPROPERTIES <cat>.<schema>.<table> ('delta.logRetentionDuration', 'delta.deletedFileRetentionDuration')` |
| Snowflake | `SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE <cat>.<schema>.<table>` |

Both probes read exactly the property Wave C-2 writes via apply_retention_policy, so writes round-trip cleanly. The Databricks parser is tolerant of the three value-string forms seen across Delta runtime versions ("interval 90 days" / "90 days" / "90.000000000 days"); the Snowflake parser accepts both string and numeric JSON value forms.
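As an illustration of the tolerant Databricks parsing described above, here is a minimal sketch that accepts the three listed value-string forms. The function name `parse_retention_days` is hypothetical and not taken from the PR, and rejecting non-zero fractions and non-day units is an assumption of this sketch rather than confirmed behavior.

```rust
/// Hypothetical helper (not the PR's actual parser): extract a whole number
/// of days from a Delta retention property value.
fn parse_retention_days(raw: &str) -> Option<u32> {
    let trimmed = raw.trim().to_ascii_lowercase();
    // Drop the optional leading "interval" keyword.
    let rest = trimmed
        .strip_prefix("interval")
        .unwrap_or(trimmed.as_str())
        .trim();

    // Expect exactly two tokens: a number and a day unit.
    let mut parts = rest.split_whitespace();
    let number = parts.next()?;
    let unit = parts.next()?;
    if parts.next().is_some() || !(unit == "day" || unit == "days") {
        return None;
    }

    // Accept "90" and "90.000000000"; assumption: any non-zero fraction is rejected.
    let (whole, frac) = match number.split_once('.') {
        Some((w, f)) => (w, f),
        None => (number, ""),
    };
    if !frac.chars().all(|c| c == '0') {
        return None;
    }
    whole.parse::<u32>().ok()
}
```

Under these assumptions, `"interval 90 days"`, `"90 days"`, and `"90.000000000 days"` all normalize to `Some(90)`, which is what lets a value written as an interval compare equal to the sidecar's integer day count.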

Trait change

// Additive; existing adapters compile without change.
async fn read_retention_days(&self, _table: &TableRef) -> AdapterResult<Option<u32>> {
    Ok(None)
}

DuckDB + BigQuery keep the default Ok(None), so --drift degrades to "no warehouse observation" rather than erroring on those targets.
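To show why a defaulted trait method is additive, the following is a simplified, synchronous sketch. The real method is `async` and uses the crate's own `AdapterResult` and `TableRef`; the stand-in types, `NoProbeAdapter`, and `FixedAdapter` below are hypothetical and exist only for illustration.

```rust
// Stand-in types; assumptions for this sketch, not the crate's definitions.
#[allow(dead_code)]
#[derive(Debug)]
struct TableRef {
    catalog: String,
    schema: String,
    table: String,
}

type AdapterResult<T> = Result<T, String>;

trait GovernanceAdapter {
    // Default body: adapters that do not override it report "no observation".
    fn read_retention_days(&self, _table: &TableRef) -> AdapterResult<Option<u32>> {
        Ok(None)
    }
}

// Compiles without implementing the new method, like DuckDB or BigQuery.
struct NoProbeAdapter;
impl GovernanceAdapter for NoProbeAdapter {}

// Overrides the default, like an adapter that can probe its warehouse.
struct FixedAdapter(u32);
impl GovernanceAdapter for FixedAdapter {
    fn read_retention_days(&self, _table: &TableRef) -> AdapterResult<Option<u32>> {
        Ok(Some(self.0))
    }
}
```

An adapter written before the method existed keeps compiling and returns `Ok(None)`, while an adapter that opts in returns the probed value.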

in_sync computation

Literal comparison: configured_days == warehouse_days. The four None/Some quadrants expand to five rows because Some/Some splits on equality:

| Configured | Warehouse | in_sync |
| --- | --- | --- |
| None | None | true |
| Some(N) | Some(N) | true |
| Some(A) | Some(B), A≠B | false |
| Some(N) | None (DuckDB/BigQuery, or probe failed) | false |
| None | Some(N) | false |
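The comparison above reduces to plain `Option` equality. A minimal sketch follows; `compute_in_sync` is the name used in the test-coverage notes, but the signature shown here is an assumption.

```rust
/// Sketch of the literal comparison; the parameter names and types are assumed.
fn compute_in_sync(configured_days: Option<u32>, warehouse_days: Option<u32>) -> bool {
    // Option<u32> equality: None == None is true, Some(a) == Some(b) iff a == b,
    // and any None/Some mix is false.
    configured_days == warehouse_days
}
```

One consequence of the literal comparison is that a declared retention with no warehouse observation (`Some(N)` vs `None`) reports out of sync, including on targets that cannot be probed.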

Registry change

AdapterRegistry::governance_adapter now resolves Snowflake targets (previously it dropped to Noop). SnowflakeGovernanceAdapter<'a> was converted to own an Arc<SnowflakeConnector> so the registry can return it as Box<dyn GovernanceAdapter>; existing ::new(&connector) call sites in tests were migrated to a new ::from_ref shim.

CLI wiring

run_retention_status becomes async and takes --config so it can build the adapter registry. Per-model probe uses [adapter] override when set, falling back to the first pipeline's target adapter. Probe errors surface on stderr per-model but never abort the command.
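The "surface on stderr, never abort" behavior can be sketched as a loop that downgrades a failed probe to "no observation". Everything here is hypothetical: `ModelStatus`, `collect_statuses`, and the closure-based probe are illustrative stand-ins (synchronous for brevity), not the CLI's actual types.

```rust
/// Hypothetical per-model status row for this sketch.
#[derive(Debug, PartialEq)]
struct ModelStatus {
    model: String,
    configured_days: Option<u32>,
    warehouse_days: Option<u32>,
    in_sync: bool,
}

/// Probe every model; a failing probe is reported on stderr and treated as
/// "no warehouse observation" instead of aborting the whole command.
fn collect_statuses<F>(models: &[(&str, Option<u32>)], mut probe: F) -> Vec<ModelStatus>
where
    F: FnMut(&str) -> Result<Option<u32>, String>,
{
    models
        .iter()
        .map(|(name, configured)| {
            let warehouse_days = match probe(*name) {
                Ok(days) => days,
                Err(err) => {
                    eprintln!("warning: retention probe failed for {name}: {err}");
                    None
                }
            };
            ModelStatus {
                model: (*name).to_string(),
                configured_days: *configured,
                warehouse_days,
                in_sync: *configured == warehouse_days,
            }
        })
        .collect()
}
```

With this shape, one unreachable table costs a single stderr line and an out-of-sync row for that model, while the remaining models are still reported.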

Test coverage

  • rocky-core: default-impl trait tests (returns Ok(None) on MinimalGovernance + NoopGovernanceAdapter).
  • rocky-databricks: SQL-gen + response-parse unit tests + wiremock round-trip (populated interval form + empty rows).
  • rocky-snowflake: SQL-gen + response-parse unit tests + wiremock round-trip (6-column SHOW PARAMETERS shape + empty rows).
  • rocky-cli: compute_in_sync unit tests pinning all four quadrants; two integration tests using a temporary rocky.toml + DuckDB target — one asserts --drift dispatches cleanly through to Ok(None), one guards the declaration-only path still works after the signature change.

DuckDB + BigQuery

Explicit non-goal. Both keep the default Ok(None) trait impl — --drift on those targets returns warehouse_days = null with no error. Matches the Wave C-2 write-side design where those adapters also fall through on apply_retention_policy.

Test plan

  • cargo test -p rocky-core
  • cargo test -p rocky-databricks (unit) + --features test-support (wiremock)
  • cargo test -p rocky-snowflake (unit) + --features test-support (wiremock)
  • cargo test -p rocky-cli
  • cargo test (full workspace — all green)
  • cargo clippy --all-targets -- -D warnings
  • cargo fmt --check
  • just codegen leaves a clean git status (JSON shape unchanged — warehouse_days: Option<u32> already existed)

Completes the `rocky retention-status` surface shipped in 1.16.0 where
`warehouse_days` on `ModelRetentionStatus` was always `None` and `--drift`
printed a "deferred to v2" stderr note without touching the warehouse.

- `GovernanceAdapter::read_retention_days` (additive, default `Ok(None)`).
- Databricks: `SHOW TBLPROPERTIES ... ('delta.logRetentionDuration',
  'delta.deletedFileRetentionDuration')`; tolerant parser for
  `"interval 90 days"` / `"90 days"` / `"90.000000000 days"` forms.
  Prefers `delta.deletedFileRetentionDuration` on divergence (with warn!).
- Snowflake: `SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE
  ...`; parses string or numeric JSON value forms. Converts
  `SnowflakeGovernanceAdapter<'a>` to own `Arc<SnowflakeConnector>` so the
  adapter registry can dispatch it through `Box<dyn GovernanceAdapter>`.
- Registry: governance_adapter() now resolves Snowflake targets instead
  of dropping to the Noop path.
- CLI: `run_retention_status` is `async` and takes `--config` so it can
  build the registry; per-model probe via `[adapter]` override (falling
  back to the first pipeline's target adapter); probe errors surface on
  stderr per-model but don't abort the command.
- DuckDB + BigQuery keep the default `Ok(None)` — `--drift` degrades to
  "no warehouse observation" rather than erroring.

`in_sync` now compares the probed value against the configured value
instead of trivially collapsing to `None == None`. JSON shape unchanged
(`warehouse_days: Option<u32>` already existed) — `just codegen` leaves a
clean `git status`.

hugocorreia90 force-pushed the feat/retention-status-drift-probe branch from a932741 to f6dbafa (April 24, 2026 17:29)
hugocorreia90 merged commit 1c7f9e9 into main (Apr 24, 2026)
13 checks passed
hugocorreia90 deleted the feat/retention-status-drift-probe branch (April 24, 2026 17:35)
hugocorreia90 added a commit that referenced this pull request Apr 24, 2026
Governance-waveplan polish wave on top of v1.16.0/v1.12.0/v1.8.0.

Engine 1.17.0:
- FR-009 BREAKING: reject empty workspace_ids without opt-in (#250)
- --env flag on rocky run / rocky plan + plan preview of classification /
  mask / retention actions (#251)
- Wiremock coverage for apply_column_tags + apply_masking_policy (#252)
- W004 warning for unresolved classification tags (#253)
- SCIM client + per-catalog GRANT for reconcile_role_graph (#254)
- rocky retention-status --drift warehouse probe (#255)

Dagster 1.13.0 (tracks engine 1.17.0):
- Pluggable per-call kwarg resolvers on RockyResource (#248)
- Auto-surface compliance + retention-status on RockyComponent (#249)
- Pre-flight governance_override validator (#250)
- Regenerated PlanResult with env + action preview fields (#251)

VS Code 1.9.0 (tracks engine 1.17.0):
- Regenerated plan.ts with env + 3 governance-action interfaces (#251)

Also refreshes transitive dependencies across all three artifacts:
- Cargo.lock: 14 transitive bumps (rustls v0.23.39, tokio v1.52.1, uuid v1.23.1,
  webpki-roots v1.0.7, compression-codecs v0.4.38, and 9 others)
- uv.lock: 10 transitive bumps (pydantic v2.13.3, dagster-pipes/shared v1.13.2,
  datamodel-code-generator v0.56.1, ruff v0.15.11, and 5 others)
- package-lock.json: transitive-only via npm update; direct deps unchanged so
  the engines.vscode / @types/vscode / test-electron triangle stays in lockstep