feat: Trust-system Arc 6 (first wave) — dialect-portability linter #184

Merged: hugocorreia90 merged 1 commit into main from feat/arc-6-polyglot on Apr 20, 2026

@hugocorreia90

Summary

  • Opt-in rocky compile --target-dialect <dbx|sf|bq|duckdb> rejects SQL constructs that don't run natively on the chosen warehouse.
  • Detection is AST-based (sqlparser visitor) over the same catalog rocky_sql::transpile already handles — NVL, IFNULL, DATEADD, DATE_ADD, TO_VARCHAR, LEN, CHARINDEX, ARRAY_SIZE, DATE_FORMAT, QUALIFY, ILIKE, FLATTEN.
  • Flagged constructs surface as error-severity P001 diagnostics with a portable-alternative suggestion; the compile bails like any other error path.
  • Wave 1 deliberately omits rocky.toml wiring and per-model pragma opt-out — both deferred to wave 2.

Example

$ rocky compile --target-dialect bq
  ✗ m1
  x error[P001]: NVL is not portable to BigQuery (supported by: Snowflake, Databricks)
   ,-[models/m1.sql:1:1]
 1 | SELECT NVL(a, b) AS c FROM t
   : `-- here
   `----
  help: replace NVL(...) with IFNULL(...) or COALESCE(...)
Error: compilation failed with errors
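
The lookup behind a diagnostic like the one above can be sketched roughly as follows. This is a minimal illustration, not the code in rocky-sql/src/portability.rs: the `Dialect`, `CatalogEntry`, `Diagnostic`, and `check_function` names are hypothetical, the real linter walks a sqlparser AST rather than receiving a bare function name, and only the NVL row is grounded in the example output.

```rust
// Hypothetical sketch: none of these types are taken from rocky's source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Dialect {
    Databricks,
    Snowflake,
    BigQuery,
    DuckDb,
}

impl Dialect {
    /// Parse the flag values listed in the summary: dbx, sf, bq, duckdb.
    fn from_flag(value: &str) -> Option<Dialect> {
        match value {
            "dbx" => Some(Dialect::Databricks),
            "sf" => Some(Dialect::Snowflake),
            "bq" => Some(Dialect::BigQuery),
            "duckdb" => Some(Dialect::DuckDb),
            _ => None,
        }
    }

    fn display_name(self) -> &'static str {
        match self {
            Dialect::Databricks => "Databricks",
            Dialect::Snowflake => "Snowflake",
            Dialect::BigQuery => "BigQuery",
            Dialect::DuckDb => "DuckDB",
        }
    }
}

struct CatalogEntry {
    construct: &'static str,
    supported_by: &'static [Dialect],
    help: &'static str,
}

// Only the NVL row is copied from the example output; the other eleven
// constructs are omitted rather than guessed.
const CATALOG: &[CatalogEntry] = &[CatalogEntry {
    construct: "NVL",
    supported_by: &[Dialect::Snowflake, Dialect::Databricks],
    help: "replace NVL(...) with IFNULL(...) or COALESCE(...)",
}];

#[derive(Debug, PartialEq)]
struct Diagnostic {
    code: &'static str,
    message: String,
    help: &'static str,
}

/// Check one function name against the target dialect.
/// `target == None` models the flag being unset: the linter stays off.
fn check_function(name: &str, target: Option<Dialect>) -> Option<Diagnostic> {
    let target = target?;
    let entry = CATALOG
        .iter()
        .find(|e| e.construct.eq_ignore_ascii_case(name))?;
    if entry.supported_by.contains(&target) {
        return None;
    }
    let supported: Vec<&str> = entry.supported_by.iter().map(|d| d.display_name()).collect();
    Some(Diagnostic {
        code: "P001",
        message: format!(
            "{} is not portable to {} (supported by: {})",
            entry.construct,
            target.display_name(),
            supported.join(", ")
        ),
        help: entry.help,
    })
}
```

Calling `check_function("NVL", Dialect::from_flag("bq"))` yields a P001 diagnostic, while passing `None` as the target returns nothing, which mirrors the opt-in behavior described in the summary.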

Test plan

  • New unit tests in rocky-sql/src/portability.rs cover every catalog entry plus false-positive guards (identifier shadowing, string literals, parse failures).
  • New tests in rocky-cli/src/commands/compile.rs verify: P001 diagnostic shape, BigQuery+NVL bails, no --target-dialect → unchanged behavior, Snowflake+NVL accepted.
  • cargo clippy -p rocky-sql -p rocky-cli -p rocky --all-targets -- -D warnings clean.
  • cargo fmt --check clean.
  • just codegen (rust + dagster + vscode) produces no drift.
  • End-to-end CLI smoke: rocky compile --target-dialect bq on a model containing NVL renders the P001 diagnostic with miette source span and exits 1.

…inter

Adds `rocky compile --target-dialect <dbx|sf|bq|duckdb>` that rejects
SQL constructs which don't run on the chosen warehouse. Detection is
AST-based (sqlparser visitor) over the same catalog that
`rocky_sql::transpile` already translates — NVL, IFNULL, DATEADD,
DATE_ADD, TO_VARCHAR, LEN, CHARINDEX, ARRAY_SIZE, DATE_FORMAT, QUALIFY,
ILIKE, FLATTEN. Flagged constructs emit error-severity P001 diagnostics
with a portable-alternative suggestion; the compile bails, matching the
rest of the error path.

Wave 1 deliberately omits rocky.toml wiring and per-model pragma
opt-out — both deferred to wave 2. The flag is fully opt-in: when
unset, compile behaves exactly as before.

hugocorreia90 merged commit ad56dba into main on Apr 20, 2026
4 of 5 checks passed
hugocorreia90 deleted the feat/arc-6-polyglot branch on April 20, 2026 at 12:15
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
- rocky-sql/portability.rs: collapse nested `if !supports_ilike(self.target)`
  into a match-arm guard. Rust 1.95 clippy promotes `collapsible_match` so
  the previous shape (landed in Arc 6 PR #184, pre-toolchain bump) now
  fires under `-D warnings`.

- integrations/dagster: regen `lineage/compile.json` to capture the new
  P002 warning emitted by the lineage POC's stg_orders model
  (`SELECT * FROM source.raw.orders` with a downstream `fct_revenue`
  that references `customer_id, amount`). The fixture proves P002 fires
  end-to-end through the dagster CLI subprocess path with the expected
  shape.
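
The first fix above, collapsing a nested `if` into a match-arm guard, can be illustrated with a small before/after sketch. The `Target` and `Expr` enums and both `visit_*` functions are hypothetical stand-ins rather than the code in portability.rs, and the body of `supports_ilike` is a placeholder for the demo.

```rust
// Hypothetical stand-ins; the real visitor operates on sqlparser AST nodes.
#[derive(Clone, Copy, PartialEq)]
enum Target {
    Snowflake,
    BigQuery,
}

enum Expr {
    ILike,
    Other,
}

// Placeholder rule for the demo only.
fn supports_ilike(target: Target) -> bool {
    target == Target::Snowflake
}

/// Before: the condition is nested inside the match arm.
fn visit_nested(expr: &Expr, target: Target, findings: &mut Vec<&'static str>) {
    match expr {
        Expr::ILike => {
            if !supports_ilike(target) {
                findings.push("ILIKE");
            }
        }
        _ => {}
    }
}

/// After: the same condition expressed as a match-arm guard.
fn visit_guarded(expr: &Expr, target: Target, findings: &mut Vec<&'static str>) {
    match expr {
        Expr::ILike if !supports_ilike(target) => findings.push("ILIKE"),
        _ => {}
    }
}
```

Both functions record the same findings; the guarded form just removes one level of nesting, since an `ILike` node on a supporting target falls through to the catch-all arm.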
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
… linter (P002) (#185)

* feat(engine): Trust-system Arc 7 (first wave) — blast-radius SELECT * linter (P002)

Adds always-on, warning-severity P002 diagnostic emitted when a model uses
`SELECT *` AND at least one downstream model references specific columns
of its output. The lint is *blast-radius-aware*: a leaf SELECT * with no
downstream consumers is intentionally not flagged — the user inspected
the result themselves. The diagnostic only fires when an upstream schema
change at the star's source would silently propagate into a downstream
that names columns of this model.

Detection runs against the existing semantic graph rather than re-parsing
SQL: `ModelSchema::has_star` is already set during graph construction by
`rocky_sql::lineage`, and `column_consumers(model, column)` enumerates
the exact downstream edges that make the radius concrete. No new parser
or type-inference machinery is required for this wave; the next wave
(Arc 7 type inference over raw `.sql` models) will sharpen which upstream
is treated as "typed."
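
As a rough illustration of the blast-radius rule, the sketch below flags a model only when it uses `SELECT *` and some downstream model names specific columns of it. The `Model` struct and `blast_radius` function are hypothetical simplifications; the actual implementation relies on `ModelSchema::has_star` and `column_consumers(model, column)`, whose signatures are not shown in this PR.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-in for a node in the semantic graph.
struct Model {
    name: &'static str,
    /// True when the model's SQL contains `SELECT *`.
    has_star: bool,
    /// Upstream model name -> columns of that upstream this model names explicitly.
    column_refs: BTreeMap<&'static str, Vec<&'static str>>,
}

/// Return each downstream consumer of `model` together with the columns it
/// names. An empty result means no P002: either the model has no star, or
/// nothing downstream references specific columns of it.
fn blast_radius(model: &Model, all: &[Model]) -> Vec<(&'static str, Vec<&'static str>)> {
    if !model.has_star {
        return Vec::new();
    }
    all.iter()
        .filter_map(|m| {
            let cols = m.column_refs.get(model.name)?;
            if cols.is_empty() {
                None
            } else {
                Some((m.name, cols.clone()))
            }
        })
        .collect()
}
```

With a `stg_orders` model that selects `*` and a downstream `fct_revenue` that names `customer_id` and `amount`, the function returns one entry for `fct_revenue`; a leaf model with a star and no consumers returns an empty list and is not flagged.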

Wiring lives in `rocky-compiler::compile_project` (and the incremental
path) so both the CLI's `rocky compile` and the LSP's per-keystroke
publish loop surface P002 to users without any opt-in flag. Severity
is warning — non-blocking on `has_errors`, so existing CI green stays
green; the diagnostic is informational pressure to make schema deps
explicit. Wave 2 may add a `[lints]` block in rocky.toml + per-model
`-- rocky-allow: select-star` pragma, mirroring Arc 6's deferred
portability config.

Diagnostics include the offending model, list of affected downstream
consumers, and the specific columns each downstream references (capped
at 3 per consumer for legibility on wide schemas), with an actionable
suggestion to switch to an explicit column list. Source spans point at
the model file so miette renders the warning in-place in both terminal
and LSP clients.
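
The per-consumer column cap could be rendered along these lines. This is a sketch only: `render_columns` is a hypothetical helper, and the "(+N more)" suffix is an assumed format, not the wording of the real P002 message.

```rust
/// Render one consumer's referenced columns, showing at most `cap` names.
/// Assumes `cap >= 1`; the overflow suffix format is illustrative.
fn render_columns(columns: &[&str], cap: usize) -> String {
    let shown = columns
        .iter()
        .take(cap)
        .copied()
        .collect::<Vec<_>>()
        .join(", ");
    let hidden = columns.len().saturating_sub(cap);
    if hidden > 0 {
        format!("{shown}, ... (+{hidden} more)")
    } else {
        shown
    }
}
```

With a cap of 3, a consumer that references two columns is listed in full, while one that references five shows the first three plus a count of the remainder.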

* fix(engine,dagster): unblock CI on Arc 7 wave 1

- rocky-sql/portability.rs: collapse nested `if !supports_ilike(self.target)`
  into a match-arm guard. Rust 1.95 clippy promotes `collapsible_match` so
  the previous shape (landed in Arc 6 PR #184, pre-toolchain bump) now
  fires under `-D warnings`.

- integrations/dagster: regen `lineage/compile.json` to capture the new
  P002 warning emitted by the lineage POC's stg_orders model
  (`SELECT * FROM source.raw.orders` with a downstream `fct_revenue`
  that references `customer_id, amount`). The fixture proves P002 fires
  end-to-end through the dagster CLI subprocess path with the expected
  shape.

* fix(engine,dagster): match CI rustfmt + refresh fixture wording

- portability.rs: wrap the long ILIKE suggestion string the way CI's
  rustfmt expects. Local rustfmt accepted the inline form but the
  CI toolchain enforces the multiline break.

- lineage/compile.json: regen against a freshly-built release binary
  so the fixture reflects the current P002 message ("silently
  propagates" — tightened from "may silently propagate" mid-PR).
  Previous regen used a stale release binary cached before the
  message edit.

hugocorreia90 mentioned this pull request on Apr 20, 2026
3 tasks
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
Closes the first wave of every trust-system arc (Arcs 1-7) plus the two
wave-2 follow-ups landed the same day. Nine feature PRs since v1.10.0.

- Arc 1 (#170): rocky lineage --downstream, rocky branch, rocky run --branch, rocky replay
- Arc 2 (#171): per-run cost attribution, [budget] block, budget_breach hook
- Arc 3 (#172): three-state CircuitBreaker, adapter consolidation
- Arc 4 (#173): rocky trace Gantt + feature-gated OTLP metrics export
- Arc 5 (#174): schema-grounded rocky ai prompt + project-aware validator
- Arc 6 wave 1 (#184): --target-dialect P001 portability lint (12 constructs)
- Arc 7 wave 1 (#185): blast-radius P002 SELECT * lint (semantic-graph aware)
- Arc 6 wave 2 (#186): [portability] config block + per-model rocky-allow pragma
- Arc 7 wave 2 wave-1 (#187): --with-seed source-schema inference

Plus #169 fix: install scripts pick latest engine version by semver.

Version bump: 20 Cargo.toml files (all workspace members except
rocky-bigquery, which tracks its own version).

Wave 2/3 work for every arc remains in the deferred backlog — see
the changelog Deferred section for the full carry-forward.