feat(engine): Trust-system Arc 7 (first wave) — blast-radius SELECT * linter (P002)#185
Merged
hugocorreia90 merged 3 commits intomainfrom Apr 20, 2026
Merged
feat(engine): Trust-system Arc 7 (first wave) — blast-radius SELECT * linter (P002)#185hugocorreia90 merged 3 commits intomainfrom
hugocorreia90 merged 3 commits intomainfrom
Conversation
… linter (P002) Adds always-on, warning-severity P002 diagnostic emitted when a model uses `SELECT *` AND at least one downstream model references specific columns of its output. The lint is *blast-radius-aware*: a leaf SELECT * with no downstream consumers is intentionally not flagged — the user inspected the result themselves. The diagnostic only fires when an upstream schema change at the star's source would silently propagate into a downstream that names columns of this model. Detection runs against the existing semantic graph rather than re-parsing SQL: `ModelSchema::has_star` is already set during graph construction by `rocky_sql::lineage`, and `column_consumers(model, column)` enumerates the exact downstream edges that make the radius concrete. No new parser or type-inference machinery is required for this wave; the next wave (Arc 7 type inference over raw `.sql` models) will sharpen which upstream is treated as "typed." Wiring lives in `rocky-compiler::compile_project` (and the incremental path) so both the CLI's `rocky compile` and the LSP's per-keystroke publish loop surface P002 to users without any opt-in flag. Severity is warning — non-blocking on `has_errors`, so existing CI green stays green; the diagnostic is informational pressure to make schema deps explicit. Wave 2 may add a `[lints]` block in rocky.toml + per-model `-- rocky-allow: select-star` pragma, mirroring Arc 6's deferred portability config. Diagnostics include the offending model, list of affected downstream consumers, and the specific columns each downstream references (capped at 3 per consumer for legibility on wide schemas), with an actionable suggestion to switch to an explicit column list. Source spans point at the model file so miette renders the warning in-place in both terminal and LSP clients.
- rocky-sql/portability.rs: collapse nested `if !supports_ilike(self.target)` into a match-arm guard. Rust 1.95 clippy promotes `collapsible_match` so the previous shape (landed in Arc 6 PR #184, pre-toolchain bump) now fires under `-D warnings`. - integrations/dagster: regen `lineage/compile.json` to capture the new P002 warning emitted by the lineage POC's stg_orders model (`SELECT * FROM source.raw.orders` with a downstream `fct_revenue` that references `customer_id, amount`). The fixture proves P002 fires end-to-end through the dagster CLI subprocess path with the expected shape.
- portability.rs: wrap the long ILIKE suggestion string the way CI's
rustfmt expects. Local rustfmt accepted the inline form but the
CI toolchain enforces the multiline break.
- lineage/compile.json: regen against a freshly-built release binary
so the fixture reflects the current P002 message ("silently
propagates" — tightened from "may silently propagate" mid-PR).
Previous regen used a stale release binary cached before the
message edit.
3 tasks
hugocorreia90
added a commit
that referenced
this pull request
Apr 20, 2026
Closes the first wave of every trust-system arc (Arcs 1-7) plus the two wave-2 follow-ups landed the same day. Nine feature PRs since v1.10.0. - Arc 1 (#170): rocky lineage --downstream, rocky branch, rocky run --branch, rocky replay - Arc 2 (#171): per-run cost attribution, [budget] block, budget_breach hook - Arc 3 (#172): three-state CircuitBreaker, adapter consolidation - Arc 4 (#173): rocky trace Gantt + feature-gated OTLP metrics export - Arc 5 (#174): schema-grounded rocky ai prompt + project-aware validator - Arc 6 wave 1 (#184): --target-dialect P001 portability lint (12 constructs) - Arc 7 wave 1 (#185): blast-radius P002 SELECT * lint (semantic-graph aware) - Arc 6 wave 2 (#186): [portability] config block + per-model rocky-allow pragma - Arc 7 wave 2 wave-1 (#187): --with-seed source-schema inference Plus #169 fix: install scripts pick latest engine version by semver. Version bump: 20 Cargo.toml files (all workspace members except rocky-bigquery, which tracks its own version). Wave 2/3 work for every arc remains in the deferred backlog — see the changelog Deferred section for the full carry-forward.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
SELECT *AND at least one downstream model references specific columns of its output. LeafSELECT *is intentionally not flagged (the user inspected the result themselves).crate::diagnostic::P002const) and the existingSemanticGraph::column_consumersindex — no new parser or type-inference machinery for this wave.rocky-compiler::compile_project(and the incremental path) so bothrocky compileand the LSP per-keystroke loop surface P002 without any opt-in flag.Trust outcome
Trust-system Arc 7 is "SQL as first-class with types." Wave 1 ships the blast-radius-aware linter the arc description promises: when a model M does
SELECT * FROM upstreamand a downstream model D references specific columns of M, an upstream schema change atupstreamsilently propagates through M's star into D's reference scope. The diagnostic names exactly which downstream consumers reference which of M's columns (capped at 3 per consumer for legibility), with an actionable suggestion to switch to an explicit column list.What's deferred to wave 2
Per the trust-direction plan, Arc 7 wave 2 unlocks the heavier work:
.sqlmodels so they hold typed DAG membership equal to.rockymodels. Sharpens P002's notion of "typed upstream" beyond the semantic-graphhas_starheuristic.[lints]block inrocky.toml+ per-model-- rocky-allow: select-starpragma, mirroring Arc 6's deferred portability config.Test plan
blast_radius::testscover the matrix: explicit downstream consumer, leaf, no star, typed external source, untyped external source, multi-downstream aggregation, column-list truncation, suggestion shape.cargo test --workspace— full engine suite passes (no regressions from making P002 always-on).pytestinintegrations/dagster/— all 307 tests pass; fixtures unaffected (playground default POC has noSELECT *).cargo fmt --check+cargo clippy -p rocky-compiler -p rocky-cli --all-targets -- -D warnings.rocky compile --output table) against a 3-model temp project — miette renders the P002 warning with source-span underline andhelp:line; final summary correctly reads0 errors, 1 warnings.diagnostics[]array, no schema change.just codegenproduces no schema or generated-binding drift (P002 is just another diagnostic code string in the existing wire shape).