Rocky is a Rust-based control plane for warehouse SQL pipelines: branches, replay, column-level lineage, compile-time type safety, per-model cost attribution. Storage and compute stay with your warehouse — Databricks, Snowflake, BigQuery, or DuckDB. Apache 2.0.
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/rocky-data/rocky/main/engine/install.sh | bash
# Windows (PowerShell)
irm https://raw.githubusercontent.com/rocky-data/rocky/main/engine/install.ps1 | iex

rocky playground my-first-project
cd my-first-project
rocky compile && rocky test && rocky run

No credentials needed: the playground runs end-to-end on local DuckDB.
Each demo below is a self-contained POC in examples/playground/pocs/ — cd in, run ./run.sh, reproduce locally.
A source column type changes upstream. On the next run, Rocky diffs source vs. target, drops the target, and recreates it. No silent data corruption, no dbt-style quiet divergence.
POC — 02-performance/06-schema-drift-recover
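The drift check described above can be illustrated with a minimal Python sketch. This is not Rocky's implementation: the function names, the dictionary-based schema representation, and the `drop_target` / `recreate_target` step labels are all assumptions made for illustration.

```python
def diff_column_types(source: dict, target: dict) -> dict:
    """Return {column: (source_type, target_type)} for shared columns whose types differ."""
    return {
        name: (src_type, target[name])
        for name, src_type in source.items()
        if name in target and target[name].upper() != src_type.upper()
    }


def plan_recovery(source: dict, target: dict) -> list:
    """Hypothetical plan: drop and recreate the target when any column type drifted."""
    if not diff_column_types(source, target):
        return []
    return ["drop_target", "recreate_target"]


# Example schemas (made up): `amount` changed type upstream.
source = {"id": "BIGINT", "amount": "DECIMAL(18,2)"}
target = {"id": "BIGINT", "amount": "INTEGER"}

print(diff_column_types(source, target))  # {'amount': ('DECIMAL(18,2)', 'INTEGER')}
print(plan_recovery(source, target))      # ['drop_target', 'recreate_target']
```

The comparison is case-insensitive so that `int` versus `INT` is not reported as drift; a real engine would also need to normalize type aliases per warehouse.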
Missing required columns, protected columns being removed, or unsafe type changes surface as diagnostic codes (E010, E013) before a single row is written.
POC — 01-quality/01-data-contracts-strict
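As a rough illustration of checking a contract before any write happens, the sketch below validates a proposed schema against required and protected column sets. The `Contract` class, the `SAFE_WIDENINGS` allowlist, and the message wording are hypothetical; the sketch deliberately does not map its findings to Rocky's diagnostic codes.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist of type changes treated as safe.
SAFE_WIDENINGS = {("INTEGER", "BIGINT"), ("FLOAT", "DOUBLE")}


@dataclass
class Contract:
    required: set = field(default_factory=set)
    protected: set = field(default_factory=set)


def check_contract(contract: Contract, previous: dict, proposed: dict) -> list:
    """Collect every violation up front; the caller writes only if the list is empty."""
    problems = []
    prev_cols, new_cols = set(previous), set(proposed)
    for name in sorted(contract.required - new_cols):
        problems.append(f"missing required column: {name}")
    for name in sorted(contract.protected & (prev_cols - new_cols)):
        problems.append(f"protected column removed: {name}")
    for name in sorted(prev_cols & new_cols):
        old, new = previous[name], proposed[name]
        if old != new and (old, new) not in SAFE_WIDENINGS:
            problems.append(f"unsafe type change on {name}: {old} -> {new}")
    return problems


contract = Contract(required={"order_id"}, protected={"customer_email"})
previous = {"order_id": "BIGINT", "customer_email": "VARCHAR", "qty": "BIGINT"}
proposed = {"order_id": "BIGINT", "qty": "INTEGER"}

for problem in check_contract(contract, previous, proposed):
    print(problem)
```

Returning all violations at once, rather than raising on the first, mirrors the idea of surfacing every diagnostic before a single row is written.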
Create a branch, run against it in an isolated schema, inspect, then drop or promote. Column-level lineage shows the downstream blast radius before you ship.
POC — 00-foundations/06-branches-replay-lineage
Trace a single column from a downstream fact back through its aggregations, all the way to the seed. Blast-radius analysis without reading every model.
POC — 06-developer-experience/01-lineage-column-level
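Tracing a column back to its seed is a graph walk over column-level edges. The sketch below assumes lineage is already available as a mapping from each `(model, column)` pair to its upstream pairs; the model names and the `trace_to_roots` helper are invented for illustration and say nothing about how Rocky stores or computes lineage.

```python
# Hypothetical lineage: (model, column) -> upstream (model, column) pairs.
LINEAGE = {
    ("fct_revenue", "total"): [("int_orders", "amount")],
    ("int_orders", "amount"): [("stg_orders", "amount")],
    ("stg_orders", "amount"): [("seed_orders", "amount")],
}


def trace_to_roots(column, upstream):
    """Depth-first walk returning every path from `column` to a root (assumes no cycles)."""
    parents = upstream.get(column, [])
    if not parents:
        return [[column]]
    paths = []
    for parent in parents:
        for tail in trace_to_roots(parent, upstream):
            paths.append([column, *tail])
    return paths


for path in trace_to_roots(("fct_revenue", "total"), LINEAGE):
    print(" <- ".join(f"{model}.{col}" for model, col in path))
# fct_revenue.total <- int_orders.amount <- stg_orders.amount <- seed_orders.amount
```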
Describe what you want in plain English. Rocky generates a Rocky DSL model, compiles it, and retries on parse failure. The Attempts: 2 line in the output shows the loop recovering from a first-pass error without manual intervention.
POC — 03-ai/01-model-generation
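The generate, compile, retry loop can be sketched generically as follows. Everything here is an assumption: `generate` and `compile_model` stand in for whatever the engine actually calls, the stubs fake a first-pass typo, and using `ValueError` for a parse failure is an arbitrary choice.

```python
from typing import Callable, Optional, Tuple


def generate_with_retry(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],
    compile_model: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Return (model_text, attempts); feed each compile error back into the next attempt."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(prompt, last_error)
        try:
            compile_model(candidate)
        except ValueError as exc:  # stand-in for a parse failure
            last_error = str(exc)
            continue
        return candidate, attempt
    raise RuntimeError(f"no valid model after {max_attempts} attempts: {last_error}")


# Stubs: the first draft has a typo, the second (given feedback) is valid.
def fake_generate(prompt: str, feedback: Optional[str]) -> str:
    return "SELEC 1" if feedback is None else "SELECT 1"


def fake_compile(text: str) -> None:
    if not text.startswith("SELECT"):
        raise ValueError(f"parse error near {text.split()[0]!r}")


model, attempts = generate_with_retry("one row", fake_generate, fake_compile)
print(model, attempts)  # SELECT 1 2
```

Passing the previous error back to the generator is what lets the second attempt differ from the first; capping attempts keeps a persistently failing prompt from looping forever.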
Compare two git refs and get a per-changed-column readout of downstream consumers — pre-rendered Markdown drops straight into a GitHub PR comment. CODEOWNERS-style review tooling can't reach this granularity without a compiled engine.
POC — 06-developer-experience/11-lineage-diff
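To make the per-changed-column readout concrete, here is a small sketch that walks downstream edges for each changed column and renders a Markdown table. The edge data, function names, and table layout are hypothetical; the sketch also takes the list of changed columns as given rather than deriving it from two git refs.

```python
from collections import deque

# Hypothetical downstream edges: (model, column) -> direct consumers.
DOWNSTREAM = {
    ("stg_orders", "amount"): [("int_orders", "amount")],
    ("int_orders", "amount"): [("fct_revenue", "total"), ("fct_margin", "gross")],
}


def consumers(column, downstream):
    """Breadth-first walk returning every transitive consumer in discovery order."""
    seen, queue, found = {column}, deque([column]), []
    while queue:
        for child in downstream.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def render_markdown(changed, downstream):
    """Render one table row per changed column, ready to paste into a PR comment."""
    lines = ["| Changed column | Downstream consumers |", "|---|---|"]
    for model, col in changed:
        hits = consumers((model, col), downstream)
        cell = ", ".join(f"`{m}.{c}`" for m, c in hits) or "none"
        lines.append(f"| `{model}.{col}` | {cell} |")
    return "\n".join(lines)


print(render_markdown([("stg_orders", "amount")], DOWNSTREAM))
```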
Tag PII columns in the model sidecar; bind tags to mask strategies in [mask] / [mask.<env>]. rocky compliance --env prod --fail-on exception exits 1 the moment a classified column has no resolved strategy — a one-line CI gate against accidentally-unmasked data.
POC — 04-governance/05-classification-masking-compliance
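The gate logic, in which a classified column with no resolved strategy fails the build, can be sketched as below. The tag names, strategy names, and dictionary shapes are invented and do not reflect Rocky's actual configuration schema; the only behavior taken from the text is that an environment-specific binding overrides the default and that any unresolved column yields exit code 1.

```python
def resolve_strategy(tag, defaults, env_overrides):
    """Environment-specific binding wins; otherwise fall back to the default binding."""
    return env_overrides.get(tag, defaults.get(tag))


def unmasked_columns(classified, defaults, env_overrides):
    """Return classified columns whose tag resolves to no mask strategy."""
    return sorted(
        column
        for column, tag in classified.items()
        if resolve_strategy(tag, defaults, env_overrides) is None
    )


def gate(classified, defaults, env_overrides) -> int:
    """Return a process exit code: 1 if any classified column is unmasked, else 0."""
    exceptions = unmasked_columns(classified, defaults, env_overrides)
    for column in exceptions:
        print(f"no mask strategy resolved for {column}")
    return 1 if exceptions else 0


# Hypothetical inputs: `pii_ssn` has no binding in either table.
classified = {"users.email": "pii_email", "users.ssn": "pii_ssn"}
defaults = {"pii_email": "hash"}
prod_overrides = {}

exit_code = gate(classified, defaults, prod_overrides)
print(exit_code)  # 1
```

In a CI script, the return value would typically be passed to `sys.exit` so the pipeline step fails.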
strategy = "incremental" plus a timestamp_column is all it takes. Rocky writes the high-water mark to the embedded state store; subsequent runs only INSERT … WHERE timestamp > watermark. Append 25 rows after a 500-row load — run 2 still finishes in 0.2s.
POC — 02-performance/01-incremental-watermark
| Path | Artifact | Language | Description |
|---|---|---|---|
| engine/ | rocky CLI binary | Rust | Core SQL transformation engine (21-crate Cargo workspace) |
| integrations/dagster/ | dagster-rocky PyPI wheel | Python | Dagster resource and component wrapping the Rocky CLI |
| editors/vscode/ | Rocky VSIX | TypeScript | VS Code extension: LSP client + commands for AI features |
| examples/playground/ | (config only) | TOML / SQL | Self-contained DuckDB sample pipeline used for smoke tests and benchmarks |
Each subproject has its own README with detailed usage. The engine/README.md is the canonical product reference for the Rocky CLI.
| Role | Adapter | Status | Notes |
|---|---|---|---|
| Warehouse | Databricks | Production | SQL Statement API · Unity Catalog · SHALLOW CLONE for branches |
| Warehouse | Snowflake | Beta | REST connector · zero-copy CLONE for branches · masking policies |
| Warehouse | BigQuery | Beta | REST connector · CREATE TABLE … COPY for branches |
| Warehouse | DuckDB | Local / Testing | Embedded · powers rocky playground (no credentials needed) |
| Source | Fivetran | Production | REST connector + table discovery |
| Source | Airbyte | Beta | Catalog discovery |
| Source | Iceberg | Beta | REST catalog discovery of namespaces and tables |
| Source | Manual | Production | Schema/table lists inline in rocky.toml |
Building a warehouse Rocky doesn't ship in-tree (ClickHouse, Trino, Redshift, …)? See the Adapter SDK guide and the Rust-native skeleton POC.
git clone https://github.com/rocky-data/rocky.git
cd rocky
just build # builds engine + dagster wheel + vscode extension
just test # runs all test suites
just lint # cargo clippy/fmt + ruff + eslint

just is optional; you can also build each subproject directly. See CONTRIBUTING.md for per-subproject build commands.
Each artifact is released independently using a tag-namespaced scheme:
- engine-v* → Rocky CLI binary (cross-compiled, on GitHub Releases)
- dagster-v* → dagster-rocky wheel
- vscode-v* → Rocky VSIX
See CONTRIBUTING.md for the full release flow.
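As an illustration of how a tag-namespaced scheme routes a tag to an artifact, the sketch below maps the three prefixes listed above to their artifacts. The helper name and the example version strings are hypothetical, and the actual release automation may work differently.

```python
# Prefixes and artifact names come from the list above; the helper is illustrative.
RELEASE_PREFIXES = {
    "engine-v": "Rocky CLI binary",
    "dagster-v": "dagster-rocky wheel",
    "vscode-v": "Rocky VSIX",
}


def artifact_for_tag(tag: str):
    """Return (artifact, version) for a recognized tag, or None otherwise."""
    for prefix, artifact in RELEASE_PREFIXES.items():
        if tag.startswith(prefix) and len(tag) > len(prefix):
            return artifact, tag[len(prefix):]
    return None


print(artifact_for_tag("engine-v1.2.3"))  # ('Rocky CLI binary', '1.2.3')
print(artifact_for_tag("v1.2.3"))         # None
```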
Full documentation: rocky-data.dev — concepts, guides, CLI reference, Dagster integration, adapter SDK.
See CONTRIBUTING.md. Before opening a PR, please read the cross-project change guidance — schema and DSL changes must update consumers atomically.
Rocky is free and open source. If it saves your team time, consider sponsoring the project so development can continue.