
fix(rocky-bigquery): wire alter_column_types drift via BQ-specific overrides#333

Merged
hugocorreia90 merged 1 commit into main from fix/bq-alter-column-types on May 1, 2026
Conversation

@hugocorreia90
Contributor

Closes the second drift-evolution gap from #328. PR #332 lifted is_safe_type_widening + alter_column_type_sql onto the SqlDialect trait with default impls preserving Databricks/Spark semantics. This PR adds the BigQuery overrides + the runtime wiring to actually emit AlterColumnTypes SQL.

Engine changes

  • BigQueryDialect::alter_column_type_sql overrides the default ANSI form with BQ's required ALTER COLUMN x SET DATA TYPE y. The default ALTER COLUMN x TYPE y shape returns Expected keyword DROP or keyword SET from BigQuery.

  • BigQueryDialect::is_safe_type_widening declares a strict BQ-specific allowlist:

    • INT64 → NUMERIC (lossless: INT64 fits in NUMERIC precision 38)
    • INT64 → BIGNUMERIC (lossless)
    • NUMERIC → BIGNUMERIC (strict precision widening)

    Excluded by design:

    • … → FLOAT64: lossy for absolute values > 2^53. BigQuery accepts via SET DATA TYPE but Rocky's "safe" contract is strict (matches the default Databricks/Spark allowlist that omits INT → FLOAT).
    • … → STRING: BigQuery's ALTER COLUMN SET DATA TYPE rejects these with existing column type X is not assignable to STRING even though STRING is lossless at the value level. The default allowlist's "any numeric → STRING" pattern doesn't transfer to BQ. Discovered via live verification — initial draft included these patterns and live runs surfaced the error against stage 3 of the existing smoke test.
  • run.rs::process_table adds the DriftAction::AlterColumnTypes branch between DropAndRecreate and the existing added_columns branch. Emits via drift::generate_alter_column_sql (which routes through dialect.alter_column_type_sql) and surfaces as action: "alter_column_types" in the run output. If the same drift round also surfaced added columns, both apply before the INSERT continues.
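The two dialect overrides described above can be sketched roughly as follows. This is an illustrative sketch only: the real `SqlDialect` trait lives in the Rocky codebase and its exact signatures, parameter types, and default bodies are assumptions here; only the method names and the BQ-specific behavior come from the PR text.

```rust
// Hypothetical trait shape; the real rocky-core trait may differ.
trait SqlDialect {
    // Default (Databricks/Spark-style) ANSI form.
    fn alter_column_type_sql(&self, table: &str, column: &str, new_type: &str) -> String {
        format!("ALTER TABLE {table} ALTER COLUMN {column} TYPE {new_type}")
    }
    // Default Databricks/Spark allowlist elided in this sketch.
    fn is_safe_type_widening(&self, _from: &str, _to: &str) -> bool {
        false
    }
}

struct BigQueryDialect;

impl SqlDialect for BigQueryDialect {
    // BigQuery requires SET DATA TYPE; the default `TYPE` shape fails with
    // "Expected keyword DROP or keyword SET".
    fn alter_column_type_sql(&self, table: &str, column: &str, new_type: &str) -> String {
        format!("ALTER TABLE {table} ALTER COLUMN {column} SET DATA TYPE {new_type}")
    }

    // Strict BQ allowlist: lossless widenings only. FLOAT64 and STRING
    // targets are deliberately excluded, per the rationale above.
    fn is_safe_type_widening(&self, from: &str, to: &str) -> bool {
        matches!(
            (from, to),
            ("INT64", "NUMERIC") | ("INT64", "BIGNUMERIC") | ("NUMERIC", "BIGNUMERIC")
        )
    }
}
```

Keeping the allowlist as a flat `matches!` makes the exclusions auditable at a glance: any pair not literally listed is treated as unsafe and falls through to `drop_and_recreate`.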

Smoke test

Extends live/drift/run.sh to a four-stage flow:

  1. Initial 4-column source (with score INT64) → replicate clean
  2. ALTER source ADD COLUMN region → `add_columns` action
  3. DROP+CREATE source with id INT64→STRING → `drop_and_recreate` (unsafe per BQ allowlist)
  4. DROP+CREATE source with score INT64→NUMERIC → `alter_column_types` (safe widening)
==> stage 4 drift: action=alter_column_types OK (["column 'score' widened INT64 → NUMERIC"])
==> target customers.score is now NUMERIC (alter_column_types took effect)

Idempotent across consecutive runs.
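The ordering of drift actions that the engine changes and the smoke stages exercise can be sketched as a minimal enum-plus-match, matching the branch order described above. The enum shape and function name here are hypothetical; the real `DriftAction` and `process_table` in the codebase may differ.

```rust
// Hypothetical, simplified drift actions; payloads are the SQL statements
// produced for each action (via the dialect for AlterColumnTypes).
#[derive(Debug)]
enum DriftAction {
    None,
    DropAndRecreate,
    AlterColumnTypes(Vec<String>),
    AddedColumns(Vec<String>),
}

// Returns the `action` label surfaced in the run output. The new
// AlterColumnTypes branch sits between DropAndRecreate and AddedColumns.
fn action_label(drift: &DriftAction) -> &'static str {
    match drift {
        DriftAction::DropAndRecreate => "drop_and_recreate",
        DriftAction::AlterColumnTypes(_) => "alter_column_types",
        DriftAction::AddedColumns(_) => "add_columns",
        DriftAction::None => "none",
    }
}
```

Checking `DropAndRecreate` first matters: an unsafe type change must win over any coexisting safe widening or added column, since the rebuild subsumes both.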

Test plan

  • `cargo test -p rocky-bigquery --lib` — 68 passed (8 new dialect tests)
  • `cargo clippy -p rocky-bigquery -p rocky-cli -p rocky-core --all-targets -- -D warnings` — clean
  • `cargo fmt -p rocky-bigquery -p rocky-cli -p rocky-core --check` — clean
  • `live/drift/run.sh` against the BQ sandbox — exits 0, all 4 stages pass
  • Two consecutive runs both pass (idempotency)

hugocorreia90 merged commit f7b3049 into main on May 1, 2026
12 checks passed
hugocorreia90 deleted the fix/bq-alter-column-types branch on May 1, 2026 at 15:32