fix(rocky-bigquery): wire alter_column_types drift via BQ-specific overrides #333
Merged
hugocorreia90 merged 1 commit into main on May 1, 2026
Conversation
Closes the second drift-evolution gap from #328. PR #332 lifted `is_safe_type_widening` + `alter_column_type_sql` onto the `SqlDialect` trait with default impls preserving Databricks/Spark semantics. This PR adds the BigQuery-specific overrides plus the runtime wiring to actually emit `AlterColumnTypes` SQL.

Engine changes
- `BigQueryDialect::alter_column_type_sql` overrides the default ANSI form with BQ's required `ALTER COLUMN x SET DATA TYPE y`. The default `ALTER COLUMN x TYPE y` shape returns `Expected keyword DROP or keyword SET` from BigQuery.
- `BigQueryDialect::is_safe_type_widening` declares a strict BQ-specific allowlist:
  - `INT64 → NUMERIC` (lossless: INT64 fits in NUMERIC's precision of 38)
  - `INT64 → BIGNUMERIC` (lossless)
  - `NUMERIC → BIGNUMERIC` (strict precision widening)

  Excluded by design:
  - `… → FLOAT64`: lossy for absolute values > 2^53. BigQuery accepts it via `SET DATA TYPE`, but Rocky's "safe" contract is strict (matching the default Databricks/Spark allowlist, which omits `INT → FLOAT`).
  - `… → STRING`: BigQuery's `ALTER COLUMN SET DATA TYPE` rejects these with `existing column type X is not assignable to STRING`, even though STRING is lossless at the value level. The default allowlist's "any numeric → STRING" pattern doesn't transfer to BQ. Discovered via live verification: the initial draft included these patterns, and live runs surfaced the error against stage 3 of the existing smoke test.
- `run.rs::process_table` adds the `DriftAction::AlterColumnTypes` branch between `DropAndRecreate` and the existing `added_columns` branch. It emits SQL via `drift::generate_alter_column_sql` (which now routes through `dialect.alter_column_type_sql`) and surfaces as `action: "alter_column_types"` in the run output. If the same drift round also surfaced added columns, both apply before the INSERT continues.

Smoke test
Extends `live/drift/run.sh` to a four-stage flow:

1. Initial 4-column source (incl. `score INT64`) → replicate clean
2. `ALTER` source `ADD COLUMN region` → `add_columns` action
3. DROP+CREATE source with `id INT64 → STRING` → `drop_and_recreate` (unsafe per the BQ allowlist)
4. DROP+CREATE source with `score INT64 → NUMERIC` → `alter_column_types` (safe widening); the target's `score` column is now NUMERIC

Idempotent across consecutive runs.
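For illustration, the dialect overrides and the drift-action decision described above can be sketched as standalone Rust. The trait shape, method signatures, and the `classify_type_change` helper are assumptions made for this sketch, not Rocky's actual API; only the two SQL shapes and the allowlist entries come from the PR itself.

```rust
// Hypothetical sketch of the SqlDialect overrides: names mirror the PR
// description, but every signature here is an assumption.

/// Default dialect behaviour (ANSI-style form used by Databricks/Spark).
trait SqlDialect {
    /// Default ANSI shape; BigQuery rejects this form with
    /// `Expected keyword DROP or keyword SET`.
    fn alter_column_type_sql(&self, table: &str, column: &str, new_type: &str) -> String {
        format!("ALTER TABLE {table} ALTER COLUMN {column} TYPE {new_type}")
    }

    /// Simplified stand-in for the default Databricks/Spark allowlist.
    fn is_safe_type_widening(&self, from: &str, to: &str) -> bool {
        matches!((from, to), ("INT", "BIGINT"))
    }
}

struct BigQueryDialect;

impl SqlDialect for BigQueryDialect {
    fn alter_column_type_sql(&self, table: &str, column: &str, new_type: &str) -> String {
        // BigQuery requires the SET DATA TYPE form.
        format!("ALTER TABLE {table} ALTER COLUMN {column} SET DATA TYPE {new_type}")
    }

    fn is_safe_type_widening(&self, from: &str, to: &str) -> bool {
        // Strict BQ allowlist from the PR. FLOAT64 (lossy above 2^53) and
        // STRING (rejected by BQ's SET DATA TYPE) are excluded by design.
        matches!(
            (from, to),
            ("INT64", "NUMERIC") | ("INT64", "BIGNUMERIC") | ("NUMERIC", "BIGNUMERIC")
        )
    }
}

/// Simplified drift decision mirroring the run.rs branch ordering:
/// a safe widening alters in place, anything else drops and recreates.
#[derive(Debug, PartialEq)]
enum DriftAction {
    AlterColumnTypes,
    DropAndRecreate,
}

fn classify_type_change(dialect: &dyn SqlDialect, from: &str, to: &str) -> DriftAction {
    if dialect.is_safe_type_widening(from, to) {
        DriftAction::AlterColumnTypes
    } else {
        DriftAction::DropAndRecreate
    }
}

fn main() {
    let bq = BigQueryDialect;
    // Stage 4 of the smoke test: score INT64 → NUMERIC widens in place.
    assert_eq!(
        classify_type_change(&bq, "INT64", "NUMERIC"),
        DriftAction::AlterColumnTypes
    );
    // Stage 3: id INT64 → STRING is unsafe per the BQ allowlist.
    assert_eq!(
        classify_type_change(&bq, "INT64", "STRING"),
        DriftAction::DropAndRecreate
    );
    println!("{}", bq.alter_column_type_sql("ds.t", "score", "NUMERIC"));
}
```

The sketch collapses the full `DriftAction` surface (the real enum also carries `added_columns`, per the description) down to the one decision this PR wires up.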
Test plan
- `cargo test -p rocky-bigquery --lib`: 68 passed (8 new dialect tests)
- `cargo clippy -p rocky-bigquery -p rocky-cli -p rocky-core --all-targets -- -D warnings`: clean
- `cargo fmt -p rocky-bigquery -p rocky-cli -p rocky-core --check`: clean
- `live/drift/run.sh` against the BQ sandbox: exits 0, all 4 stages pass