fix: SchemaBot migration validation for order-independent comparison#3917

Merged
mdashti merged 6 commits into main from moe/fix-schemabot-diff-check on Jan 14, 2026

Contributor

@mdashti mdashti commented Jan 13, 2026

Ticket(s) Closed

  • Closes #N/A

What

Rewrote the SchemaBot migration file validation to compare SQL statements in an order-independent way, since pg-schema-diff output order is non-deterministic.

Why

The previous implementation used exact substring matching (`grep -Fzo` and later a simple Python `in` check), which:

  1. Failed when pg-schema-diff generated the same functions in a different order
  2. Was unreliable with multiline content matching
  3. Couldn't handle missing end markers or extra workflow output
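
To make the first failure mode concrete, here is a minimal sketch of why an exact substring check breaks when the same statements arrive in a different order. The two SQL statements and the `as_set` helper are hypothetical and used only for illustration; they are not taken from the actual workflow.

```python
# Hypothetical statements, for illustration only.
stmt_a = "CREATE FUNCTION a() RETURNS int LANGUAGE sql AS 'SELECT 1';"
stmt_b = "CREATE FUNCTION b() RETURNS int LANGUAGE sql AS 'SELECT 2';"

# What the diff tool suggested on one run...
suggested = stmt_a + "\n" + stmt_b
# ...and a migration file holding the same statements in the other order.
migration = "\\echo Use ALTER EXTENSION\n" + stmt_b + "\n" + stmt_a + "\n"

# Exact substring matching: fails although nothing is missing.
substring_ok = suggested in migration


def as_set(text):
    """Treat every line ending in ';' as one statement (a simplification)."""
    return {line for line in text.splitlines() if line.endswith(";")}


# Per-statement set comparison: order no longer matters.
set_ok = as_set(suggested) <= as_set(migration)

print(substring_ok, set_ok)  # False True
```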

How

Created .github/scripts/check_migration_diff.py that:

  • Strips all comments - Both -- ... and /* ... */ are removed
  • Strips workflow markers - --- BEGIN/END SUGGESTED UPGRADE SCRIPT ---
  • Strips psql commands - \echo, \quit, etc.
  • Extracts SQL statements - Parses CREATE, ALTER, DROP statements
  • Compares as sets - Order-independent comparison of normalized statements
  • Debug mode - --debug flag shows exactly what statements are being compared

Note: The --debug flag is currently enabled in CI to verify stability of the new comparison logic. It will be removed once we confirm it works reliably.
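
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `check_migration_diff.py`: the function names, the regular expressions, and the rule that a statement runs from a line starting with `CREATE`, `ALTER`, or `DROP` up to the next such line are all hypothetical. It also strips comments naively, so a `--` or `/*` inside a string literal would be mangled. The `--debug` output is omitted. Requires Python 3.9+ for the `set[str]` annotations.

```python
import re

# Marker text follows the PR description; the real workflow output may differ.
MARKER_RE = re.compile(
    r"^-+ (?:BEGIN|END) SUGGESTED UPGRADE SCRIPT -+\s*$", re.MULTILINE
)
BLOCK_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT_RE = re.compile(r"--[^\n]*")
PSQL_COMMAND_RE = re.compile(r"^\s*\\\w+[^\n]*$", re.MULTILINE)
STATEMENT_START_RE = re.compile(r"^\s*(?:CREATE|ALTER|DROP)\b", re.IGNORECASE)


def normalize(statement: str) -> str:
    """Collapse all whitespace runs and drop a trailing semicolon."""
    return " ".join(statement.split()).rstrip(";").strip()


def extract_statements(text: str) -> set[str]:
    """Return the set of normalized CREATE/ALTER/DROP statements in text."""
    text = MARKER_RE.sub("", text)
    text = BLOCK_COMMENT_RE.sub("", text)
    text = LINE_COMMENT_RE.sub("", text)
    text = PSQL_COMMAND_RE.sub("", text)

    statements: set[str] = set()
    current: list[str] = []
    for line in text.splitlines():
        starts_statement = STATEMENT_START_RE.match(line) is not None
        if starts_statement and current:
            # A new statement begins, so the previous one is complete.
            statements.add(normalize("\n".join(current)))
            current = []
        if current or starts_statement:
            current.append(line)
    if current:
        statements.add(normalize("\n".join(current)))
    return statements


def missing_statements(suggested: str, migration: str) -> set[str]:
    """Statements in the suggested script that the migration file lacks."""
    return extract_statements(suggested) - extract_statements(migration)


suggested = """--- BEGIN SUGGESTED UPGRADE SCRIPT ---
/* generated */
CREATE FUNCTION a() RETURNS int
    LANGUAGE sql AS 'SELECT 1';
DROP FUNCTION IF EXISTS old_fn();
--- END SUGGESTED UPGRADE SCRIPT ---
"""
migration = """\\echo Use "ALTER EXTENSION" to load this file. \\quit
-- reordered by hand
DROP FUNCTION IF EXISTS old_fn();
CREATE FUNCTION a() RETURNS int LANGUAGE sql AS 'SELECT 1';
"""

print(missing_statements(suggested, migration))  # set()
```

Comparing sets rather than strings makes the check insensitive to statement order and to whitespace differences. The cost is that duplicated statements collapse into one, and a relative ordering that actually matters (for example, a `DROP` that must precede a `CREATE`) is not verified.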

Tests

Verified manually that the check behaves correctly on #3907:

  • Passes when all schema changes are present (regardless of order)
  • Fails when schema changes are missing

@mdashti mdashti self-assigned this Jan 13, 2026
@mdashti mdashti added the cherry-pick/0.23.x and cherry-pick/0.22.x labels and removed the cherry-pick/0.22.x label Jan 13, 2026
Member

@philippemnoel philippemnoel left a comment

If you say this works, then I believe you!

@mdashti mdashti merged commit 70739df into main Jan 14, 2026
20 checks passed
@mdashti mdashti deleted the moe/fix-schemabot-diff-check branch January 14, 2026 02:13
paradedb-bot pushed a commit that referenced this pull request Jan 14, 2026
…3917)

mdashti added a commit that referenced this pull request Feb 13, 2026
…3917)

Labels

cherry-pick/0.23.x Request that this PR to `main` should get an automatic cherry-pick PR to `0.23.x` after it lands.
