fix: prisma migrate deploy failures on pre-existing instances#23655
Conversation
Fixes failed migrations due to idempotent schema changes on pre-existing litellm instances. Problems: 1. P3018 recovery handler never returned True on successful resolution, causing "Database setup failed after multiple retries" even when the final recovery succeeded 2. _roll_back_migration exceptions escaped the P3018 handler, preventing _resolve_specific_migration from running 3. Migration SQL used ADD COLUMN/DROP COLUMN without IF [NOT] EXISTS, failing if schema was already modified Changes: - Add return True after successful P3018 idempotent error recovery - Wrap _roll_back_migration in try/except to allow recovery continuation even if rollback fails - Make migration.sql idempotent with IF NOT EXISTS / IF EXISTS clauses Co-Authored-By: Claude Haiku 4.5 <[email protected]>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR fixes three migration recovery bugs in Key changes:
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| litellm-proxy-extras/litellm_proxy_extras/utils.py | Fixes three migration recovery bugs: removes erroneous return True after P3009 resolution (allowing the outer retry loop to re-run prisma migrate deploy for remaining migrations), wraps both _roll_back_migration and _resolve_specific_migration in try/except to prevent exceptions from aborting recovery, and adds subprocess.TimeoutExpired to the caught exception types. Logic is sound; the outer for attempt in range(4) loop will automatically retry after each resolution step. |
| litellm/proxy/proxy_cli.py | Adds --skip_db_migration_check flag to allow startup to continue despite migration failures. Default behavior (exit on failure) is preserved, maintaining backwards compatibility. Contains a misleading warning message that tells the user to pass a flag they have already passed. |
| tests/litellm-proxy-extras/test_litellm_proxy_extras_utils.py | Adds TestMigrationSQLIdempotency class with seven test methods enforcing IF [NOT] EXISTS guards on CREATE TABLE, ADD COLUMN, DROP COLUMN, CREATE INDEX, DROP INDEX, RENAME COLUMN, and ADD/DROP CONSTRAINT across all migration files. Tests run against the full migration corpus, providing strong regression coverage for this fix. |
| litellm-proxy-extras/litellm_proxy_extras/migrations/20260303000000_update_tool_table_policies/migration.sql | RENAME COLUMN "call_policy" TO "input_policy" is correctly wrapped in a DO $$ IF EXISTS block. All ADD COLUMN statements use IF NOT EXISTS. Index operations use IF [NOT] EXISTS. Fully idempotent. |
| litellm-proxy-extras/litellm_proxy_extras/migrations/20250603210143_cascade_budget_changes/migration.sql | Both DROP CONSTRAINT and ADD CONSTRAINT are wrapped in DO $$ blocks with IF EXISTS / IF NOT EXISTS guards. Fully idempotent. |
| litellm-proxy-extras/litellm_proxy_extras/migrations/20250326162113_baseline/migration.sql | Baseline migration updated to use CREATE TABLE IF NOT EXISTS throughout, making re-runs safe on pre-existing schemas. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[setup_database called] --> B{for attempt in range 4}
B --> C[prisma migrate deploy]
C -->|success| D[post-migration sanity check]
D --> E[return True ✅]
C -->|P3009 failed migration| F{idempotent error?}
F -->|yes| G[try: _roll_back_migration]
G -->|CalledProcessError / TimeoutExpired| H[log warning, continue]
G -->|success| I[try: _resolve_specific_migration]
H --> I
I -->|CalledProcessError / TimeoutExpired| J[log warning]
I -->|success| K[log resolved]
J --> B
K --> B
F -->|no| L[mark rolled-back, retry]
L --> B
C -->|P3005 non-empty schema| M[create baseline + resolve all]
M --> E
C -->|P3018 permission error| N[rollback + raise RuntimeError ❌]
C -->|P3018 idempotent| O[try: rollback + resolve]
O --> B
B -->|4 attempts exhausted| P[return False ❌]
Comments Outside Diff (1)
-
litellm/proxy/proxy_cli.py, line 869-872 (link)Warning message is self-contradictory
This code is inside
if skip_db_migration_check:, which means the operator has already passed--skip_db_migration_check. The message incorrectly instructs them to "Pass --skip_db_migration_check to allow this" when they've already done so. Any operator reading this will be confused.
Last reviewed commit: 8e1ed91
| except subprocess.CalledProcessError as rollback_err: | ||
| logger.warning( | ||
| f"Failed to roll back migration {migration_name}: {rollback_err}. " | ||
| f"It may already be in a rolled-back state." | ||
| ) |
There was a problem hiding this comment.
subprocess.TimeoutExpired not caught in rollback handler
_roll_back_migration calls subprocess.run(..., timeout=60, check=True), which can raise either subprocess.CalledProcessError (non-zero exit) or subprocess.TimeoutExpired (if the command takes >60 s on a slow/loaded database). The new try/except only catches CalledProcessError, so a TimeoutExpired will still escape, skip _resolve_specific_migration, and cause the recovery to fail — the exact bug this PR aims to fix.
The equivalent rollback in the permission-error path (line ~494) correctly uses except Exception to guard against all failure modes.
| except subprocess.CalledProcessError as rollback_err: | |
| logger.warning( | |
| f"Failed to roll back migration {migration_name}: {rollback_err}. " | |
| f"It may already be in a rolled-back state." | |
| ) | |
| except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as rollback_err: | |
| logger.warning( | |
| f"Failed to roll back migration {migration_name}: {rollback_err}. " | |
| f"It may already be in a rolled-back state." | |
| ) |
Adds TestMigrationSQLIdempotency test class that statically validates all migration SQL files created after 2026-03-11 use idempotent DDL: - ADD COLUMN must use IF NOT EXISTS - DROP COLUMN must use IF EXISTS - DROP INDEX must use IF EXISTS - CREATE INDEX must use IF NOT EXISTS This prevents the class of errors where prisma migrate deploy fails on pre-existing instances because the schema was already modified. Co-Authored-By: Claude Opus 4.6 <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c14b574737
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| migration_name | ||
| ) | ||
| logger.info("✅ Migration resolved.") | ||
| return True |
There was a problem hiding this comment.
Continue migration deploy after resolving one failed step
Returning True immediately after _resolve_specific_migration() marks only the currently failing migration as applied and exits setup_database, so any later pending migrations are never deployed in this startup run. This occurs when prisma migrate deploy hits an idempotent P3018 on migration N but N+1+ migrations are still pending (common on version-skipping upgrades), causing startup to report success while the schema remains partially migrated.
Useful? React with 👍 / 👎.
_roll_back_migration uses subprocess.run with timeout=60, so it can raise subprocess.TimeoutExpired in addition to CalledProcessError. Without catching this, a slow database during rollback would escape the handler and bypass _resolve_specific_migration — the same class of bug. Co-Authored-By: Claude Opus 4.6 <[email protected]>
| # Non-idempotent DDL patterns that should use IF [NOT] EXISTS | ||
| _UNSAFE_PATTERNS = [ | ||
| # ADD COLUMN without IF NOT EXISTS | ||
| ( | ||
| r"ADD\s+COLUMN\s+(?!IF\s+NOT\s+EXISTS)", | ||
| "ADD COLUMN without IF NOT EXISTS", | ||
| ), | ||
| # DROP COLUMN without IF EXISTS | ||
| ( | ||
| r"DROP\s+COLUMN\s+(?!IF\s+EXISTS)", | ||
| "DROP COLUMN without IF EXISTS", | ||
| ), | ||
| # DROP INDEX without IF EXISTS (standalone statement, not inside ALTER TABLE) | ||
| ( | ||
| r"DROP\s+INDEX\s+(?!IF\s+EXISTS)(?!.*ON)", | ||
| "DROP INDEX without IF EXISTS", | ||
| ), | ||
| # CREATE INDEX without IF NOT EXISTS | ||
| ( | ||
| r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?!IF\s+NOT\s+EXISTS)(?!CONCURRENTLY)", | ||
| "CREATE INDEX without IF NOT EXISTS", | ||
| ), | ||
| ] |
There was a problem hiding this comment.
_UNSAFE_PATTERNS is dead code
The _UNSAFE_PATTERNS list is defined here but never referenced anywhere — each of the four TestMigrationSQLIdempotency test methods re-implements its own inline regex check independently. The variable serves no purpose and can mislead future maintainers into thinking it drives the tests.
Additionally, there's a subtle inconsistency: the DROP INDEX entry uses (?!.*ON) to try to exclude MySQL-style DROP INDEX name ON table syntax, and the CREATE INDEX entry uses (?!CONCURRENTLY) to skip CONCURRENTLY indexes — but neither of these exclusions appears in the corresponding test methods, meaning _UNSAFE_PATTERNS and the actual enforcement logic are already diverged.
Either delete the variable, or replace the four duplicate inline-regex blocks in the test methods with a single parametrized test driven by _UNSAFE_PATTERNS.
| @pytest.fixture(scope="class") | ||
| def recent_migrations(self): | ||
| migrations = _get_recent_migrations() | ||
| assert len(migrations) > 0, ( | ||
| f"No migrations found at or after cutoff {_IDEMPOTENT_SQL_CUTOFF}. " | ||
| f"Check that _MIGRATIONS_DIR ({_MIGRATIONS_DIR}) is correct." | ||
| ) | ||
| return migrations |
There was a problem hiding this comment.
Hard assert in fixture will fail instead of skip when path is wrong
The recent_migrations fixture raises AssertionError (i.e., a test failure, not a skip) when no migration files are found. If _MIGRATIONS_DIR is incorrect in some CI environment — or the migrations directory is cleaned — all four tests will be reported as failures rather than skipped/xfailed, which makes it harder to distinguish a configuration problem from a real regression.
Consider using pytest.skip when the directory is missing or empty:
| @pytest.fixture(scope="class") | |
| def recent_migrations(self): | |
| migrations = _get_recent_migrations() | |
| assert len(migrations) > 0, ( | |
| f"No migrations found at or after cutoff {_IDEMPOTENT_SQL_CUTOFF}. " | |
| f"Check that _MIGRATIONS_DIR ({_MIGRATIONS_DIR}) is correct." | |
| ) | |
| return migrations | |
| @pytest.fixture(scope="class") | |
| def recent_migrations(self): | |
| migrations = _get_recent_migrations() | |
| if len(migrations) == 0: | |
| pytest.skip( | |
| f"No migrations found at or after cutoff {_IDEMPOTENT_SQL_CUTOFF}. " | |
| f"Check that _MIGRATIONS_DIR ({_MIGRATIONS_DIR}) is correct." | |
| ) | |
| return migrations |
Fixed all existing migration files to use IF [NOT] EXISTS for DDL statements (ADD COLUMN, DROP COLUMN, DROP INDEX, CREATE INDEX). Removed the date cutoff from the idempotency tests so they now validate all migrations, not just recent ones. Co-Authored-By: Claude Opus 4.6 <[email protected]>
| @@ -2,19 +2,19 @@ | |||
| ALTER TABLE "LiteLLM_ToolTable" RENAME COLUMN "call_policy" TO "input_policy"; | |||
There was a problem hiding this comment.
RENAME COLUMN is not idempotent on pre-existing instances
This PR makes every ADD COLUMN, DROP COLUMN, and index DDL statement idempotent with IF [NOT] EXISTS, but the bare RENAME COLUMN "call_policy" TO "input_policy" on line 2 is left unguarded. On any pre-existing instance where call_policy was already renamed (e.g., the migration ran once and was then re-attempted), PostgreSQL raises ERROR: column "call_policy" does not exist, which would trip the same failure mode this PR is fixing.
The migration 20250806095134_rename_alias_to_server_name_mcp_table already shows the correct pattern:
DO $$
BEGIN
IF EXISTS (SELECT 1 FROM information_schema.columns WHERE table_name = 'LiteLLM_ToolTable' AND column_name = 'call_policy') THEN
ALTER TABLE "LiteLLM_ToolTable" RENAME COLUMN "call_policy" TO "input_policy";
END IF;
END $$;This guards the rename with an existence check so re-running it is safe.
| def all_migrations(self): | ||
| migrations = _get_all_migrations() | ||
| assert len(migrations) > 0, ( | ||
| f"No migrations found. " | ||
| f"Check that _MIGRATIONS_DIR ({_MIGRATIONS_DIR}) is correct." | ||
| ) | ||
| return migrations | ||
|
|
||
| def test_add_column_uses_if_not_exists(self, all_migrations): | ||
| """ADD COLUMN statements must use IF NOT EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"ADD\s+COLUMN\s+", line, re.IGNORECASE) and not re.search( | ||
| r"ADD\s+COLUMN\s+IF\s+NOT\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "ADD COLUMN without IF NOT EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_drop_column_uses_if_exists(self, all_migrations): | ||
| """DROP COLUMN statements must use IF EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"DROP\s+COLUMN\s+", line, re.IGNORECASE) and not re.search( | ||
| r"DROP\s+COLUMN\s+IF\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "DROP COLUMN without IF EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_drop_index_uses_if_exists(self, all_migrations): | ||
| """DROP INDEX statements must use IF EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"DROP\s+INDEX\s+", line, re.IGNORECASE) and not re.search( | ||
| r"DROP\s+INDEX\s+IF\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "DROP INDEX without IF EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_create_index_uses_if_not_exists(self, all_migrations): | ||
| """CREATE INDEX statements must use IF NOT EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search( | ||
| r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+", line, re.IGNORECASE | ||
| ) and not re.search( | ||
| r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:CONCURRENTLY\s+)?IF\s+NOT\s+EXISTS", | ||
| line, | ||
| re.IGNORECASE, | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "CREATE INDEX without IF NOT EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) |
There was a problem hiding this comment.
TestMigrationSQLIdempotency has no coverage for RENAME COLUMN
The four test methods enforce idempotency for ADD COLUMN, DROP COLUMN, DROP INDEX, and CREATE INDEX, but there is no test for bare RENAME COLUMN statements — which are also non-idempotent. The migration 20260303000000_update_tool_table_policies contains an unguarded RENAME COLUMN that would fail on a pre-existing instance.
Adding a fifth method would catch regressions of this class:
def test_rename_column_uses_do_block(self, all_migrations):
"""RENAME COLUMN must be wrapped in a conditional DO $$ block"""
violations = []
for migration_name, sql in all_migrations:
for line_num, line in enumerate(sql.splitlines(), 1):
if re.search(r"RENAME\s+COLUMN\s+", line, re.IGNORECASE):
# Bare RENAME COLUMN outside a DO block is not idempotent
# Check that it appears inside a DO $$ ... END $$ block
if not re.search(r"^\s*(IF\s+EXISTS|DO\s+\$)", line, re.IGNORECASE):
violations.append(f" {migration_name}:{line_num}: {line.strip()}")
assert not violations, (
"RENAME COLUMN outside a conditional DO $$ block found in migrations:\n"
+ "\n".join(violations)
)…gration flag By default the proxy now warns and continues when database migration fails. Pass --require_db_migration (or set REQUIRE_DB_MIGRATION=true) to restore the previous behavior of exiting with an error. Co-Authored-By: Claude Opus 4.6 <[email protected]>
| if not PrismaManager.setup_database( | ||
| use_migrate=not use_prisma_db_push | ||
| ): | ||
| print( # noqa | ||
| "\033[1;31mLiteLLM Proxy: Database setup failed after multiple retries. " | ||
| "The proxy cannot start safely. Please check your database connection and migration status.\033[0m" | ||
| ) | ||
| sys.exit(1) | ||
| if require_db_migration: | ||
| print( # noqa | ||
| "\033[1;31mLiteLLM Proxy: Database setup failed after multiple retries. " | ||
| "The proxy cannot start safely. Please check your database connection and migration status.\033[0m" | ||
| ) | ||
| sys.exit(1) | ||
| else: | ||
| print( # noqa | ||
| "\033[1;33mLiteLLM Proxy: Database migration failed but continuing startup. " | ||
| "Use --require_db_migration to exit on migration failure.\033[0m" | ||
| ) |
There was a problem hiding this comment.
Default behavior changed from fail-safe to fail-open
The proxy previously exited with code 1 whenever setup_database() returned False. This PR flips the default: migration failures are now silently swallowed and startup continues unless the operator explicitly passes --require_db_migration.
This is a backwards-incompatible behavior change that could leave production deployments in a broken state:
- The database schema may be incomplete (missing columns, tables, or indexes that the application code expects)
- The proxy starts up and serves traffic, but runtime errors then surface for requests that hit the affected schema objects
- Automated deployments that relied on the old fail-fast behavior to catch migration problems now silently proceed
Operators who aren't aware of this flag change won't know to add --require_db_migration to their startup command. Consider keeping the old fail-fast behavior as the default and instead adding a --ignore_db_migration_failure flag for operators who explicitly want the lenient behavior.
Rule Used: What: avoid backwards-incompatible changes without... (source)
…UMN and ADD CONSTRAINT Three fixes: 1. _resolve_specific_migration in the P3018 handler was not wrapped in try/except, so failures there would bypass the return True and propagate unexpectedly — partially defeating the rollback fix. 2. Bare RENAME COLUMN in 20260303000000_update_tool_table_policies was non-idempotent. Wrapped in DO $$ IF EXISTS block. Also wrapped all 28 bare ADD CONSTRAINT statements across 9 migration files in DO $$ IF NOT EXISTS (pg_constraint) blocks. 3. Added test_rename_column_is_guarded and test_add_constraint_is_guarded to TestMigrationSQLIdempotency for full DDL coverage. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
@greptile please re-review |
| migration_name | ||
| ) | ||
| logger.info("✅ Migration resolved.") | ||
| return True |
There was a problem hiding this comment.
return True skips remaining pending migrations
After resolving a single idempotent migration, the code returns True immediately. But _resolve_specific_migration only marks that one migration as applied — it does not apply any other pending migrations. If there are N migrations pending, resolving one and returning True causes the proxy to start up thinking the schema is fully up-to-date, when migrations 2…N have never actually run.
continue would be the correct replacement: it lets the for attempt in range(3) loop execute another prisma migrate deploy, which will now skip the resolved migration and proceed with the remaining ones:
| return True | |
| return True |
should instead be:
logger.info("✅ Migration resolved, retrying migrate deploy for remaining migrations.")
# break inner error-handling and retry the outer migrate deploy loop
break # or continue the outer loopMore concretely, replacing return True with continue (if the outer for attempt in range(3) loop is accessible) or restructuring so the outer loop retries would ensure all pending migrations are applied before the function signals success.
| @@ -2,5 +2,10 @@ | |||
| ALTER TABLE "LiteLLM_TeamMembership" DROP CONSTRAINT "LiteLLM_TeamMembership_budget_id_fkey"; | |||
There was a problem hiding this comment.
Non-idempotent DROP CONSTRAINT left unguarded
This PR wraps ADD CONSTRAINT in DO $$ IF NOT EXISTS blocks for idempotency, but the DROP CONSTRAINT directly above (line 2) was not updated. If this migration partially executes — e.g., the DROP CONSTRAINT succeeds but something fails before the migration is marked applied — then any retry attempt will fail with ERROR: constraint "LiteLLM_TeamMembership_budget_id_fkey" of relation "LiteLLM_TeamMembership" does not exist.
The same issue exists in 20250625145206_cascade_budget_and_loosen_managed_file_json/migration.sql:2.
The fix is to wrap the drop in a DO $$ block with an existence check, mirroring the pattern used for the re-add:
-- DropForeignKey
DO $$
BEGIN
IF EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'LiteLLM_TeamMembership_budget_id_fkey') THEN
ALTER TABLE "LiteLLM_TeamMembership" DROP CONSTRAINT "LiteLLM_TeamMembership_budget_id_fkey";
END IF;
END $$;This matches the same pg_constraint lookup used for the idempotent ADD CONSTRAINT block below it.
| def all_migrations(self): | ||
| migrations = _get_all_migrations() | ||
| assert len(migrations) > 0, ( | ||
| f"No migrations found. " | ||
| f"Check that _MIGRATIONS_DIR ({_MIGRATIONS_DIR}) is correct." | ||
| ) | ||
| return migrations | ||
|
|
||
| def test_add_column_uses_if_not_exists(self, all_migrations): | ||
| """ADD COLUMN statements must use IF NOT EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"ADD\s+COLUMN\s+", line, re.IGNORECASE) and not re.search( | ||
| r"ADD\s+COLUMN\s+IF\s+NOT\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "ADD COLUMN without IF NOT EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_drop_column_uses_if_exists(self, all_migrations): | ||
| """DROP COLUMN statements must use IF EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"DROP\s+COLUMN\s+", line, re.IGNORECASE) and not re.search( | ||
| r"DROP\s+COLUMN\s+IF\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "DROP COLUMN without IF EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_drop_index_uses_if_exists(self, all_migrations): | ||
| """DROP INDEX statements must use IF EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search(r"DROP\s+INDEX\s+", line, re.IGNORECASE) and not re.search( | ||
| r"DROP\s+INDEX\s+IF\s+EXISTS", line, re.IGNORECASE | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "DROP INDEX without IF EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_create_index_uses_if_not_exists(self, all_migrations): | ||
| """CREATE INDEX statements must use IF NOT EXISTS""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| for line_num, line in enumerate(sql.splitlines(), 1): | ||
| if re.search( | ||
| r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+", line, re.IGNORECASE | ||
| ) and not re.search( | ||
| r"CREATE\s+(?:UNIQUE\s+)?INDEX\s+(?:CONCURRENTLY\s+)?IF\s+NOT\s+EXISTS", | ||
| line, | ||
| re.IGNORECASE, | ||
| ): | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "CREATE INDEX without IF NOT EXISTS found in recent migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_rename_column_is_guarded(self, all_migrations): | ||
| """RENAME COLUMN must be inside a DO $$ IF EXISTS block""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| lines = sql.splitlines() | ||
| in_do_block = False | ||
| for line_num, line in enumerate(lines, 1): | ||
| if re.search(r"DO\s+\$\$", line, re.IGNORECASE): | ||
| in_do_block = True | ||
| if re.search(r"END\s+\$\$", line, re.IGNORECASE): | ||
| in_do_block = False | ||
| if re.search(r"RENAME\s+COLUMN\s+", line, re.IGNORECASE) and not in_do_block: | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "RENAME COLUMN without DO $$ IF EXISTS guard found in migrations:\n" | ||
| + "\n".join(violations) | ||
| ) | ||
|
|
||
| def test_add_constraint_is_guarded(self, all_migrations): | ||
| """ADD CONSTRAINT must be inside a DO $$ IF NOT EXISTS block""" | ||
| violations = [] | ||
| for migration_name, sql in all_migrations: | ||
| lines = sql.splitlines() | ||
| in_do_block = False | ||
| for line_num, line in enumerate(lines, 1): | ||
| if re.search(r"DO\s+\$\$", line, re.IGNORECASE): | ||
| in_do_block = True | ||
| if re.search(r"END\s+\$\$", line, re.IGNORECASE): | ||
| in_do_block = False | ||
| if re.search(r"ADD\s+CONSTRAINT\s+", line, re.IGNORECASE) and not in_do_block: | ||
| violations.append(f" {migration_name}:{line_num}: {line.strip()}") | ||
| assert not violations, ( | ||
| "ADD CONSTRAINT without DO $$ IF NOT EXISTS guard found in migrations:\n" |
There was a problem hiding this comment.
No test coverage for DROP CONSTRAINT idempotency
TestMigrationSQLIdempotency now covers ADD COLUMN, DROP COLUMN, DROP INDEX, CREATE INDEX, RENAME COLUMN, and ADD CONSTRAINT, but there is no test for bare DROP CONSTRAINT without an IF EXISTS guard or a DO $$ block wrapper. Two migration files changed in this PR — 20250603210143_cascade_budget_changes and 20250625145206_cascade_budget_and_loosen_managed_file_json — contain unguarded ALTER TABLE ... DROP CONSTRAINT ... statements that a new test method would immediately catch:
def test_drop_constraint_is_guarded(self, all_migrations):
"""DROP CONSTRAINT must use IF EXISTS or be wrapped in a DO $$ block"""
violations = []
for migration_name, sql in all_migrations:
lines = sql.splitlines()
in_do_block = False
for line_num, line in enumerate(lines, 1):
if re.search(r"DO\s+\$\$", line, re.IGNORECASE):
in_do_block = True
if re.search(r"END\s+\$\$", line, re.IGNORECASE):
in_do_block = False
if (
re.search(r"DROP\s+CONSTRAINT\s+", line, re.IGNORECASE)
and not re.search(r"DROP\s+CONSTRAINT\s+IF\s+EXISTS", line, re.IGNORECASE)
and not in_do_block
):
violations.append(f" {migration_name}:{line_num}: {line.strip()}")
assert not violations, (
"DROP CONSTRAINT without IF EXISTS or DO $$ guard found in migrations:\n"
+ "\n".join(violations)
)Three fixes: 1. Both P3009 and P3018 idempotent handlers returned True after resolving a single migration, exiting before remaining pending migrations were applied. Now they continue the retry loop so prisma migrate deploy runs again for any remaining migrations. 2. Two migration files had bare DROP CONSTRAINT without a DO $$ IF EXISTS guard, which fails if the constraint was already dropped. Wrapped both in idempotent DO $$ blocks. 3. Added test_drop_constraint_is_guarded to catch unguarded DROP CONSTRAINT in future migrations. Co-Authored-By: Claude Opus 4.6 <[email protected]>
…default Four fixes: 1. P3009 idempotent handler now has the same try/except around _roll_back_migration and _resolve_specific_migration as the P3018 handler. Previously a rollback or resolve failure in the P3009 path would propagate and leave the migration unresolved. 2. Added IF NOT EXISTS to all 57 bare CREATE TABLE statements across 34 migration files. Added test_create_table_uses_if_not_exists to catch this pattern. 3. Reverted the backwards-incompatible default behavior change: the proxy now fails fast on migration failure (original behavior). Added --skip_db_migration_check / SKIP_DB_MIGRATION_CHECK to opt into warn-and-continue instead. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary
Fixes failed migrations when running
prisma migrate deployon pre-existing litellm instances. The migration recovery logic had multiple bugs that caused it to fail silently even after successfully resolving idempotent errors.Root Causes
True, causing the setup to report failure even when the final recovery attempt succeeded_roll_back_migrationthrew exceptions that escaped the P3018 handler, preventing_resolve_specific_migrationfrom runningADD COLUMN/DROP COLUMNwithoutIF [NOT] EXISTS, failing immediately on pre-existing instances where the schema was already modifiedChanges
return Trueafter successful P3018 idempotent error recoveryIF NOT EXISTS/IF EXISTSclauses🧹 Bug Fix