Skip to content

Partial index WHERE clause causes repeated migrations due to PostgreSQL normalization #568

@ken39arg

Description

@ken39arg

When using partial indexes with WHERE clauses in PostgreSQL, Ridgepole detects false differences on every run, causing unnecessary DROP INDEX and CREATE INDEX operations.
This happens because PostgreSQL normalizes WHERE clauses (converts to uppercase, adds parentheses), but Ridgepole compares against the original definition.

Steps to Reproduce

  1. Define a partial index:
add_index :users, :email, unique: true, where: "email is not null"
  1. Apply the schema:
$ ridgepole --apply
  1. Run again (dry-run or apply):
$ ridgepole --apply --dry-run

Expected Behavior

No differences should be detected after the initial migration.

Actual Behavior

Ridgepole repeatedly detects differences and attempts to recreate the same index:

remove_index "users", name: "idx_users_email"
add_index "users", ["email"], name: "idx_users_email", unique: true, where: "email is not null"

Root Cause

PostgreSQL normalizes WHERE clauses:

  • email is not null → (email IS NOT NULL)
  • active → active (boolean conditions unchanged)
  • Complex conditions get additional parentheses

Test Case

https://gist.github.com/ken39arg/66c8594af76726afb03b84c46351d944

The test demonstrates:

  • ❌ where: "email IS NOT NULL" - causes repeated migrations
  • ❌ where: "(email is not null)" - causes repeated migrations
  • ✅ where: "(email IS NOT NULL)" - works correctly
  • ✅ where: "active" - boolean conditions work correctly

Workaround

Use PostgreSQL's normalized format in your schema definitions:

add_index :users, :email, unique: true, where: "(email IS NOT NULL)"

To find the correct format, check your database after applying:

psql> SELECT indexname, indexdef FROM pg_indexes WHERE tablename = '<table name>'

Attempted Fix

I tried to implement normalization in Ridgepole::Diff#normalize_index_options! to match PostgreSQL's format, but it became clear that complete SQL normalization is complex:

  • Need to handle string literals that shouldn't be uppercased
  • Need to parse nested parentheses correctly
  • Need to handle various SQL functions and operators
  • Different PostgreSQL versions might have different normalization behavior

Proposed Solution

Due to the complexity of SQL normalization (handling strings, quoted identifiers, functions, etc.), implementing full normalization in Ridgepole would be error-prone. Instead, we should:

  1. Document this behavior in README
  2. Provide clear workaround instructions
  3. Recommend using --dry-run after --apply to verify no unexpected differences

more detail normalize behavior

CREATE TABLE test_users (
    id SERIAL PRIMARY KEY,
    email TEXT,
    name TEXT,
    active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP
);

CREATE INDEX idx_email_no_paren ON test_users (email) WHERE email is not null;
CREATE INDEX idx_email_with_paren ON test_users (email) WHERE (email is not null);
CREATE INDEX idx_email_upper_no_paren ON test_users (email) WHERE email IS NOT NULL;
CREATE INDEX idx_email_upper_with_paren ON test_users (email) WHERE (email IS NOT NULL);
CREATE INDEX idx_active_bool ON test_users (email) WHERE active;
CREATE INDEX idx_complex_or ON test_users (email) WHERE (email IS NOT NULL) OR active;
CREATE INDEX idx_complex_and ON test_users (email) WHERE (email IS NOT NULL) AND active = true;

SELECT indexname,
       regexp_replace(indexdef, '.*WHERE ', '') AS where_clause
FROM pg_indexes
WHERE tablename = 'test_users'
AND indexdef LIKE '%WHERE%'

Result:

indexname where_clause
idx_email_no_paren (email IS NOT NULL)
idx_email_with_paren (email IS NOT NULL)
idx_email_upper_no_paren (email IS NOT NULL)
idx_email_upper_with_paren (email IS NOT NULL)
idx_active_bool active
idx_complex_or ((email IS NOT NULL) OR active)
idx_complex_and ((email IS NOT NULL) AND (active = true))

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions