-
Notifications
You must be signed in to change notification settings - Fork 138
Description
When using partial indexes with WHERE clauses in PostgreSQL, Ridgepole detects false differences on every run, causing unnecessary DROP INDEX and CREATE INDEX operations.
This happens because PostgreSQL normalizes WHERE clauses (converts to uppercase, adds parentheses), but Ridgepole compares against the original definition.
Steps to Reproduce
- Define a partial index:
add_index :users, :email, unique: true, where: "email is not null"- Apply the schema:
$ ridgepole --apply- Run again (dry-run or apply):
$ ridgepole --apply --dry-run
Expected Behavior
No differences should be detected after the initial migration.
Actual Behavior
Ridgepole repeatedly detects differences and attempts to recreate the same index:
remove_index "users", name: "idx_users_email"
add_index "users", ["email"], name: "idx_users_email", unique: true, where: "email is not null"
Root Cause
PostgreSQL normalizes WHERE clauses:
- email is not null → (email IS NOT NULL)
- active → active (boolean conditions unchanged)
- Complex conditions get additional parentheses
Test Case
The test demonstrates:
- ❌ where: "email IS NOT NULL" - causes repeated migrations
- ❌ where: "(email is not null)" - causes repeated migrations
- ✅ where: "(email IS NOT NULL)" - works correctly
- ✅ where: "active" - boolean conditions work correctly
Workaround
Use PostgreSQL's normalized format in your schema definitions:
add_index :users, :email, unique: true, where: "(email IS NOT NULL)"
To find the correct format, check your database after applying:
psql> SELECT indexname, indexdef FROM pg_indexes WHERE tablename = '<table name>'Attempted Fix
I tried to implement normalization in Ridgepole::Diff#normalize_index_options! to match PostgreSQL's format, but it became clear that complete SQL normalization is complex:
- Need to handle string literals that shouldn't be uppercased
- Need to parse nested parentheses correctly
- Need to handle various SQL functions and operators
- Different PostgreSQL versions might have different normalization behavior
Proposed Solution
Due to the complexity of SQL normalization (handling strings, quoted identifiers, functions, etc.), implementing full normalization in Ridgepole would be error-prone. Instead, we should:
- Document this behavior in README
- Provide clear workaround instructions
- Recommend using --dry-run after --apply to verify no unexpected differences
more detail normalize behavior
CREATE TABLE test_users (
id SERIAL PRIMARY KEY,
email TEXT,
name TEXT,
active BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP
);
CREATE INDEX idx_email_no_paren ON test_users (email) WHERE email is not null;
CREATE INDEX idx_email_with_paren ON test_users (email) WHERE (email is not null);
CREATE INDEX idx_email_upper_no_paren ON test_users (email) WHERE email IS NOT NULL;
CREATE INDEX idx_email_upper_with_paren ON test_users (email) WHERE (email IS NOT NULL);
CREATE INDEX idx_active_bool ON test_users (email) WHERE active;
CREATE INDEX idx_complex_or ON test_users (email) WHERE (email IS NOT NULL) OR active;
CREATE INDEX idx_complex_and ON test_users (email) WHERE (email IS NOT NULL) AND active = true;
SELECT indexname,
regexp_replace(indexdef, '.*WHERE ', '') AS where_clause
FROM pg_indexes
WHERE tablename = 'test_users'
AND indexdef LIKE '%WHERE%'Result:
| indexname | where_clause |
|---|---|
| idx_email_no_paren | (email IS NOT NULL) |
| idx_email_with_paren | (email IS NOT NULL) |
| idx_email_upper_no_paren | (email IS NOT NULL) |
| idx_email_upper_with_paren | (email IS NOT NULL) |
| idx_active_bool | active |
| idx_complex_or | ((email IS NOT NULL) OR active) |
| idx_complex_and | ((email IS NOT NULL) AND (active = true)) |