Page MenuHomePhabricator

migrateLinksTable migration query fails for Postgres
Closed, ResolvedPublicBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Updating from Mediawiki LTS to latest stable using PostgreSQL 18.1 on Ubuntu LTS server
  • run the update script (CLI or web produces the same result)

What happens?:
Task fails with the following error message:

PHP Warning:  Undefined array key 14 in /var/www/html/mediawiki/maintenance/migrateLinksTable.php on line 123
Wikimedia\Rdbms\DBQueryError from line 1225 of /var/www/html/mediawiki/includes/libs/rdbms/database/Database.php: Error 42601: ERROR:  syntax error at or near ")"
LINE 1: ...E (cl_target_id = 0 OR cl_target_id IS NULL) AND () AND cl_t...
                                                             ^

Function: MigrateLinksTable::handlePageBatch
Query: UPDATE  "categorylinks" SET cl_target_id = 132 WHERE (cl_target_id = 0 OR cl_target_id IS NULL) AND () AND cl_to = 'Quality' AND (cl_from >= 0) AND (cl_from <= 999)

#0 /var/www/html/mediawiki/includes/libs/rdbms/database/Database.php(1209): Wikimedia\Rdbms\Database->getQueryException()
#1 /var/www/html/mediawiki/includes/libs/rdbms/database/Database.php(1183): Wikimedia\Rdbms\Database->getQueryExceptionAndLog()
#2 /var/www/html/mediawiki/includes/libs/rdbms/database/Database.php(640): Wikimedia\Rdbms\Database->reportQueryError()
#3 /var/www/html/mediawiki/includes/libs/rdbms/database/Database.php(1526): Wikimedia\Rdbms\Database->query()
#4 /var/www/html/mediawiki/includes/libs/rdbms/database/DBConnRef.php(127): Wikimedia\Rdbms\Database->update()
#5 /var/www/html/mediawiki/includes/libs/rdbms/database/DBConnRef.php(462): Wikimedia\Rdbms\DBConnRef->__call()
#6 /var/www/html/mediawiki/includes/libs/rdbms/querybuilder/UpdateQueryBuilder.php(332): Wikimedia\Rdbms\DBConnRef->update()
#7 /var/www/html/mediawiki/maintenance/migrateLinksTable.php(137): Wikimedia\Rdbms\UpdateQueryBuilder->execute()
#8 /var/www/html/mediawiki/maintenance/migrateLinksTable.php(90): MigrateLinksTable->handlePageBatch()
#9 /var/www/html/mediawiki/maintenance/includes/LoggedUpdateMaintenance.php(38): MigrateLinksTable->doDBUpdates()
#10 /var/www/html/mediawiki/includes/installer/DatabaseUpdater.php(1424): MediaWiki\Maintenance\LoggedUpdateMaintenance->execute()
#11 /var/www/html/mediawiki/includes/installer/DatabaseUpdater.php(531): MediaWiki\Installer\DatabaseUpdater->migrateCategorylinks()
#12 /var/www/html/mediawiki/includes/installer/DatabaseUpdater.php(483): MediaWiki\Installer\DatabaseUpdater->runUpdates()
#13 /var/www/html/mediawiki/maintenance/update.php(184): MediaWiki\Installer\DatabaseUpdater->doUpdates()
#14 /var/www/html/mediawiki/maintenance/includes/MaintenanceRunner.php(696): UpdateMediaWiki->execute()
#15 /var/www/html/mediawiki/maintenance/run.php(53): MediaWiki\Maintenance\MaintenanceRunner->run()
#16 {main}

What should have happened instead?:
Update completed successfully

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
Mediawiki 1.45.1
PHP 8.3.6 (apache2handler)
ICU 74.2
PostgreSQL 18.1 (Ubuntu 18.1-1.pgdg24.04+2)

Other information (browser name/version, screenshots, etc.):
If I comment out the following section of the newUpdateQueryBuilder in the handlePageBatch function the task will complete and the rest of the update function will process, but then all of the Category pages are broken.

$mapping[$table]['ns'] => $ns,

Event Timeline

In trying to get it to work, I have tried the following:

  1. Updating the cl_target_id to match the linktarget_lt_id using SQL
  2. Running the update job again.

Update code:

WITH lt AS (
	SELECT
		lt_id
		,lt_namespace
		,REPLACE(UPPER(lt_title),'_',' ') AS lt_title
	FROM kb.linktarget
)
UPDATE kb.categorylinks
SET cl_target_id =
	(
		SELECT lt_id
		FROM lt
		WHERE kb.categorylinks.cl_sortkey = lt.lt_title
		LIMIT 1
	);

Now it fails due to Unique Constraint issues where cl_from, cl_target_id are duplicated. I think there was an oversight handling duplicate rows.

Below is my truncated, sorted query output:

"cl_from""cl_to""cl_sortkey""cl_sortkey_prefix""cl_timestamp""cl_collation""cl_type""cl_collation_id""cl_target_id"
...........................
27"Metallurgical_Lab""QUALITY:COLD MOUNTING METALLOGRAPHIC SAMPLES""2026-01-09 20:50:22+00""uppercase""page"137
27"Quality""QUALITY:COLD MOUNTING METALLOGRAPHIC SAMPLES""2026-01-09 20:50:22+00""uppercase""page"137
27"Sample_Preparation""QUALITY:COLD MOUNTING METALLOGRAPHIC SAMPLES""2026-01-09 20:50:22+00""uppercase""page"137
27"Work_Instruction""COLD MOUNTING METALLOGRAPHIC SAMPLES""2026-01-10 17:11:37+00""uppercase""page"159
...........................

To get the primary key to work, I did the following:

-- Schema used is 'kb' instead of default 'public'

ALTER TABLE kb.categorylinks
ADD filter_row serial;

DELETE FROM kb.categorylinks a
WHERE filter_row <> (SELECT filter_row FROM kb.categorylinks WHERE CONCAT(cl_from,cl_target_id) = CONCAT(a.cl_from,a.cl_target_id) LIMIT 1)

ALTER TABLE kb.categorylinks
DROP COLUMN filter_row;

After doing this and then commenting out the PK drop constraint (as it was already dropped) the update task went through successfully.

Change #1227390 had a related patch set uploaded (by Akaza24; author: Akaza24):

[mediawiki/core@master] migrateLinksTable: Handle constant namespace values in mapping

https://gerrit.wikimedia.org/r/1227390

Change #1227876 had a related patch set uploaded (by Shivaansh Singh; author: Shivaansh Singh):

[mediawiki/core@master] maintenance/migrateLinksTable: Fix empty AND () clause in PostgreSQL

https://gerrit.wikimedia.org/r/1227876

I have uploaded a patch regarding this task, would love to work on it!

Change #1227390 merged by jenkins-bot:

[mediawiki/core@master] migrateLinksTable: Handle constant namespace values in mapping

https://gerrit.wikimedia.org/r/1227390

Change #1236672 had a related patch set uploaded (by TheDJ; author: Akaza24):

[mediawiki/core@REL1_45] migrateLinksTable: Handle constant namespace values in mapping

https://gerrit.wikimedia.org/r/1236672

Change #1236672 merged by jenkins-bot:

[mediawiki/core@REL1_45] migrateLinksTable: Handle constant namespace values in mapping

https://gerrit.wikimedia.org/r/1236672

Change #1227876 abandoned by Zabe:

[mediawiki/core@master] maintenance/migrateLinksTable: Fix empty AND () clause in PostgreSQL

Reason:

bug has been fixed

https://gerrit.wikimedia.org/r/1227876

Zabe assigned this task to Akaza24.
TheDJ renamed this task from migrateLinksTable.php handlePageBatch Query Fails to migrateLinksTable migration query fails for Postgres.Thu, Feb 5, 9:18 AM

@Met4lGod76 @ShivaanshSingh and @Akaza24 thank you all for reporting back and working on this problem. As you might have noticed, we ended up with multiple reports and 3!! patch submissions for this, and I picked the one that I though was most ready to merge. The next patch release of 1.45 should fix this problem for all Postgres users.

If you have other Postgres problems, please report them, we unfortunately don't have a lot of active contributors using postgres, but it seems postgres is becoming more and more popular, so contributions are very welcome.