PostgreSQL: commitNewEntries() fails due to conflict on duplicate keys while inserting #1610
Description
Hello,
I'm using FreshRSS 1.7.0 (zip release download) with PostgreSQL 9.6.4-0+deb9u1 on Debian Stretch.
It appears that, after a while, FreshRSS no longer imports any articles from RSS feeds.
I've found the issue: there's an SQL error in the PostgreSQL logs (translated here from French):
2017-08-15 06:36:30 CEST [23983-600] freshrss@freshrss ERROR: duplicate key value violates unique constraint "adrien_entry_id_feed_guid_key"
2017-08-15 06:36:30 CEST [23983-601] freshrss@freshrss DETAIL: Key (id_feed, guid)=(9, http://www.minimachines.net/?p=53460) already exists.
2017-08-15 06:36:30 CEST [23983-602] freshrss@freshrss CONTEXT: SQL statement "INSERT INTO "adrien_entry" (id, guid, title, author, content, link, date, "lastSeen", hash, is_read, is_favorite, id_feed, tags)
(SELECT rank + row_number() OVER(ORDER BY date) AS id, guid, title, author, content, link, date, "lastSeen", hash, is_read, is_favorite, id_feed, tags FROM "adrien_entrytmp" ORDER BY date
)"
PL/pgSQL function inline_code_block line 6 at SQL statement
2017-08-15 06:36:30 CEST [23983-603] freshrss@freshrss STATEMENT: DO $$
DECLARE
maxrank bigint := (SELECT MAX(id) FROM "adrien_entrytmp");
rank bigint := (SELECT maxrank - COUNT(*) FROM "adrien_entrytmp");
BEGIN
INSERT INTO "adrien_entry" (id, guid, title, author, content, link, date, "lastSeen", hash, is_read, is_favorite, id_feed, tags)
(SELECT rank + row_number() OVER(ORDER BY date) AS id, guid, title, author, content, link, date, "lastSeen", hash, is_read, is_favorite, id_feed, tags FROM "adrien_entrytmp" ORDER BY date
);
DELETE FROM "adrien_entrytmp" WHERE id <= maxrank;
END $$;
2017-08-15 06:36:30 CEST [23983-604] freshrss@freshrss ERROR: current transaction is aborted, commands ignored until end of transaction block
To summarize the log output: the first line reports a duplicate-key violation of the unique constraint on (id_feed, guid). The next line shows that an entry with the key values (9, http://www.minimachines.net/?p=53460) already exists.
The following lines give the exact SQL statement that failed and report that the whole transaction was aborted.
So, I've checked the files EntryDAO.php and EntryDAOPGSQL.php to compare the MySQL and PostgreSQL statements.
The MySQL one is:
public function commitNewEntries() {
	$sql = 'SET @rank=(SELECT MAX(id) - COUNT(*) FROM `' . $this->prefix . 'entrytmp`); ' .	//MySQL-specific
		'INSERT IGNORE INTO `' . $this->prefix . 'entry` (id, guid, title, author, content_bin, link, date, `lastSeen`, hash, is_read, is_favorite, id_feed, tags) ' .
		'SELECT @rank:=@rank+1 AS id, guid, title, author, content_bin, link, date, `lastSeen`, hash, is_read, is_favorite, id_feed, tags FROM `' . $this->prefix . 'entrytmp` ORDER BY date; ' .
		'DELETE FROM `' . $this->prefix . 'entrytmp` WHERE id <= @rank;';
	$hadTransaction = $this->bd->inTransaction();
	if (!$hadTransaction) {
		$this->bd->beginTransaction();
	}
	$result = $this->bd->exec($sql) !== false;
	if (!$hadTransaction) {
		$this->bd->commit();
	}
	return $result;
}

The PostgreSQL one is:
public function commitNewEntries() {
	$sql = 'DO $$
DECLARE
	maxrank bigint := (SELECT MAX(id) FROM `' . $this->prefix . 'entrytmp`);
	rank bigint := (SELECT maxrank - COUNT(*) FROM `' . $this->prefix . 'entrytmp`);
BEGIN
	INSERT INTO `' . $this->prefix . 'entry` (id, guid, title, author, content, link, date, `lastSeen`, hash, is_read, is_favorite, id_feed, tags)
	(SELECT rank + row_number() OVER(ORDER BY date) AS id, guid, title, author, content, link, date, `lastSeen`, hash, is_read, is_favorite, id_feed, tags FROM `' . $this->prefix . 'entrytmp` ORDER BY date);
	DELETE FROM `' . $this->prefix . 'entrytmp` WHERE id <= maxrank;
END $$;';
	$hadTransaction = $this->bd->inTransaction();
	if (!$hadTransaction) {
		$this->bd->beginTransaction();
	}
	$result = $this->bd->exec($sql) !== false;
	if (!$hadTransaction) {
		$this->bd->commit();
	}
	return $result;
}

As you can see, the original MySQL code uses INSERT IGNORE, while the PostgreSQL version uses a plain INSERT and therefore fails on duplicates.
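To make the difference concrete, here is a minimal self-contained sketch. It uses Python's sqlite3 purely for illustration (no server needed); SQLite's INSERT OR IGNORE plays the role of MySQL's INSERT IGNORE, and the table and values are invented for the demo, not FreshRSS's real schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE entry (id_feed INTEGER, guid TEXT, UNIQUE (id_feed, guid))")
con.execute("INSERT INTO entry VALUES (9, 'http://www.minimachines.net/?p=53460')")

# Plain INSERT (what the PostgreSQL DAO does): the duplicate violates
# the unique constraint and the statement fails.
try:
    con.execute("INSERT INTO entry VALUES (9, 'http://www.minimachines.net/?p=53460')")
except sqlite3.IntegrityError as e:
    print("plain INSERT failed:", e)

# INSERT OR IGNORE (SQLite's analogue of MySQL's INSERT IGNORE):
# the duplicate row is silently skipped, no error is raised.
con.execute("INSERT OR IGNORE INTO entry VALUES (9, 'http://www.minimachines.net/?p=53460')")
print("rows:", con.execute("SELECT COUNT(*) FROM entry").fetchone()[0])  # rows: 1
```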
To fix this, we have three possible solutions:
- Add a statement inside the transaction to first DELETE duplicates between the _entry and _entrytmp tables
- Modify the SELECT statement to exclude duplicates from the result set (i.e. modify the WHERE part)
- Use the new PostgreSQL ON CONFLICT DO NOTHING feature (>= 9.5 according to the online doc)
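The idea behind the third option can be sketched without a PostgreSQL server: SQLite (>= 3.24) borrows the same ON CONFLICT DO NOTHING syntax, so this Python sqlite3 demo mirrors the behavior with an invented minimal schema standing in for adrien_entry (the second guid is also made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE entry (id_feed INTEGER, guid TEXT, UNIQUE (id_feed, guid))")
con.execute("INSERT INTO entry VALUES (9, 'http://www.minimachines.net/?p=53460')")

# Same kind of INSERT that previously aborted the whole transaction,
# now with a conflict clause: the duplicate row is skipped and the
# genuinely new row still goes in (requires SQLite >= 3.24; the
# syntax matches PostgreSQL >= 9.5).
con.execute("""
    INSERT INTO entry (id_feed, guid)
    VALUES (9, 'http://www.minimachines.net/?p=53460'),
           (9, 'http://www.minimachines.net/?p=99999')
    ON CONFLICT (id_feed, guid) DO NOTHING
""")
print(con.execute("SELECT COUNT(*) FROM entry").fetchone()[0])  # 2 (duplicate skipped)
```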
I've tried the third solution on my server and it worked well on the next sync: all unread articles appeared and adrien_entrytmp was properly cleared.
I'm opening two pull requests for the first and third solutions, which are the simplest. If I have time, I'll do one for the second solution.