
SQL: Same updateCacheUnreads for all DBs #5648

Merged
Alkarex merged 5 commits into FreshRSS:edge from Alkarex:updateCacheUnreads
Sep 12, 2023

Conversation

Member

@Alkarex Alkarex commented Sep 10, 2023

Use the same SQL update request for MySQL / MariaDB as the one we already used for PostgreSQL / SQLite (i.e. using a sub-query).

Testing on a DB of 688MB with 270k entries, 199 feeds, 19 categories, using MySQL 8.1.0 and MariaDB 11.1.

The new SQL update using a sub-query took on average 0.02 s, while the old SQL update using a join took on average 0.05 s. The SQL cache was properly invalidated between each run. The new SQL request is thus about twice as fast on that test. EXPLAIN also shows an intrinsic advantage, and it might lead to fewer locks.

Another advantage of the SQL update using a sub-query is that it works identically in PostgreSQL, SQLite, MariaDB, and MySQL, so we no longer need different versions.

Contributes to #5008 (comment)

Edit: But with a USE INDEX hint for MySQL/MariaDB.

@Alkarex Alkarex added this to the 1.22.0 milestone Sep 10, 2023
Member Author

Alkarex commented Sep 10, 2023

For the new SQL request, using MySQL 8.1.0:

EXPLAIN UPDATE freshrss_feed
SET `cache_nbUnreads`=(
	SELECT COUNT(*) AS nbUnreads FROM freshrss_entry e
	WHERE e.id_feed=freshrss_feed.id AND e.is_read=0);
+----+--------------------+---------------+------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+----------+-------------+
| id | select_type        | table         | partitions | type  | possible_keys                         | key                   | key_len | ref                                | rows | filtered | Extra       |
+----+--------------------+---------------+------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+----------+-------------+
|  1 | UPDATE             | freshrss_feed | NULL       | index | NULL                                  | PRIMARY               | 4       | NULL                               |  199 |   100.00 | NULL        |
|  2 | DEPENDENT SUBQUERY | e             | NULL       | ref   | id_feed,is_read,entry_feed_read_index | entry_feed_read_index | 6       | freshrss_db.freshrss_feed.id,const |  720 |   100.00 | Using index |
+----+--------------------+---------------+------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+----------+-------------+
2 rows in set, 2 warnings (0.01 sec)

Member Author

Alkarex commented Sep 10, 2023

This was for the old SQL request:

EXPLAIN UPDATE `freshrss_feed` f LEFT OUTER JOIN (
    SELECT e.id_feed, COUNT(*) AS nbUnreads
    FROM `freshrss_entry` e
    WHERE e.is_read = 0
    GROUP BY e.id_feed
  ) x ON x.id_feed = f.id
SET f.`cache_nbUnreads` = COALESCE(x.nbUnreads, 0);
+----+-------------+------------+------------+------+---------------------------------------+-------------+---------+------------------+-------+----------+-----------------+
| id | select_type | table      | partitions | type | possible_keys                         | key         | key_len | ref              | rows  | filtered | Extra           |
+----+-------------+------------+------------+------+---------------------------------------+-------------+---------+------------------+-------+----------+-----------------+
|  1 | UPDATE      | f          | NULL       | ALL  | NULL                                  | NULL        | NULL    | NULL             |   199 |   100.00 | NULL            |
|  1 | PRIMARY     | <derived2> | NULL       | ref  | <auto_key0>                           | <auto_key0> | 5       | freshrss_db.f.id |    87 |   100.00 | NULL            |
|  2 | DERIVED     | e          | NULL       | ref  | id_feed,is_read,entry_feed_read_index | is_read     | 1       | const            | 17336 |   100.00 | Using temporary |
+----+-------------+------------+------------+------+---------------------------------------+-------------+---------+------------------+-------+----------+-----------------+
3 rows in set, 1 warning (0.00 sec)

Member Author

Alkarex commented Sep 10, 2023

Ping @Emporea


Emporea commented Sep 10, 2023

EXPLAIN UPDATE freshrss_empo_feed SET `cache_nbUnreads`=( SELECT COUNT(*) AS nbUnreads FROM freshrss_empo_entry e WHERE e.id_feed=freshrss_empo_feed.id AND e.is_read=0);
+------+--------------------+--------------------+-------+---------------------------------------+---------+---------+--------------------------------+------+-------------+
| id   | select_type        | table              | type  | possible_keys                         | key     | key_len | ref                            | rows | Extra       |
+------+--------------------+--------------------+-------+---------------------------------------+---------+---------+--------------------------------+------+-------------+
|    1 | PRIMARY            | freshrss_empo_feed | index | NULL                                  | PRIMARY | 4       | NULL                           | 36   |             |
|    2 | DEPENDENT SUBQUERY | e                  | ref   | id_feed,is_read,entry_feed_read_index | id_feed | 5       | freshrss.freshrss_empo_feed.id | 270  | Using where |
+------+--------------------+--------------------+-------+---------------------------------------+---------+---------+--------------------------------+------+-------------+
2 rows in set (0.010 sec)
EXPLAIN UPDATE `freshrss_empo_feed` f LEFT OUTER JOIN (     SELECT e.id_feed, COUNT(*) AS nbUnreads     FROM `freshrss_empo_entry` e     WHERE e.is_read = 0     GROUP BY e.id_feed   ) x ON x.id_feed = f.id SET f.`cache_nbUnreads` = COALESCE(x.nbUnreads, 0);
+------+-------------+------------+------+---------------------------------------+---------+---------+---------------+-------+----------------------------------------------+
| id   | select_type | table      | type | possible_keys                         | key     | key_len | ref           | rows  | Extra                                        |
+------+-------------+------------+------+---------------------------------------+---------+---------+---------------+-------+----------------------------------------------+
|    1 | PRIMARY     | f          | ALL  | NULL                                  | NULL    | NULL    | NULL          | 36    |                                              |
|    1 | PRIMARY     | <derived2> | ref  | key0                                  | key0    | 5       | freshrss.f.id | 258   |                                              |
|    2 | DERIVED     | e          | ref  | id_feed,is_read,entry_feed_read_index | is_read | 1       | const         | 25878 | Using where; Using temporary; Using filesort |
+------+-------------+------------+------+---------------------------------------+---------+---------+---------------+-------+----------------------------------------------+
3 rows in set (0.001 sec)


Emporea commented Sep 10, 2023

Could you explain how to "break the cache", and why?
And how can I provide more info about my configuration?
I am using the Docker mariadb 10.6 image with these options enabled:

  • MARIADB_AUTO_UPGRADE=1
  • MARIADB_DISABLE_UPGRADE_BACKUP=1

Member Author

Alkarex commented Sep 10, 2023

I have just tried with MariaDB 11.1, with results similar to my tests with MySQL. Both with Docker.

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 17
Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204 mariadb.org binary distribution

MariaDB [freshrss_db]> UPDATE freshrss_feed SET `cache_nbUnreads`=( SELECT COUNT(*) AS nbUnreads FROM freshrss_entry e WHERE e.id_feed=freshrss_feed.id AND e.is_read=0);
Query OK, 0 rows affected (0.011 sec)
Rows matched: 199  Changed: 0  Warnings: 0

MariaDB [freshrss_db]> UPDATE `freshrss_feed` f LEFT OUTER JOIN (
    ->     SELECT e.id_feed, COUNT(*) AS nbUnreads
    ->     FROM `freshrss_entry` e
    ->     WHERE e.is_read = 0
    ->     GROUP BY e.id_feed
    ->   ) x ON x.id_feed = f.id
    -> SET f.`cache_nbUnreads` = COALESCE(x.nbUnreads, 0);
Query OK, 0 rows affected (0.057 sec)
Rows matched: 199  Changed: 0  Warnings: 0

MariaDB [freshrss_db]> EXPLAIN UPDATE `freshrss_feed` f LEFT OUTER JOIN (
    ->     SELECT e.id_feed, COUNT(*) AS nbUnreads
    ->     FROM `freshrss_entry` e
    ->     WHERE e.is_read = 0
    ->     GROUP BY e.id_feed
    ->   ) x ON x.id_feed = f.id
    -> SET f.`cache_nbUnreads` = COALESCE(x.nbUnreads, 0);
+------+-------------+------------+------+---------------------------------------+---------+---------+------------------+-------+----------------------------------------------+
| id   | select_type | table      | type | possible_keys                         | key     | key_len | ref              | rows  | Extra                                        |
+------+-------------+------------+------+---------------------------------------+---------+---------+------------------+-------+----------------------------------------------+
|    1 | PRIMARY     | f          | ALL  | NULL                                  | NULL    | NULL    | NULL             | 199   |                                              |
|    1 | PRIMARY     | <derived2> | ref  | key0                                  | key0    | 5       | freshrss_db.f.id | 94    |                                              |
|    2 | DERIVED     | e          | ref  | id_feed,is_read,entry_feed_read_index | is_read | 1       | const            | 18706 | Using where; Using temporary; Using filesort |
+------+-------------+------------+------+---------------------------------------+---------+---------+------------------+-------+----------------------------------------------+
3 rows in set (0.002 sec)

MariaDB [freshrss_db]> EXPLAIN UPDATE freshrss_feed
    -> SET `cache_nbUnreads`=(
    -> SELECT COUNT(*) AS nbUnreads FROM freshrss_entry e
    -> WHERE e.id_feed=freshrss_feed.id AND e.is_read=0);
+------+--------------------+---------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+-------------+
| id   | select_type        | table         | type  | possible_keys                         | key                   | key_len | ref                                | rows | Extra       |
+------+--------------------+---------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+-------------+
|    1 | PRIMARY            | freshrss_feed | index | NULL                                  | PRIMARY               | 4       | NULL                               | 199  |             |
|    2 | DEPENDENT SUBQUERY | e             | ref   | id_feed,is_read,entry_feed_read_index | entry_feed_read_index | 6       | freshrss_db.freshrss_feed.id,const | 879  | Using index |
+------+--------------------+---------------+-------+---------------------------------------+-----------------------+---------+------------------------------------+------+-------------+
2 rows in set (0.001 sec)

Member Author

Alkarex commented Sep 10, 2023

Could you explain how to "break the cache", and why?

Just manually mark a few articles as read in the FreshRSS Web interface, from a few different feeds, to make sure no sub-query is cached by the DB.

From your EXPLAIN commands, no index is used, which is probably the main reason why it is so slow, so this is what needs to be investigated.

Member Author

Alkarex commented Sep 10, 2023

@Emporea If your DB is not too private, you could export it with our CLI to an SQLite file and send it to me so I can try to reproduce. If that is not an option, we will try a few more things from remote.

What kind of server are you operating on? In particular: OS, processor architecture, type of disk

Member Author

Alkarex commented Sep 10, 2023

@Emporea Could you try this variant, in which we force the query plan to help MariaDB's optimizer?

UPDATE freshrss_empo_feed
SET `cache_nbUnreads`=(
  SELECT COUNT(*) AS nbUnreads FROM freshrss_empo_entry e
  USE INDEX (entry_feed_read_index)
  WHERE e.id_feed=freshrss_empo_feed.id AND e.is_read=0);
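As a side note, SQLite has no USE INDEX, but its INDEXED BY clause plays a similar role. A minimal sketch with a toy schema (the index name just mirrors entry_feed_read_index), using EXPLAIN QUERY PLAN to confirm the index is used:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE entry (id INTEGER PRIMARY KEY, id_feed INTEGER, is_read INTEGER);
CREATE INDEX entry_feed_read_index ON entry (id_feed, is_read);
""")

# INDEXED BY fails the statement if SQLite cannot use the named index,
# which makes it a stricter cousin of MySQL's USE INDEX hint.
plan = con.execute("""
EXPLAIN QUERY PLAN
SELECT COUNT(*) FROM entry INDEXED BY entry_feed_read_index
WHERE id_feed = 1 AND is_read = 0
""").fetchall()
print(plan[0][3])  # the plan detail column names entry_feed_read_index
```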


Emporea commented Sep 10, 2023

MariaDB [freshrss]> UPDATE freshrss_empo_feed
    -> SET `cache_nbUnreads`=(
    ->   SELECT COUNT(*) AS nbUnreads FROM freshrss_empo_entry e
    ->   USE INDEX (entry_feed_read_index)
    ->   WHERE e.id_feed=freshrss_empo_feed.id AND e.is_read=0);
Query OK, 0 rows affected (0.020 sec)
Rows matched: 36  Changed: 0  Warnings: 0

This is faster.

Do you think this is on the freshrss side?

Or are my other databases also affected, meaning I could have much better performance overall? Only FreshRSS caught my eye, as I have been getting these 504s and locked databases in the logs a lot lately.


Emporea commented Sep 10, 2023

@Emporea If your DB is not too private, you could export it with our CLI to an SQLite file and send it to me so I can try to reproduce. If that is not an option, we will try a few more things from remote.

@Alkarex I am also fine with sharing my database. I don't actually use freshrss to read the news by hand. It's just to collect rss data for a small data science project.

I exported it with export-sqlite-for-user.php as you suggested (I do this regularly anyway, but with mysql-to-sqlite).

How should I send this file to you?

What kind of server are you operating on? In particular: OS, processor architecture, type of disk

Debian12, x86_64 Intel Xeon, NVME SSD.

Member Author

Alkarex commented Sep 10, 2023

Ok, thanks for the tests. That looks much better now :-)

I believe this is a case of MariaDB's optimizer making an unfortunate decision for the query plan.
This is not especially related to FreshRSS (as you could see, this is pure SQL), and might happen for other requests as well. It might also go away when the database size diminishes or increases, or when anything else changes which might impact the optimizer's heuristics.

Since we have a dedicated index, which is supposed to be used all the time, I have hardcoded a USE INDEX hint to help MariaDB's optimizer.

Could you just try the current PR? If it works, then we should be good to go, and no need to share your DB.

P.S. Pure speculation, but amusingly, it might be that the default plan would have run much faster on a slow hard disk than on your SSD, because the cost of scanning the whole DB would have been higher, thus favouring the use of the index :-P


Emporea commented Sep 10, 2023

Could you just try the current PR?

How? Sorry, I am not really a developer ^^.
Should I replace the files directly? Should I build my own Docker image from the latest git dev version?

Member Author

Alkarex commented Sep 11, 2023

Here is an option: Edit your docker-compose on the model of https://github.com/FreshRSS/FreshRSS/blob/edge/Docker/freshrss/docker-compose.yml with the following information:

    build:
      context: https://github.com/Alkarex/FreshRSS.git#updateCacheUnreads
      dockerfile: Docker/Dockerfile-Alpine

And then execute a docker-compose up just like you normally do.

See also https://github.com/FreshRSS/FreshRSS/blob/edge/Docker/README.md#build-custom-docker-image

Let me know if that works for you, otherwise I will propose you something else.


Emporea commented Sep 11, 2023

I hit "mark all as read" with this Docker config. It still gives me a 504, and the container has high CPU usage for some minutes.
But "Optimize database" and "Purge now" work in a matter of seconds and don't produce a 504.

  freshrss_db:
    build: ./mariadb
    container_name: freshrss_db
    restart: unless-stopped
    logging:
      options:
        max-size: 10m
    volumes:
      - freshrss_db:/var/lib/mysql
    environment:
      - TZ=${TIMEZONE}
      - MYSQL_ROOT_PASSWORD=${FRESHRSS_DB_PASSWORD}
      - MYSQL_PASSWORD=${FRESHRSS_DB_PASSWORD}
      - MYSQL_DATABASE=${FRESHRSS_DB_DATABASE}
      - MYSQL_USER=${FRESHRSS_DB_USER}
      - MARIADB_AUTO_UPGRADE=1
      - MARIADB_DISABLE_UPGRADE_BACKUP=1
      - NETWORK_ACCESS=internal

  freshrss:
    #image: freshrss/freshrss:latest
    build:
      context: https://github.com/Alkarex/FreshRSS.git#updateCacheUnreads
      dockerfile: Docker/Dockerfile-Alpine
    container_name: freshrss
    restart: unless-stopped
    logging:
      options:
        max-size: 10m
    volumes:
      - freshrss_data:/var/www/FreshRSS/data
      - freshrss_extensions:/var/www/FreshRSS/extensions
    environment:
      VIRTUAL_HOST: ${FRESHRSS_HOST}
      NETWORK_ACCESS: internal
      LETSENCRYPT_HOST: ${FRESHRSS_HOST}
      FRESHRSS_INSTALL: |-
        --api_enabled
        --base_url "https://${FRESHRSS_HOST}"
        --db-base ${FRESHRSS_DB_DATABASE}
        --db-host ${FRESHRSS_DB_HOST}
        --db-password ${FRESHRSS_DB_PASSWORD}
        --db-type mysql
        --db-user ${FRESHRSS_DB_USER}
        --default_user ${FRESHRSS_USER}
        --language en
      FRESHRSS_USER: |-
        --api_password ${PASSWORD}
        --language en
        --password ${PASSWORD}
        --user ${FRESHRSS_USER}
      TZ: ${TIMEZONE}

Member Author

Alkarex commented Sep 11, 2023

Can you see which SQL request is running when it takes that long?


Emporea commented Sep 11, 2023

| 217 | freshrss | 172.18.0.4:54210 | freshrss | Execute | 1 | Creating sort index |
SELECT e0.id, e0.guid, e0.title, e0.author, UNCOMPRESS(e0.content_bin) AS content, e0.link, e0.date,
  hex(e0.hash) AS hash, e0.is_read, e0.is_favorite, e0.id_feed, e0.tags, e0.attributes
FROM `freshrss_empo_entry` e0
INNER JOIN (SELECT e.id FROM `freshrss_empo_entry` e INNER JOIN `freshrss_empo_feed` f ON e.id_feed = f.id
  WHERE f.priority > 0 AND e.id <= ? ORDER BY e.id DESC LIMIT 20) e2 ON e2.id = e0.id
ORDER BY e0.id DESC | 0.000 |

That's probably not the query we are looking for. I had 10,000 unread before; now it was only about 200, maybe. I didn't get a 504, and probably wasn't fast enough with the "show processlist" command.

Member Author

Alkarex commented Sep 11, 2023

Could you try those three variants?

EXPLAIN UPDATE freshrss_empo_entry e
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000;

EXPLAIN UPDATE freshrss_empo_entry e
INNER JOIN freshrss_empo_feed f ON e.id_feed = f.id
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000
AND f.priority > 10;

EXPLAIN UPDATE freshrss_empo_entry e
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000
AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 10);
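For illustration, the IN-sub-query variant can be exercised on SQLite with made-up data (the table names mirror the real ones; the ids and priorities are toy values):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE feed (id INTEGER PRIMARY KEY, priority INTEGER);
CREATE TABLE entry (id INTEGER PRIMARY KEY, id_feed INTEGER, is_read INTEGER);
INSERT INTO feed VALUES (1, 10), (2, 0);           -- feed 2 has priority 0
INSERT INTO entry VALUES (100, 1, 0), (101, 1, 0), (102, 2, 0), (103, 1, 0);
""")

# Mark-as-read via an IN sub-query: only unread entries up to a cut-off id,
# and only in feeds whose priority qualifies.
con.execute("""
UPDATE entry SET is_read = 1
WHERE is_read <> 1 AND id <= 101
  AND id_feed IN (SELECT f.id FROM feed f WHERE f.priority > 0)
""")

print(con.execute("SELECT id FROM entry WHERE is_read = 1 ORDER BY id").fetchall())
# → [(100,), (101,)]  (102 is in a priority-0 feed, 103 is past the cut-off)
```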


Emporea commented Sep 11, 2023

EXPLAIN UPDATE freshrss_empo_entry e
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000;
+------+-------------+-------+-------+-----------------+---------+---------+------+------+---------------------------+
| id   | select_type | table | type  | possible_keys   | key     | key_len | ref  | rows | Extra                     |
+------+-------------+-------+-------+-----------------+---------+---------+------+------+---------------------------+
|    1 | SIMPLE      | e     | range | PRIMARY,is_read | is_read | 1       | NULL | 2    | Using where; Using buffer |
+------+-------------+-------+-------+-----------------+---------+---------+------+------+---------------------------+
1 row in set (0.001 sec)

EXPLAIN UPDATE freshrss_empo_entry e
INNER JOIN freshrss_empo_feed f ON e.id_feed = f.id
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000
AND f.priority > 10;
+------+-------------+-------+-------+-----------------------------------------------+----------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys                                 | key      | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+-----------------------------------------------+----------+---------+------+------+--------------------------+
|    1 | SIMPLE      | f     | range | PRIMARY,priority                              | priority | 1       | NULL | 1    | Using where; Using index |
|    1 | SIMPLE      | e     | range | PRIMARY,id_feed,is_read,entry_feed_read_index | is_read  | 1       | NULL | 2    | Using where              |
+------+-------------+-------+-------+-----------------------------------------------+----------+---------+------+------+--------------------------+
2 rows in set (0.001 sec)

EXPLAIN UPDATE freshrss_empo_entry e
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694438602000000
AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 10);
+------+--------------------+-------+-------+------------------+----------+---------+------+------+---------------------------+
| id   | select_type        | table | type  | possible_keys    | key      | key_len | ref  | rows | Extra                     |
+------+--------------------+-------+-------+------------------+----------+---------+------+------+---------------------------+
|    1 | PRIMARY            | e     | range | PRIMARY,is_read  | is_read  | 1       | NULL | 2    | Using where; Using buffer |
|    2 | DEPENDENT SUBQUERY | f     | range | PRIMARY,priority | priority | 1       | NULL | 1    | Using where; Using index  |
+------+--------------------+-------+-------+------------------+----------+---------+------+------+---------------------------+
2 rows in set (0.000 sec)

If you need my database just tell me and I will send a link to your profile email.

Member Author

Alkarex commented Sep 11, 2023

If you need my database just tell me and I will send a link to your profile email.

Yes, I think that would be good after all, so I can try to replicate. Evidently, MariaDB has some difficulty picking the right available indexes at the right time.

Member Author

Alkarex commented Sep 11, 2023

I have received the DB copy, but all the SQL queries I am trying are relatively snappy and complete in at most 1.5 s, even with 100k+ unread articles.

Could you try to see how long these three variants take on your side?

They will mark all articles as read, so it is best if you try these just after you are done reading everything. For the test, you can click on the open envelope to show read articles, use the search field with date:P15d/ to search for all articles within the past 15 days (adjust for your typical use-case), and mark all those articles as unread. Repeat for each of the queries. When you are done, click the "mark all as read" button again to sync the Web interface.


UPDATE freshrss_empo_entry e
INNER JOIN freshrss_empo_feed f ON e.id_feed = f.id
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694470099000000
AND f.priority > 0;

UPDATE freshrss_empo_entry e
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694470099000000
AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 0);

UPDATE freshrss_empo_entry e
USE INDEX (is_read)
SET e.is_read = 1
WHERE e.is_read <> 1 AND e.id <= 1694470099000000
AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 0);


Emporea commented Sep 12, 2023

Will do, but could you test with MariaDB 10.6 too? That is the one I am using. If I need to upgrade, I will have a look at it.


Emporea commented Sep 12, 2023

date:P15d/ gave me about 50,000 messages.

Marking all as read or unread via the web ui is now fast and doesn't give me 504's.

MariaDB [freshrss]> UPDATE freshrss_empo_entry e
    -> INNER JOIN freshrss_empo_feed f ON e.id_feed = f.id
    -> SET e.is_read = 1
    -> WHERE e.is_read <> 1 AND e.id <= 1694470099000000
    -> AND f.priority > 0;
Query OK, 52839 rows affected (1 min 27.746 sec)
Rows matched: 52839  Changed: 52839  Warnings: 0
MariaDB [freshrss]> UPDATE freshrss_empo_entry e
    -> SET e.is_read = 1
    -> WHERE e.is_read <> 1 AND e.id <= 1694470099000000
    -> AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 0);
Query OK, 52839 rows affected (3.084 sec)
Rows matched: 52839  Changed: 52839  Warnings: 0
MariaDB [freshrss]> UPDATE freshrss_empo_entry e
    -> USE INDEX (is_read)
    -> SET e.is_read = 1
    -> WHERE e.is_read <> 1 AND e.id <= 1694470099000000
    -> AND id_feed IN (SELECT f.id FROM freshrss_empo_feed f WHERE f.priority > 0);
Query OK, 52839 rows affected (1.122 sec)
Rows matched: 52839  Changed: 52839  Warnings: 0

The first one takes the longest.
I still get 504s when I click on any of the subscription management statistics, such as Main statistics, Idle feeds, or Articles repartition. I guess it should now be easy to also update those queries.

Thanks again for your effort btw.

@Alkarex Alkarex merged commit 0bf33ab into FreshRSS:edge Sep 12, 2023
@Alkarex Alkarex deleted the updateCacheUnreads branch September 12, 2023 11:44
Member Author

Alkarex commented Sep 12, 2023

I have re-tested with MariaDB, MySQL, PostgreSQL, all in latest versions, as well as with the built-in SQLite.

I have not performed an exhaustive and systematic test, but on your dataset some clear trends appear nevertheless.
I have given up on MariaDB above 10k unread articles, as some incredibly slow behaviours appear (requests taking over 1 or even 5 minutes, some of it maybe due to internal behaviours), while PostgreSQL and SQLite answer more consistently in under one second. From those tests, the overall experience seems to be:

🐇 PostgreSQL > SQLite >> MySQL > MariaDB 🐢

Here is some data from MySQL on my computer: good performance with 4k unread articles, bad performance with 10k unread articles.
Note that I gave up for now on testing MariaDB, as the performance changed after restarting the Docker container (OK before restart, very slow afterwards), so there might be something else going on. If anyone is volunteering to perform more systematic performance tests, please reach out.

MySQL
-- ---- 3 days of articles: good performance

mysql> SELECT COUNT(*) FROM freshrss_entry WHERE is_read = 0;
+----------+
| COUNT(*) |
+----------+
|     4211 |
+----------+
1 row in set (0.01 sec)

mysql> SELECT SQL_NO_CACHE e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
| id               | title                                                       | link                                                   |
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
| 1694378467606928 | POL-HAM: Schwarzer Mercedes nach Verkehrsunfall flüchtig    | https://www.presseportal.de/blaulicht/pm/65844/5599605 |
| 1694378467606927 | POL-HSK: Verkehrsunfall mit schwerverletztem Motorradfahrer | https://www.presseportal.de/blaulicht/pm/65847/5599604 |
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id
ORDER BY e0.id DESC;
+----+-------------+------------+------------+--------+---------------------------------------+---------+---------+-----------------------+------+----------+----------------------------------+
| id | select_type | table      | partitions | type   | possible_keys                         | key     | key_len | ref                   | rows | filtered | Extra                            |
+----+-------------+------------+------------+--------+---------------------------------------+---------+---------+-----------------------+------+----------+----------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL                                  | NULL    | NULL    | NULL                  |    2 |   100.00 | Using temporary; Using filesort  |
|  1 | PRIMARY     | e0         | NULL       | eq_ref | PRIMARY                               | PRIMARY | 8       | e2.id                 |    1 |   100.00 | NULL                             |
|  2 | DERIVED     | e          | NULL       | ref    | id_feed,is_read,entry_feed_read_index | is_read | 1       | const                 | 4211 |   100.00 | Using where; Backward index scan |
|  2 | DERIVED     | f          | NULL       | eq_ref | PRIMARY,priority                      | PRIMARY | 4       | freshrss_db.e.id_feed |    1 |   100.00 | Using where                      |
+----+-------------+------------+------------+--------+---------------------------------------+---------+---------+-----------------------+------+----------+----------------------------------+
4 rows in set, 2 warnings (0.00 sec)

-- ----- 5 days of articles: bad performance

mysql> SELECT COUNT(*) FROM freshrss_entry WHERE is_read = 0;
+----------+
| COUNT(*) |
+----------+
|    11651 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT SQL_NO_CACHE e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
| id               | title                                                       | link                                                   |
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
| 1694378467606928 | POL-HAM: Schwarzer Mercedes nach Verkehrsunfall flüchtig    | https://www.presseportal.de/blaulicht/pm/65844/5599605 |
| 1694378467606927 | POL-HSK: Verkehrsunfall mit schwerverletztem Motorradfahrer | https://www.presseportal.de/blaulicht/pm/65847/5599604 |
+------------------+-------------------------------------------------------------+--------------------------------------------------------+
2 rows in set, 1 warning (31.64 sec)

mysql> EXPLAIN SELECT SQL_NO_CACHE e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
+----+-------------+------------+------------+--------+---------------------------------------+----------+---------+------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table      | partitions | type   | possible_keys                         | key      | key_len | ref              | rows | filtered | Extra                                                     |
+----+-------------+------------+------------+--------+---------------------------------------+----------+---------+------------------+------+----------+-----------------------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL                                  | NULL     | NULL    | NULL             |    2 |   100.00 | Using temporary; Using filesort                           |
|  1 | PRIMARY     | e0         | NULL       | eq_ref | PRIMARY                               | PRIMARY  | 8       | e2.id            |    1 |   100.00 | NULL                                                      |
|  2 | DERIVED     | f          | NULL       | index  | PRIMARY,priority                      | priority | 1       | NULL             |   36 |   100.00 | Using where; Using index; Using temporary; Using filesort |
|  2 | DERIVED     | e          | NULL       | ref    | id_feed,is_read,entry_feed_read_index | id_feed  | 5       | freshrss_db.f.id |  621 |     1.61 | Using where                                               |
+----+-------------+------------+------------+--------+---------------------------------------+----------+---------+------------------+------+----------+-----------------------------------------------------------+
4 rows in set, 2 warnings (0.00 sec)
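
As noted in the PR description, the fix for MySQL/MariaDB is a `USE INDEX` hint: the slow plan above starts the derived table from the `priority` index on `freshrss_feed`, while the fast plan scans `freshrss_entry` newest-first via `is_read`. A sketch of the hinted variant (index name taken from the `possible_keys` column above; not necessarily the exact statement merged in this PR):

```sql
-- Hypothetical variant: hint the optimiser to drive the derived table
-- from the is_read index on freshrss_entry, so the LIMIT 2 is reached
-- after reading only the newest unread rows.
SELECT SQL_NO_CACHE e0.id, e0.title, e0.link
FROM freshrss_entry e0
INNER JOIN (
    SELECT e.id
    FROM freshrss_entry e USE INDEX (is_read)
    INNER JOIN freshrss_feed f ON e.id_feed = f.id
    WHERE f.priority > 0 AND e.is_read = 0
    ORDER BY e.id DESC
    LIMIT 2
) e2 ON e2.id = e0.id
ORDER BY e0.id DESC;
```

`USE INDEX` is MySQL/MariaDB-only syntax; PostgreSQL and SQLite do not accept index hints, so the hint would have to be added conditionally for those backends.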
PostgreSQL
-- ---- 3 days of articles: good performance

freshrss_db=# SELECT COUNT(*) FROM freshrss_entry WHERE is_read = 0;
 count 
-------
  4116
(1 row)

Time: 13.994 ms

freshrss_db=# SELECT e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
        id        |                            title                            |                          link                          
------------------+-------------------------------------------------------------+--------------------------------------------------------
 1694378467606928 | POL-HAM: Schwarzer Mercedes nach Verkehrsunfall flüchtig    | https://www.presseportal.de/blaulicht/pm/65844/5599605
 1694378467606927 | POL-HSK: Verkehrsunfall mit schwerverletztem Motorradfahrer | https://www.presseportal.de/blaulicht/pm/65847/5599604
(2 rows)

Time: 1.330 ms

-- ---- 5 days of articles: good performance

freshrss_db=# SELECT COUNT(*) FROM freshrss_entry WHERE is_read = 0;
 count 
-------
 11651
(1 row)

Time: 11.773 ms

freshrss_db=# SELECT e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
        id        |                            title                            |                          link                          
------------------+-------------------------------------------------------------+--------------------------------------------------------
 1694378467606928 | POL-HAM: Schwarzer Mercedes nach Verkehrsunfall flüchtig    | https://www.presseportal.de/blaulicht/pm/65844/5599605
 1694378467606927 | POL-HSK: Verkehrsunfall mit schwerverletztem Motorradfahrer | https://www.presseportal.de/blaulicht/pm/65847/5599604
(2 rows)

Time: 1.367 ms

-- ---- 1 month of articles: good performance

freshrss_db=# SELECT COUNT(*) FROM freshrss_entry WHERE is_read = 0;
 count  
--------
 110466
(1 row)

Time: 159.669 ms

freshrss_db=# SELECT e0.id, e0.title, e0.link FROM freshrss_entry e0 INNER JOIN (SELECT e.id FROM freshrss_entry e INNER JOIN freshrss_feed f ON e.id_feed = f.id WHERE f.priority > 0  AND e.is_read=0 ORDER BY e.id DESC LIMIT 2) e2 ON e2.id=e0.id ORDER BY e0.id DESC;
        id        |                            title                            |                          link                          
------------------+-------------------------------------------------------------+--------------------------------------------------------
 1694378467606928 | POL-HAM: Schwarzer Mercedes nach Verkehrsunfall flüchtig    | https://www.presseportal.de/blaulicht/pm/65844/5599605
 1694378467606927 | POL-HSK: Verkehrsunfall mit schwerverletztem Motorradfahrer | https://www.presseportal.de/blaulicht/pm/65847/5599604
(2 rows)

Time: 1.740 ms
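
For context, the unified `updateCacheUnreads` statement this PR introduces uses the same sub-query shape as the SELECTs benchmarked above. A sketch of the pattern (the column name `cache_n_unreads` is illustrative, not copied from the FreshRSS schema):

```sql
-- Sketch of the sub-query style UPDATE: one statement that is valid
-- in MySQL, MariaDB, PostgreSQL and SQLite alike, replacing the
-- JOIN-based UPDATE previously needed only for MySQL/MariaDB.
UPDATE freshrss_feed AS f
SET cache_n_unreads = (
    SELECT COUNT(*)
    FROM freshrss_entry e
    WHERE e.id_feed = f.id AND e.is_read = 0
);
```

Because the sub-query reads from `freshrss_entry` rather than the table being updated, it avoids the MySQL restriction on self-referencing UPDATE sub-queries and runs unchanged on all four supported databases.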

@Alkarex
Member Author

Alkarex commented Sep 12, 2023

@Emporea Please change your docker-compose back to freshrss/freshrss:edge; additional tests and feedback are welcome.

@Alkarex Alkarex mentioned this pull request Sep 26, 2023
Alkarex added a commit to Alkarex/FreshRSS that referenced this pull request Oct 31, 2023
Alkarex added a commit that referenced this pull request Oct 31, 2023
* Update DB recommendations
Favour PostgreSQL, based on tests such as:
#5648 (comment)
#5707

SQLite generally performs very well, although with the caveat that the DB schema cannot easily be altered: https://www.sqlite.org/lang_altertable.html#why_alter_table_is_such_a_problem_for_sqlite
(We should ship a command-line update mechanism.)

I have verified that FreshRSS still works with `mysql:5.5.62`
https://hub.docker.com/layers/library/mysql/5.5.62/images/sha256-d404d78aa797c87c255e5ae2beb5d8d0e4d095f930b1f20dc208eaa957477b74?context=explore

* sqlite
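
The SQLite limitation mentioned above has a documented workaround: rebuild the table with the new schema and swap it in. A minimal sketch of that pattern (table and column names are illustrative, not FreshRSS's actual schema):

```sql
-- Sketch of the SQLite-documented workaround for ALTER TABLE
-- operations SQLite does not support: create, copy, drop, rename.
BEGIN TRANSACTION;
CREATE TABLE feed_new (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL,
    priority SMALLINT DEFAULT 10   -- e.g. a column whose type changed
);
INSERT INTO feed_new (id, url, priority)
    SELECT id, url, priority FROM feed;
DROP TABLE feed;
ALTER TABLE feed_new RENAME TO feed;
COMMIT;
```

A command-line update mechanism, as suggested above, could run such a migration script offline, when no web request holds the database open.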