-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Query to remote table with enable_analyzer=1 try to read all table from remote or Distributed, instead of executing query remotely #81718
Description
Company or project name
bitquery
Describe the situation
Query to remote table with enable_analyzer=1 try to read all table from remote or Distributed, instead of executing query remotely
Compare execution of a query:
charch37 :) SELECT max(`Block_Date`) AS `_Date_0`, count() AS `count()` FROM remote('chstrm36', 'eth', 'dex_trades_tx', 'default', '[PWD]') SETTINGS enable_analyzer=0
SELECT
max(Block_Date) AS _Date_0,
count() AS `count()`
FROM remote('chstrm36', 'eth', 'dex_trades_tx', 'default', '[PWD]')
SETTINGS enable_analyzer = 0
Query id: ea088f8d-1dca-4deb-a8ef-513f86bce214
┌────_Date_0─┬───count()─┐
1. │ 2025-06-12 │ 394548119 │
└────────────┴───────────┘
1 row in set. Elapsed: 5.044 sec. Processed 395.86 million rows, 794.61 MB (78.48 million rows/s., 157.54 MB/s.)
Peak memory usage: 1.04 GiB.
charch37 :) SELECT max(`Block_Date`) AS `_Date_0`, count() AS `count()` FROM remote('chstrm36', 'eth', 'dex_trades_tx', 'default', '[PWD]') SETTINGS enable_analyzer=1
SELECT
max(Block_Date) AS _Date_0,
count() AS `count()`
FROM remote('chstrm36', 'eth', 'dex_trades_tx', 'default', '[PWD]')
SETTINGS enable_analyzer = 1
Query id: d0e5543e-413f-436e-a02f-4127f52dc582
Cancelling query.
Query was cancelled.
0 rows in set. Elapsed: 45.494 sec. Processed 28.98 million rows, 80.72 GB (636.94 thousand rows/s., 1.77 GB/s.)
Peak memory usage: 4.47 GiB.
Same happens with using Distributed table.
Difference in huge: note the amount of disk read on the second case, and the query take so long that i can not wait for it
Which ClickHouse versions are affected?
25.3.3.42 / 25.6.1.2609 / 25.4.5.24
How to reproduce
- create remote table
- query with remote() with analyser on and off and compare
Expected performance
Perfomance exected be the same with analyzer ON
Additional context
- when same query executed locally on server , where remote table located, it s executed OK
- when Distributed query executed locally on server , where remote table located, it s executed OK
I also tried to query remote() from latest version 25.6.1.2609
behavior exactly the same, but it shows more stats:
Note here the first line of Gb downloaded - with analyzer OFF this line does not appear
On remote server query with analyser ON is
SELECT max(`__table1`.`Block_Date`) AS `_Date_0`, count() AS `count()` FROM `eth`.`dex_trades_tx` AS `__table1`
and with OFF is
SELECT max(`Block_Date`) AS `_Date_0`, count() AS `count()` FROM `eth`.`dex_trades_tx`
Note that executing both of these queries take same time and resources, no difference.
For me the problem is not in the query it executes on remote server, but looks analyser tries to download something and it is not related to remote query per se