-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Materialise view seems to be triggered with empty insert even when no new data matches WHERE clause #78747
Description
Company or project name
ClickHouse
Describe the unexpected behaviour
ClickHouse has awesome URL function ... and materialise view...
Recently i was playing with the idea of using mat view + url function for "poor man alert". something like this:
CREATE MATERIALIZED VIEW default.clickhouse_security_alert
(
`text` String
)
ENGINE = URL('https://hooks.slack.com/services/SLACK_WEBHOOK', 'JSONEachRow')
AS SELECT
if(length(domain_names) = 0, '', concat('Domains:\n', arrayStringConcat(arrayMap((d, f) -> concat('- ', d, ' (', formatDateTime(f, '%Y-%m-%d'), ')'), domain_names, fetch_dates), '\n'))) AS text
FROM
(
SELECT
groupArray(domain_name) AS domain_names,
groupArray(fetch_date) AS fetch_dates
FROM default.new_dns
WHERE domain_name like '%interestingpattern%'
)
which will alert me on domain names that is registered with "interestingpattern" in it.
Everything works but i notice that even if there is no interestingpattern match, the URL trigger still happens and it seems ClickHouse tried to insert empty value ... Slack drop these and throw 400 error but I don't see why we need to waste cpu cycle for that and also this means other type of webhook won't work well so I think if we can ignore trigger for mat view update when no data need to be inserted, it would be awesome :)
How to reproduce
-- Create our table
CREATE TABLE default.mytable (a Int32) ORDER BY a;
-- Create our mat view alert
CREATE MATERIALIZED VIEW default.mymatviewalert (a Int32)
ENGINE = URL('https://mytable.requestcatcher.com/', 'JSONEACHRow')
AS SELECT a
FROM default.mytable
WHERE a > 10;
-- Insert data that does not match WHERE clause
insert into mytable values(1),(2)
-- Insert data that matches WHERE clause
insert into mytable values(100),(20),(10000)
Then you can visit requestcatcher to see that in both inserts, the URL table function was called but in the first request, nothing was inserted and on second request, new values (100,20,1000) are POST to the URL.
Expected behavior
Ideally the first request should not even be fired and we can have a neat trick to create webhook alert straight from inside clickhouse ;)