-
Notifications
You must be signed in to change notification settings - Fork 8.3k
[Question] how to make funnel query run faster? #2096
Copy link
Copy link
Closed
Description
hi, I need to use clickhouse to funnels query
Data
- the table
trajectory_trackless_complexhas 2894082787 rows - pageid: 2538 --> 3242 --> 1682, ( homepage --> listpage --> detailpage )
Use sequenceMatch, throw error
SELECT
sum(step1_condition) AS home,
sum(step2_condition) AS list,
sum(step3_condition) AS detail
FROM
(
SELECT
loginkey,
max(pageid = 2538) AS step1_condition,
sequenceMatch('(?1).*(?2)')(toDateTime(createtime), pageid = 2538, pageid = 3242) AS step2_condition,
sequenceMatch('(?1).*(?2).*(?3)')(toDateTime(createtime), pageid = 2538, pageid = 3242, pageid = 1682) AS step3_condition
FROM datacenter.trajectory_trackless_complex
WHERE intotime = '2018-03-01'
GROUP BY loginkey
)
↗ Progress: 875.45 million rows, 12.42 GB (76.01 million rows/s., 1.08 GB/s.) ██████████████████████████████████████████▊ 98%Received exception from server (version 1.1.54342):
Code: 160. DB::Exception: Received from hadoo:9000, 10.12.180.114. DB::Exception: Pattern application proves too difficult, exceeding max iterations (1000000).
Query sql
SELECT
sum(step1_time != 0) AS home,
sum(step2_time != 0) AS list,
sum(step3_time != 0) AS detail
FROM
(
SELECT
loginkey,
groupArray(pageid) AS events,
groupArray(createtime) AS events_times,
arrayFilter((time, name) -> (name = 2538), events_times, events)[1] AS step1_time,
arrayFilter((time, name) -> ((name = 3242) AND (time >= step1_time) AND (step1_time != 0)), events_times, events)[1] AS step2_time,
arrayFilter((time, name) -> ((name = 1682) AND (time >= step2_time) AND (step2_time != 0)), events_times, events)[1] AS step3_time
FROM
(
SELECT
pageid,
createtime,
loginkey
FROM datacenter.trajectory_trackless_complex
WHERE pageid IN (2538, 3242, 1682)
ORDER BY createtime ASC
)
GROUP BY loginkey
)
Execute result
Question
how to optimize the query ?
Thank you very match
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels
