Supporting parallel execution for With clause

`cdp_tags` is a table where bitmap intermediate state is stored as a column:

```
CREATE TABLE cdp_tags ( \
    tag_id String, \
    mid_seqs AggregateFunction(groupBitmap, UInt32), \
    cardinality UInt32 \
) engine=ReplacingMergeTree() \
ORDER BY (tag_id) SETTINGS index_granularity=1;
```
A simple bitmap aggregation could return fast, for example, only 0.254s is required for both IO and bitmap AND aggregation:
```
SELECT groupBitmapAnd(mid_seqs)
FROM cdp_tags
WHERE has(['tag1', 'is_member'], tag_id)

┌─groupBitmapAnd(mid_seqs)─┐
│                 16614356 │
└──────────────────────────┘

1 rows in set. Elapsed: 0.254 sec. Processed 28.21 thousand rows, 1.74 MB (110.86 thousand rows/s., 6.83 MB/s.)
```

However, if I move that query into WITH clause:

```
WITH
    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag1', 'is_member'], tag_id)
    ) AS bm1
SELECT order_complete_time
FROM orders
WHERE (order_complete_time >= '2019-01-01 00:00:00') AND (order_complete_time <= '2019-01-01 00:01:00')

703 rows in set. Elapsed: 3.173 sec. Processed 6.20 thousand rows, 24.78 KB (5.28 thousand rows/s., 21.13 KB/s.)
```
Here, I cut the time range as much as possible such that query on table `orders` itself does not affect the overall performance, I just want to evaluate the overhead of WITH clause, then the above time is increased more than tens of times:

Furthermore, if I introduce more operation into WITH clause:

```
WITH

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag1', 'is_member'], tag_id)
    ) AS bm1,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag2', 'is_member'], tag_id)
    ) AS bm2
SELECT order_complete_time
FROM orders
WHERE (order_complete_time >= '2019-01-01 00:00:00') AND (order_complete_time <= '2019-01-01 00:01:00')

703 rows in set. Elapsed: 4.297 sec. Processed 6.20 thousand rows, 24.78 KB (1.44 thousand rows/s., 5.77 KB/s.)
```
The time elapsed is increased near linearly. 

What's more, if I add more:
```
WITH

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag1', 'is_member'], tag_id)
    ) AS bm1,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag2', 'is_member'], tag_id)
    ) AS bm2,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag3', 'is_member'], tag_id)
    ) AS bm3,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag4', 'is_member'], tag_id)
    ) AS bm4,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag5', 'is_member'], tag_id)
    ) AS bm5,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag6', 'is_member'], tag_id)
    ) AS bm6,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag7', 'is_member'], tag_id)
    ) AS bm7,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag8', 'is_member'], tag_id)
    ) AS bm8,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag9', 'is_member'], tag_id)
    ) AS bm9,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag10', 'is_member'], tag_id)
    ) AS bm10,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag11', 'is_member'], tag_id)
    ) AS bm11,

    (
        SELECT groupBitmapAndState(mid_seqs)
        FROM cdp_tags
        WHERE has(['tag12', 'is_member'], tag_id)
    ) AS bm12
SELECT order_complete_time
FROM orders
WHERE (order_complete_time >= '2019-01-01 00:00:00') AND (order_complete_time <= '2019-01-01 00:01:00')

703 rows in set. Elapsed: 24.021 sec. Processed 6.20 thousand rows, 24.78 KB (257.89 rows/s., 1.03 KB/s.)
```

The time is near linearly relevant to the number of aggregation state---there are 12 of them in the above query.

Therefore, there are two kinds of approaches to improve the performance:
1. Find out the reason why the overhead is increased from around 0.2s to above 2s if the aggregation state query is added into WITH clause.  

2. Having the WITH clause executed in parallel. Currently, they are executed sequentially and therefore the query preparation stage has occupied such a large percentage. 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Supporting parallel execution for With clause #7329

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Supporting parallel execution for With clause #7329

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions