
Array(LowCardinality(String)) has() function slower #10880

@yancheng44

Hello,

I store the list of exp_id values for our A/B tests in an Array(LowCardinality(String)) column.
Example exps array data: ['zeyudu_rec_6096_1','default','zeyudu_rec_6096_2','zeyudu_rec_6096_3',....]

My query:

SELECT
    multiIf(has(exps_lowcard, 'zeyudu_rec_6096_1'), 'control', has(exps_lowcard, 'zeyudu_rec_6096_2'), '6096_2', 'other') AS expr_model_v20,
    SUM(multiIf(windowType = 'CLICK', ((((4 * hasFollow) + hasLike) + hasComment) + (4 * hasShare)) + hasCollect, NULL) * SampleWeight) AS `SUM(ces)`
FROM default.explore_feed_bitmap
GROUP BY expr_model_v20

Result: 1.1E rows, took 34 seconds.

I ran another test:
I maintain an external exp_id dictionary and store the exp_id list as integers, like [1001,3001,4002,6001,6002.....]

Test query:

SELECT
    multiIf(has(exps_array, 6001), 'control', has(exps_array, 6002), '6002', 'other') AS expr_model_v20,
    SUM(multiIf(windowType = 'CLICK', ((((4 * hasFollow) + hasLike) + hasComment) + (4 * hasShare)) + hasCollect, NULL) * SampleWeight) AS `SUM(ces)`
FROM default.explore_feed_bitmap
GROUP BY expr_model_v20

Result: 1.1E rows, took 3.2 seconds.

ClickHouse version: 19.17.4.11

Can you help check this issue? I think has() on an Array(LowCardinality(String)) should perform close to has() on an Array that stores Int.
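
A minimal sketch for comparing the two has() calls in isolation on synthetic data could look like the following. The table name exps_bench, the 'exp_N' values, the array length, and the row count are all hypothetical and chosen only for illustration; they are not the real explore_feed_bitmap schema, and no timings are claimed for it.

```sql
-- Hypothetical benchmark table: the same ids stored once as strings
-- and once as integers, so only the element type differs.
CREATE TABLE exps_bench
(
    exps_lowcard Array(LowCardinality(String)),
    exps_array   Array(UInt32)
)
ENGINE = MergeTree
ORDER BY tuple();

-- Fill with synthetic data: 10 ids per row, drawn from 1000 distinct values.
-- The Array(String) result is converted to the column type on insert.
INSERT INTO exps_bench
SELECT
    arrayMap(x -> concat('exp_', toString(x)), ids),
    ids
FROM
(
    SELECT arrayMap(i -> toUInt32((number + i) % 1000), range(10)) AS ids
    FROM numbers(1000000)
);

-- Run each query separately and compare the elapsed time reported by the client.
SELECT count() FROM exps_bench WHERE has(exps_lowcard, 'exp_42');
SELECT count() FROM exps_bench WHERE has(exps_array, 42);
```

Both queries should return the same count, since the string column is derived from the integer column, so any gap in elapsed time would come from the has() lookup on the different element types.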

If you need more information, I will update the issue.

Thanks,
johnny
