Fuse multi quantile funcs with same arguments to one function.#26657
Fuse multi quantile funcs with same arguments to one function.#26657vdimir merged 14 commits intoClickHouse:masterfrom
Conversation
vdimir
left a comment
There was a problem hiding this comment.
Thank you for your changes!
Some comments, main points are:
- it's good to add some tests with aliased columns
- fuzzer repport
| set optimize_fuse_quantile=true; | ||
| SELECT 'quantile:'; | ||
| EXPLAIN SYNTAX SELECT quantile(0.2)(d), quantile(0.3)(d) FROM datetime; | ||
| SELECT quantile(0.2)(d), quantile(0.3)(d) FROM datetime; |
There was a problem hiding this comment.
I suppose it's good to add tests were aliases are used like SELECT b, quantile(0.5)(x) as a, quantile(0.9)(x) as y, quantile(0.95)(x) FROM (select number as x, number % 2 as b from numbers(10)) group by b;
and also where gathered function are not only in select list, but in order by, join on or other section
| SELECT '---------After fuse result-----------'; | ||
| set optimize_fuse_quantile=true; | ||
| SELECT 'quantile:'; | ||
| EXPLAIN SYNTAX SELECT quantile(0.2)(d), quantile(0.3)(d) FROM datetime; |
There was a problem hiding this comment.
Why do we got two AS y in the result in (quantiles(0.5, 0.9, 0.95)(x) AS y)[1] AS y?
EXPLAIN SYNTAX
SELECT
b,
quantile(0.5)(x) AS y,
quantile(0.9)(x),
quantile(0.95)(x)
FROM
(
SELECT
number AS x,
number % 2 AS b
FROM numbers(10)
)
GROUP BY b;
┌─explain──────────────────────────────────────────┐
│ SELECT │
│ b, │
│ (quantiles(0.5, 0.9, 0.95)(x) AS y)[1] AS y, │
│ y[2], │
│ y[3] │
│ FROM │
│ ( │
│ SELECT │
│ number AS x, │
│ number % 2 AS b │
│ FROM numbers(10) │
│ ) │
│ GROUP BY b │
└──────────────────────────────────────────────────┘
| {NameQuantileBFloat16::name, NameQuantilesBFloat16::name} | ||
| }; | ||
|
|
||
| /// Gather all the quantilexxx functions |
There was a problem hiding this comment.
| /// Gather all the quantilexxx functions | |
| /// Gather all the `quantile*` functions |
src/Interpreters/TreeOptimizer.cpp
Outdated
| for (auto * func : functions) | ||
| { | ||
| func->name = "arrayElement"; |
There was a problem hiding this comment.
Please look at the fuzzer failure, it finds error somewhere here
|
|
||
| SELECT '---------After fuse result-----------'; | ||
| set optimize_fuse_quantile=true; | ||
| SELECT 'quantile:'; |
There was a problem hiding this comment.
Also it may be a bit consing that we have column name quantile(0.9)(x) in query SELECT ..., quantile(0.9)(x), ... but in result it prints arrayElement(quantiles(0.5, 0.9, 0.95)(x), 1). We can try to add alias ... AS "quantile(0.9)(x)". But I'm not sure that it's a big issue, maybe we can leave it as is, though.
| auto it = fuse_quantile.find(function.name); | ||
| if (it != fuse_quantile.end()) | ||
| { | ||
| it->second.addFuncNode(&function); | ||
| } | ||
| else | ||
| { | ||
| FuseQuantileAggregatesData funcs{}; | ||
| funcs.addFuncNode(&function); | ||
| fuse_quantile[function.name] = funcs; | ||
| } |
There was a problem hiding this comment.
You can simply write fuse_quantile[function.name].addFuncNode, map entry will be initialized with default FuseQuantileAggregatesData constructor
https://godbolt.org/z/7KGdGadz8
| auto it = fuse_quantile.find(function.name); | |
| if (it != fuse_quantile.end()) | |
| { | |
| it->second.addFuncNode(&function); | |
| } | |
| else | |
| { | |
| FuseQuantileAggregatesData funcs{}; | |
| funcs.addFuncNode(&function); | |
| fuse_quantile[function.name] = funcs; | |
| } | |
| fuse_quantile[function.name].addFuncNode(&function); |
|
@hexiaoting Do you wanna continue working on this PR? Or I may try to finish it |
sorry for the delay, I will continue |
|
Sorry, I missed last commit, going to review it. Following query is not working with option enabled: But maybe it's ok, because query looks weired and we have option to disable it. Also I'm thinking of creating more general option, like that enables all fuse syntax optimization (we already have one) not to create lots of for each function. |
I will use only one fuse option setting. |
This reverts commit bbd7799.
…unt_avg` and `optimize_fuse_quantile`
16c69ac to
661a548
Compare
|
Some Fuzzer with UBSan failures not related to changes, reproduces on master #29075 |
a40d3e7 to
091ce15
Compare
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
If query has multiple quantile aggregate functions with the same arguments but different level parameter:
SELECT quantile(0.5)(x), quantile(0.9)(x), quantile(0.95)(x) FROM ...Rewrite it to
SELECT quantiles(0.5, 0.9, 0.95)(x)[1], quantiles(0.5, 0.9, 0.95)(x)[2], quantiles(0.5, 0.9, 0.95)(x)[3] FROM ...So the aggregation will share the state.
The functions that can be fused in this implementation:
quantile, quantileDeterministic, quantileTiming, quantileTDigest, quantileTDigestWeighted, quantileExact, quantileExactWeighted, quantileBFloat16close: #25961