Add -SimpleState combinator #16853
I also prefer to do to avoid typing definitions for every column. And I was missing that functionality.
@filimonov You can achieve that via column transformers. It's in question whether I should extend it to
@amosbird What is not possible right now that will be allowed after this patch? Right now the materialized view will use the original type:
create table data1 (key Int, value SimpleAggregateFunction(max, Int)) engine=AggregatingMergeTree() order by key;
create materialized view data1_mv2 engine=AggregatingMergeTree() order by key as select * from data1;
SHOW CREATE TABLE data1_mv2;
┌─statement────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE MATERIALIZED VIEW default.data1_mv2
(
`key` Int32,
`value` SimpleAggregateFunction(max, Int32)
)
ENGINE = AggregatingMergeTree()
ORDER BY key
SETTINGS index_granularity = 8192 AS
SELECT *
FROM default.data1 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- same types for inner table.
Something like this only?
create table data2 engine=AggregatingMergeTree() order by n as select number n, anySimpleState(number) a from numbers(1) group by number;
SHOW CREATE TABLE data2 FORMAT Pretty;
┌─statement─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE default.data2
(
`n` UInt64,
`a` SimpleAggregateFunction(any, UInt64) /* to get SimpleAggregateFunction here? */
)
ENGINE = AggregatingMergeTree()
ORDER BY n
SETTINGS index_granularity = 8192 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
I would say something like this
Isn't this the same? (Though a little bit longer.)
As for performance:
INSERT INTO FUNCTION null('key Int, value SimpleAggregateFunction(max, Int)') SELECT
any(number) AS key,
maxSimpleState(number)
FROM numbers(10000000000)
SETTINGS max_threads = 1
Query id: c866ac71-18df-4a94-80af-bc26c4f761dd
Ok.
0 rows in set. Elapsed: 32.275 sec. Processed 10.00 billion rows, 80.01 GB (309.87 million rows/s., 2.48 GB/s.)
INSERT INTO FUNCTION null('key Int, value SimpleAggregateFunction(max, Int)') SELECT
number AS key,
CAST(number, 'SimpleAggregateFunction(max, Int)') AS value
FROM numbers(10000000000)
SETTINGS max_threads = 1
Query id: 81593891-8a5c-4654-a042-5caf54af782e
Ok.
0 rows in set. Elapsed: 14.828 sec. Processed 10.00 billion rows, 80.01 GB (674.47 million rows/s., 5.40 GB/s.)
(And the using
That's the motivation (and also refer to the example of column transformers).
It's a good warning but not a persuasive example.
The test error looks weird.
Got it, thanks!
Yeah, this is synthetic, and I posted them just to add at least some details.
Indeed: it looks like the previous query finished, but the new query (for some reason) started from the Data/Scalar packet.
Test failures seem unrelated.
CI reports [1]:
Indirect leak of 648 byte(s) in 9 object(s) allocated from:
...
2 0x12b96503 in DB::AggregateFunctionSimpleState::getReturnType() const obj-x86_64-linux-gnu/../src/AggregateFunctions/AggregateFunctionSimpleState.h:47:15
...
[1]: https://s3.amazonaws.com/clickhouse-test-reports/33957/08f4f45fd9da923ae3e3fdd8a527c297d35247eb/stress_test__address__actions_.html
Then we can find the offending query using the query_log artifact:
$ wget https://s3.amazonaws.com/clickhouse-test-reports/33957/08f4f45fd9da923ae3e3fdd8a527c297d35247eb/stress_test__address__actions_/query_log_dump.tar
$ tar -xf query_log_dump.tar
$ clickhouse-local --path var/lib/clickhouse/
SELECT query
FROM system.query_log
ARRAY JOIN used_aggregate_function_combinators AS func
WHERE has(used_aggregate_functions, 'groupBitOr') AND has(used_aggregate_function_combinators, 'SimpleState') AND (type != 'QueryStart')
Query id: 5b7722b3-f77e-4e7e-bd0b-586d6d32a899
┌─query────────────────────────────────────────────────────────────────────────────┐
│ with groupBitOrSimpleState(number) as c select toTypeName(c), c from numbers(1); │
└──────────────────────────────────────────────────────────────────────────────────┘
Fixes: 01570_aggregator_combinator_simple_state.sql
Fixes: ClickHouse#16853
Signed-off-by: Azat Khuzhin <[email protected]>
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Provide a new aggregator combinator, -SimpleState, to build SimpleAggregateFunction types via query. It's useful for defining a MaterializedView with the AggregatingMergeTree engine, and will benefit projections too.
Detailed description / Documentation draft:
Documents are updated.
If you apply this combinator, the aggregate function returns the same value but with a different type. This is a SimpleAggregateFunction(...) that can be stored in a table to work with AggregatingMergeTree table engines.
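A minimal sketch of the behaviour described above, assuming a ClickHouse version that includes this combinator. The table name `example_agg` is hypothetical and used only for illustration; the second statement mirrors the `data2` example from the discussion, with `max` in place of `any`.

```sql
-- The value is the same as any(number); only the type differs.
-- Expected type: SimpleAggregateFunction(any, UInt64), value: 0.
WITH anySimpleState(number) AS c
SELECT toTypeName(c), c
FROM numbers(1);

-- Let the SELECT define the column types instead of spelling them out.
-- `example_agg` is a hypothetical table name.
CREATE TABLE example_agg
ENGINE = AggregatingMergeTree()
ORDER BY n AS
SELECT
    number AS n,
    maxSimpleState(number) AS m
FROM numbers(10)
GROUP BY number;

-- Inspect the inferred schema; `m` should be a SimpleAggregateFunction(max, UInt64) column.
SHOW CREATE TABLE example_agg;
```

Because the column type is inferred from the query, the same pattern should apply when the SELECT backs a materialized view rather than a plain table.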