-
Notifications
You must be signed in to change notification settings - Fork 8.3k
A setting aggregate_function_input_format to simplify insertion into columns with the AggregateFunction data type #87827
Description
Company or project name
ClickHouse
Use case
AggregateFunction is a parameterized data type, representing the internal state of aggregate functions in ClickHouse. It allows storing, manipulating, and merging the intermediate states of aggregate functions like first-class values, enabling incremental aggregation. This is a unique feature of ClickHouse, available since its first release.
However, AggregateFunction's serialization format is an implementation detail of each aggregate function. It is represented as a binary string in all data formats, which makes direct insertion of values of this type not feasible (the users have to do INSERT SELECT instead, which calculates the aggregation states in the SELECT query).
The goal is to enable direct insertion by providing explicit values of arguments to the aggregate function, which will be aggregated to form the state, much like the initializeAggregation and arrayReduce functions do.
Describe the solution you'd like
Add a setting, aggregate_function_input_format, with the following possible values:
state- binary string with the serialized state (the default);value- the format will expect a single value of the argument of the aggregate function, or in the case of multiple arguments, a tuple of them; they will be deserialized using the corresponding IDataType or DataTypeTuple of them.array- the format will expect an Array of values, as described in thevaluesoption above; all the elements of the array will be aggregated to form the state.
For example, given the table of this structure:
user_id UInt64,
avg_session_length AggregateFunction(avg, UInt32)
the user can SET aggregate_function_input_format = 'value'
and insert the data as follows:
INSERT INTO table FORMAT CSV
123,456
or the user can SET aggregate_function_input_format = 'array'
and insert the data as follows:
INSERT INTO table FORMAT CSV
123,[456,789]
Describe alternatives you've considered
No response
Additional context
This task is listed in the official ClickHouse Roadmap 2020.