Aggregate function input format#88049

Closed
punithsubashchandra wants to merge 10 commits into ClickHouse:master from
punithsubashchandra:aggregate_function_input_format

Conversation

@punithsubashchandra punithsubashchandra commented Oct 2, 2025

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

This change adds a new session-level setting in FormatSettings called aggregate_function_input_format.
It supports the following values:

state (default)
value
array

It improves INSERT queries into tables with AggregateFunction columns by allowing data to be inserted as a serialized state, as raw values, or as arrays of values, including from text formats such as JSON.

Fixes #87827.

Documentation entry for user-facing changes:

Adds a session-level setting aggregate_function_input_format with the following possible values:
state - a binary string with the serialized state (the default);
value - the format expects a single value of the aggregate function's argument or, for multiple arguments, a tuple of them; the value is deserialized and used to form the corresponding state;
array - the format expects an Array of values, each as described for the value option above; all elements of the array are aggregated to form the state.

The goal of this PR is to let AggregateFunction columns be populated from other input formats such as JSON, CSV, and TSV.

Example use: A query or command:

For a table with this structure:
CREATE TABLE test_agg_single ( user_id UInt64, avg_session_length AggregateFunction(avg, UInt32) )

the user can SET aggregate_function_input_format = 'value' and run queries such as:
INSERT INTO test_agg_single VALUES (124, '456'), (125, '789'), (126, '321');

or the user can SET aggregate_function_input_format = 'array' and run:
INSERT INTO test_agg_single VALUES (127, '[100,200,300]'), (128, '[400,500]'), (129, '[600]');
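For comparison, here is a minimal sketch of how similar states can be built today with existing ClickHouse functions, without the proposed setting. It assumes an AggregatingMergeTree engine and an ORDER BY key, neither of which the description above specifies. The mapping of 'value' to avgState and of 'array' to arrayReduce is an interpretation of the documentation entry, not taken from the PR's code.

```sql
-- Assumption: engine and sorting key are chosen for illustration only.
CREATE TABLE test_agg_single
(
    user_id UInt64,
    avg_session_length AggregateFunction(avg, UInt32)
)
ENGINE = AggregatingMergeTree
ORDER BY user_id;

-- Roughly what 'value' mode describes: one argument value becomes one state.
INSERT INTO test_agg_single
SELECT 124, avgState(toUInt32(456));

-- Roughly what 'array' mode describes: all array elements are aggregated into one state.
INSERT INTO test_agg_single
SELECT 127, arrayReduce('avgState', [toUInt32(100), toUInt32(200), toUInt32(300)]);

-- Merge the stored states to read the averages back
-- (456 for user 124 and 200 for user 127).
SELECT user_id, avgMerge(avg_session_length) AS avg_session_length
FROM test_agg_single
GROUP BY user_id
ORDER BY user_id;
```

The setting proposed here would move this conversion into the input format, so plain values in VALUES, JSON, CSV, or TSV data would not need to be wrapped in -State functions by the client.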

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Avogar Avogar self-assigned this Oct 3, 2025
@Avogar Avogar added the can be tested Allows running workflows for external contributors label Oct 3, 2025
@clickhouse-gh
Contributor

clickhouse-gh bot commented Oct 3, 2025

Workflow [PR], commit [d9d47db]

Summary:
15 failures out of 118 shown:

job_name | test_name | status | info | comment
Config Workflow | | failure | |
 | python3 ./ci/jobs/scripts/workflow_hooks/pr_description.py | failure | |
Dockers Build (amd) | | dropped | |
Dockers Build (arm) | | dropped | |
Dockers Build (multiplatform manifest) | | dropped | |
Style check | | dropped | |
Docs check | | dropped | |
Fast test | | dropped | |
Build (arm_tidy) | | dropped | |
Build (amd_debug) | | dropped | |
Build (amd_asan) | | dropped | |
Build (amd_tsan) | | dropped | |
Build (amd_msan) | | dropped | |
Build (amd_ubsan) | | dropped | |
Build (amd_binary) | | dropped | |
Build (arm_asan) | | dropped | |

@UnamedRus
Contributor

There is also the opposite requirement, #29109: automatic finalization of aggregate function states, to improve usability with BI-like tools.
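To illustrate the read side of that requirement, here is a short sketch of how states are finalized explicitly today, using the example table test_agg_single from the PR description. Both functions are existing ClickHouse functions. Nothing here reflects how #29109 would automate the step.

```sql
-- Merge all states per user and return the final average.
SELECT user_id, avgMerge(avg_session_length) AS avg_session_length
FROM test_agg_single
GROUP BY user_id;

-- Finalize each stored state row by row, without grouping.
SELECT user_id, finalizeAggregation(avg_session_length) AS avg_session_length
FROM test_agg_single;
```

A BI tool that selects the column directly receives the opaque binary state instead, which is the usability gap that comment points at.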

Development

Successfully merging this pull request may close these issues.

A setting aggregate_function_input_format to simplify insertion into columns with the AggregateFunction data type