Adding Setting Aggregate function input format to allow Insert queries into tables with AggregateFunction columns #88088
Same as #88049.
Workflow [PR], commit [eebf00a] Summary: ❌
@Avogar @GrigoryPervakov Please review. I fixed the test cases and updated the PR.
Reintroduce aggregate_function_input_format setting to control input format during INSERT operations.
@GrigoryPervakov The failing checks are not related to my changes. The tests seem to be flaky, as mentioned here.
@GrigoryPervakov Thanks for reviewing the PR and approving the changes. The build failures don't seem to be related to these changes, as per this: Reports. Can we proceed with merging the PR?
Yes, all failing tests are confirmed as flaky or ran out of the time limit.
46edc61
@punithns97 Hi, could input values for an AggregateFunction column be formatted as something other than a string? Or in case if
@fm4v It accepts real (non-string) arrays for AggregateFunction columns when the input is parsed by a FORMAT that honors aggregate_function_input_format (e.g. JSONEachRow, TabSeparated, CSV, etc.). However, it will not implicitly convert SQL VALUES array literals (INSERT ... VALUES (id, [1,2,3])) into an AggregateFunction state.
    // Single argument - parse the value directly
    auto temp_column = argument_types[0]->createColumn();
    ReadBufferFromString buf(value_str);
    argument_types[0]->getDefaultSerialization()->deserializeTextCSV(*temp_column, buf, settings);
I don't understand this - why CSV?
It should work for all input formats, including RowBinary. If it's not the case, please revert this PR and re-implement it.
I can argue that this is especially important for the RowBinary format, and having a partial implementation misses the point.
This works only with text-based formats. To support binary formats, can I raise another PR to supplement this?
Resolves #87827.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Adds session-level setting aggregate_function_input_format to improve INSERT queries into tables with AggregateFunction columns, allowing insertion of data as serialized state, raw values, or arrays.
Documentation entry for user-facing changes
Adds a session-level setting aggregate_function_input_format with the following possible values:
- state - binary string with the serialized state (the default)
- value - the format will expect a single value of the argument of the aggregate function, or in the case of multiple arguments, a tuple of them; that will be deserialized to form the relevant state
- array - the format will expect an Array of values, as described in the value option above; all the elements of the array will be aggregated to form the state
Details
The goal of this PR is to allow AggregateFunction columns to be used with various other input formats such as JSON, CSV, and TSV.
Resolves #87827.
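As a rough illustration of how the three values of aggregate_function_input_format differ, here is a minimal Python sketch. It is a conceptual model only, not the C++ code in this PR: SumState, build_state, and the 8-byte state layout are hypothetical names and choices made for the example, and the multi-argument (tuple) case is not modeled.

```python
import struct
from typing import Any


class SumState:
    """Toy stand-in for an aggregate function state (a running integer sum)."""

    def __init__(self, total: int = 0) -> None:
        self.total = total

    def add(self, value: int) -> None:
        self.total += value

    def serialize(self) -> bytes:
        # Hypothetical wire format: one little-endian signed 64-bit integer.
        return struct.pack("<q", self.total)

    @classmethod
    def deserialize(cls, blob: bytes) -> "SumState":
        (total,) = struct.unpack("<q", blob)
        return cls(total)


def build_state(raw: Any, input_format: str = "state") -> SumState:
    """Turn one input cell into a state, depending on the chosen mode."""
    if input_format == "state":
        # The cell already holds a serialized state.
        return SumState.deserialize(raw)
    if input_format == "value":
        # The cell holds a single argument value; the state is built from it.
        state = SumState()
        state.add(raw)
        return state
    if input_format == "array":
        # The cell holds an array; every element is aggregated into one state.
        state = SumState()
        for item in raw:
            state.add(item)
        return state
    raise ValueError(f"unknown input format: {input_format!r}")


# The same logical state reached through each of the three modes.
from_state = build_state(SumState(6).serialize(), "state")
from_value = build_state(6, "value")
from_array = build_state([1, 2, 3], "array")
```

In this model all three calls end with the same total, which mirrors the intent described above: the setting changes how a cell is interpreted, not what is ultimately stored in the column.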
Example use
For a table with this structure:
The user can SET aggregate_function_input_format = 'value' and perform queries such as:
Or the user can SET aggregate_function_input_format = 'array':