Versioning of AggregateFunction states.

**Use case**

Sometimes we have to change the serialization format of AggregateFunction states to fix bugs or inefficiencies. Such changes must not break backward compatibility.

Currently we change the format only in exceptional cases: when the aggregate function was released only recently and is rarely used. Even then, we have to put a warning about backward compatibility in the changelog.

Sometimes a format change simply slips through by mistake.

**Proposal**

The methods `IAggregateFunction::serialize` and `IAggregateFunction::deserialize` will take an additional argument with the version. These methods must support serialization and deserialization for all known versions.
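A minimal sketch of what version-aware serialization could look like. Everything here is an assumption for illustration: `AvgState`, the free functions `serialize` and `deserialize`, and the two formats (fixed-width for version 0, variable-length for version 1) are hypothetical and do not reflect the real `IAggregateFunction` interface or any real state layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical state of an avg-like function; not ClickHouse's real layout.
struct AvgState
{
    uint64_t sum = 0;
    uint64_t count = 0;
};

// Assumed formats: version 0 stores two fixed 8-byte integers (host byte order),
// version 1 stores the same two fields as variable-length integers.
constexpr size_t latest_version = 1;

void writeFixed(std::ostream & out, uint64_t x)
{
    out.write(reinterpret_cast<const char *>(&x), sizeof(x));
}

uint64_t readFixed(std::istream & in)
{
    uint64_t x = 0;
    in.read(reinterpret_cast<char *>(&x), sizeof(x));
    if (!in)
        throw std::runtime_error("truncated state");
    return x;
}

void writeVarUInt(std::ostream & out, uint64_t x)
{
    // 7 payload bits per byte; the high bit marks "more bytes follow".
    while (x >= 0x80)
    {
        out.put(static_cast<char>((x & 0x7F) | 0x80));
        x >>= 7;
    }
    out.put(static_cast<char>(x));
}

uint64_t readVarUInt(std::istream & in)
{
    uint64_t x = 0;
    for (int shift = 0; shift < 64; shift += 7)
    {
        int byte = in.get();
        if (!in)
            throw std::runtime_error("truncated state");
        x |= static_cast<uint64_t>(byte & 0x7F) << shift;
        if (!(byte & 0x80))
            return x;
    }
    throw std::runtime_error("varint too long");
}

// Both functions must handle every known version, not only the latest one.
void serialize(const AvgState & state, std::ostream & out, size_t version)
{
    switch (version)
    {
        case 0: writeFixed(out, state.sum); writeFixed(out, state.count); break;
        case 1: writeVarUInt(out, state.sum); writeVarUInt(out, state.count); break;
        default: throw std::invalid_argument("unknown state version");
    }
}

AvgState deserialize(std::istream & in, size_t version)
{
    AvgState state;
    switch (version)
    {
        case 0: state.sum = readFixed(in); state.count = readFixed(in); break;
        case 1: state.sum = readVarUInt(in); state.count = readVarUInt(in); break;
        default: throw std::invalid_argument("unknown state version");
    }
    return state;
}

// Usage: a state written with any known version reads back unchanged.
void exampleRoundTrip()
{
    AvgState state;
    state.sum = 300;
    state.count = 4;
    for (size_t version = 0; version <= latest_version; ++version)
    {
        std::stringstream buf;
        serialize(state, buf, version);
        AvgState restored = deserialize(buf, version);
        assert(restored.sum == state.sum && restored.count == state.count);
    }
}
```

Passing the version explicitly, rather than embedding it in the serialized bytes, keeps old states readable as they are: the version comes from the data type, not from the data.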

When a user creates a data type `AggregateFunction(...)`, e.g. `AggregateFunction(avg, UInt64)`, we rewrite it by adding the most recent version as the first parameter, e.g. `AggregateFunction(v1, avg, UInt64)`.

The user will see the data type with the version number in `SHOW CREATE TABLE`, `DESCRIBE TABLE`, etc.

When a data type `AggregateFunction(...)` without a version already appears in a table definition or in a serialization format (e.g. Native), version 0 is assumed implicitly.
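The naming rules can be illustrated with a small string-based sketch. This is only an assumption about how the rewrite might behave: the helpers `tryParseVersion`, `effectiveVersion` and `withVersion` are hypothetical, and a real implementation would presumably work on the parsed type arguments rather than on raw strings.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

const std::string prefix = "AggregateFunction(";

// Extracts an explicit version if the first parameter has the form `v<digits>`.
// Returns false for unversioned types and for types that are not AggregateFunction.
bool tryParseVersion(const std::string & type_name, size_t & version)
{
    if (type_name.compare(0, prefix.size(), prefix) != 0)
        return false;
    size_t pos = prefix.size();
    if (pos >= type_name.size() || type_name[pos] != 'v')
        return false;
    size_t end = pos + 1;
    while (end < type_name.size() && std::isdigit(static_cast<unsigned char>(type_name[end])))
        ++end;
    // Require at least one digit followed by a comma, so that a function
    // name such as `varSamp` is not mistaken for a version.
    if (end == pos + 1 || end >= type_name.size() || type_name[end] != ',')
        return false;
    version = std::stoul(type_name.substr(pos + 1, end - pos - 1));
    return true;
}

// A type name without a version implicitly means version 0.
size_t effectiveVersion(const std::string & type_name)
{
    size_t version = 0;
    return tryParseVersion(type_name, version) ? version : 0;
}

// Adds the version as the first parameter. Names that already carry a version
// are left alone, and version 0 is never printed.
std::string withVersion(const std::string & type_name, size_t version)
{
    size_t existing = 0;
    if (version == 0
        || type_name.compare(0, prefix.size(), prefix) != 0
        || tryParseVersion(type_name, existing))
        return type_name;
    return prefix + "v" + std::to_string(version) + ", " + type_name.substr(prefix.size());
}

// Usage: the example from the proposal.
void exampleTypeNames()
{
    assert(withVersion("AggregateFunction(avg, UInt64)", 1) == "AggregateFunction(v1, avg, UInt64)");
    assert(effectiveVersion("AggregateFunction(avg, UInt64)") == 0);
    assert(effectiveVersion("AggregateFunction(v1, avg, UInt64)") == 1);
}
```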

When sending data to a client over the native protocol, the client's revision is taken into account. `IAggregateFunction` should have a method that determines the maximum supported version for a given client revision, and the version of the `AggregateFunction` data type is adjusted accordingly. If the `AggregateFunction` data type has version zero, the version is not printed in the data type name.
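One way the revision check might look, as a sketch only. The table of revisions is made up for illustration, and `maxSupportedVersion` and `versionToSend` are hypothetical names; the proposal does not specify how the mapping is stored or how the sent version is chosen beyond respecting the client revision.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Records the first client revision that understands a given state version.
struct VersionIntroduction
{
    uint64_t min_client_revision;
    size_t state_version;
};

// Made-up revision numbers, sorted in ascending order. Version 0 is understood by every client.
const VersionIntroduction introductions[] = {
    {0, 0},
    {54440, 1},
    {54460, 2},
};

// Highest state version that a client with the given revision can read.
size_t maxSupportedVersion(uint64_t client_revision)
{
    size_t result = 0;
    for (const auto & intro : introductions)
        if (client_revision >= intro.min_client_revision)
            result = intro.state_version;
    return result;
}

// Assumption: the server never sends a newer version than the client supports,
// so the version of the column is capped by the client's maximum.
size_t versionToSend(size_t column_version, uint64_t client_revision)
{
    return std::min(column_version, maxSupportedVersion(client_revision));
}

// Usage: an old client receives version 0, so the type name carries no version for it.
void exampleNegotiation()
{
    assert(maxSupportedVersion(54000) == 0);
    assert(versionToSend(2, 54000) == 0);
    assert(versionToSend(2, 54440) == 1);
    assert(versionToSend(1, 99999) == 1);
}
```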

**Scenarios**

1. The server sends data to an old client. Should work seamlessly.
2. The server sends data to an old client that is actually another server that initiated a distributed query. Should work seamlessly.
3. We have a table with stored columns of an AggregateFunction data type; then we upgrade the server and continue to read from and write to that table. Should work seamlessly.
4. We have a table with stored columns of an AggregateFunction data type; then we upgrade the server and continue to read from and write to that table. Then we downgrade the server and continue to read from and write to that table. Should work seamlessly.
5. We have created a dump in a format such as TSVWithNamesAndTypes or CSVWithNamesAndTypes on an old server, and then try to upload it to a new server. Should work seamlessly.
6. We have created a dump in a format without data types (like TSV or RowBinary) on an old server, and then try to upload it to a new server. This may require the user to explicitly specify the version in the data type when creating the table.

Versioning of AggregateFunction states. #12552
