Roadmap 2022 (discussion) #32513

@alexey-milovidov

This is the ClickHouse open-source roadmap for 2022.
Descriptions and links to be filled in.

This roadmap does not cover the tasks related to infrastructure, orchestration, documentation, marketing, integrations, SaaS, drivers, etc.

See also:

Roadmap 2021: #17623
Roadmap 2020: in Russian

Main Tasks

✔️ Make clickhouse-keeper Production Ready

✔️ It is already feature-complete and being used in production.
✔️ Update documentation to replace ZooKeeper with clickhouse-keeper everywhere.

✔️ Support for Backup and Restore

✔️ Backup of tables, databases, servers and clusters.
✔️ Incremental backups. Support for partial restore.
✔️ Support for pluggable backup storage options.

✔️ Semistructured Data

✔️ JSON data type with automatic type inference and dynamic subcolumns.
✔️ Sparse column format and optimization of functions for sparse columns. #22535
Dynamic selection of column format - full, const, sparse, low cardinality.
✔️ Hybrid wide/compact data part format for a huge number of columns.

✔️ Type Inference for Data Import

✔️ Allow skipping column names and types if the data format already contains a schema (e.g. Parquet, Avro).
✔️ Allow inferring types for text formats (e.g. CSV, TSV, JSONEachRow).

#32455
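
To illustrate what type inference for a text format involves, here is a minimal Python sketch. It is not ClickHouse's actual inference logic: the helper names `infer_type` and `infer_schema` are hypothetical, and the rule used (try integer, then float, else fall back to string) is an assumption chosen for brevity. Only the type names `Int64`, `Float64` and `String` are borrowed from ClickHouse.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest type that fits every non-empty value in a column."""
    def all_parse(cast):
        try:
            for v in values:
                if v != "":
                    cast(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "Int64"
    if all_parse(float):
        return "Float64"
    return "String"


def infer_schema(text):
    """Read CSV text with a header row and guess a type for each column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = list(zip(*reader))  # transpose rows into columns
    return {name: infer_type(col) for name, col in zip(header, columns)}


sample = "id,price,name\n1,9.5,apple\n2,3,pear\n"
schema = infer_schema(sample)
```

With the sample above, `price` is widened to a float type because one of its values has a fractional part, and `name` falls back to a string.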

Support for Transactions

Atomic insert of more than one block or to more than one partition into MergeTree and ReplicatedMergeTree tables.
Atomic insert into table and dependent materialized views. Atomic insert into multiple tables.
Multiple SELECTs from one consistent snapshot.
Atomic insert into distributed table.

✔️ Lightweight DELETE

✔️ Make mutations more lightweight by using delete-masks.
✔️ It won't enable frequent UPDATE/DELETE as in OLTP databases, but it will bring ClickHouse closer to that.
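
The delete-mask idea can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: the `Part` class and its methods are hypothetical and do not reflect how ClickHouse stores parts or masks. The point it shows is that a delete only flips bits in a mask, reads filter by that mask, and the row data is physically rewritten only later (here, in `merge`).

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """Hypothetical immutable data part with a per-row delete mask."""
    rows: list
    deleted: list = field(default_factory=list)

    def __post_init__(self):
        # Start with no rows marked as deleted.
        if not self.deleted:
            self.deleted = [False] * len(self.rows)

    def lightweight_delete(self, predicate):
        """Mark matching rows as deleted without rewriting the row data."""
        for i, row in enumerate(self.rows):
            if predicate(row):
                self.deleted[i] = True

    def scan(self):
        """Return only the rows whose mask bit is not set."""
        return [row for row, gone in zip(self.rows, self.deleted) if not gone]

    def merge(self):
        """Physically drop masked rows, as a later background rewrite might."""
        return Part(rows=self.scan())


part = Part(rows=[1, 2, 3, 4, 5])
part.lightweight_delete(lambda r: r % 2 == 0)
visible = part.scan()
compacted = part.merge()
```

After the delete, `part.rows` still holds all five values; only `scan()` and the compacted part omit the masked ones.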

✔️ SQL Compatibility Improvements

✔️ Untangle name resolution and query analysis.
✔️ Initial support for correlated subqueries.
✔️ Allow using window functions inside expressions.
✔️ Add compatibility aliases for some window functions, etc.
✔️ Support for GROUPING SETS.

JOIN Improvements

✔️ Support for join reordering.
✔️ Extend the cases when condition pushdown is applicable.
Convert anti-join to NOT IN.
✔️ Use table sorting for DISTINCT optimization.
✔️ Use table sorting for merge JOIN.
✔️ Grace hash join algorithm.
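
For readers unfamiliar with the grace hash join mentioned above, here is a rough Python sketch of the general technique, not of ClickHouse's implementation. The function name `grace_hash_join` and the `num_buckets` parameter are hypothetical. A real implementation would spill the partitions to disk so that only one bucket pair is in memory at a time; in-memory lists stand in for those spill files here.

```python
from collections import defaultdict


def grace_hash_join(left, right, key, num_buckets=4):
    """Inner join of two lists of dicts on `key`, processed bucket by bucket."""
    # Phase 1: partition both inputs by a hash of the join key.
    # Rows with equal keys always land in the same bucket number.
    left_buckets = defaultdict(list)
    right_buckets = defaultdict(list)
    for row in left:
        left_buckets[hash(row[key]) % num_buckets].append(row)
    for row in right:
        right_buckets[hash(row[key]) % num_buckets].append(row)

    # Phase 2: join each pair of buckets independently with a small hash table.
    result = []
    for b in range(num_buckets):
        table = defaultdict(list)
        for row in right_buckets.get(b, []):
            table[row[key]].append(row)
        for row in left_buckets.get(b, []):
            for match in table.get(row[key], []):
                result.append({**row, **match})
    return result


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}, {"id": 3, "b": "r"}]
joined = sorted(grace_hash_join(left, right, "id"), key=lambda r: (r["id"], r["b"]))
```

The output order depends on bucket assignment, which is why the usage example sorts the result before inspecting it.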

Resource Management

✔️ Memory overcommit (soft and hard memory limits).
✔️ Enable external GROUP BY and ORDER BY by default.
✔️ IO operations scheduler with priorities.
✔️ Make scalar subqueries accountable.
✔️ CPU and network priorities.

Separation of Storage and Compute

✔️ Parallel reading from replicas.
✔️ Dynamic cluster configuration with service discovery.
✔️ Caching of data from object storage.
Simplification of ReplicatedMergeTree.
✔️ Shared metadata storage.

Experimental and Intern Tasks

Streaming Queries

Fix POPULATE for materialized views.
Unification of materialized views, live views and window views.
Allow setting up subscriptions on top of all tables, including Merge and Distributed.
✔️ Normalization of Kafka tables with storing offsets in ClickHouse.
✔️ Support for exactly once consumption from Kafka, non-consuming reads and multiple consumers.
Streaming queries with GROUP BY and ORDER BY with windowing criteria.
Persistent queues on top of ClickHouse tables.

Integration with ML/AI

🗑️ Integration with Tensorflow
🗑️ Integration with MADLib

GPU Support

🗑️ Compile expressions to GPU

Unique Key Constraint

User-Defined Data Types

Incremental aggregation in memory

Key-value data marts

Text Classification

Graph Processing

Foreign SQL Dialects in ClickHouse

🗑️ Support for MySQL dialect or Apache Calcite as an option.

✔️ Batch Jobs and Refreshable Materialized Views

✔️ Embedded ClickHouse Engine

Data Hub

Build And Testing Improvements

Testing

✔️ Add tests for AArch64 builds.
✔️ Automated tests for backward compatibility.
✔️ Server-side query fuzzer for all kinds of tests.
✔️ Fuzzing of query settings in functional tests.
SQL function-based fuzzer.
Fuzzer of data formats.
✔️ Integrate with SQLogicTest.
Import obfuscated queries from Yandex Metrica.

Builds

✔️ Docker images for AArch64.
✔️ Enable missing libraries for AArch64 builds.
✔️ Add and explore Musl builds.
Build all libraries with our own CMake files.
Embed root certificates to the binary.
Embed DNS resolver to the binary.
Add ClickHouse to Snap, so people will not install obsolete versions by accident.
