The system is not fully utilized when importing data from clickhouse-client. #42372

@alexey-milovidov

Describe the situation

See #42363

CREATE TABLE passwords
(
    `hash` FixedString(20) CODEC(ZSTD(6)),
    `count` UInt32 CODEC(ZSTD(6))
)
ENGINE = MergeTree
ORDER BY hash

clickhouse-client --progress --query "INSERT INTO passwords WITH splitByChar(':', line) AS columns SELECT unhex(columns[1]), replaceOne(columns[2], '\r', '') FROM input('line String') FORMAT LineAsString" < pwned-passwords-sha1-ordered-by-hash-v8.txt

clickhouse-client is using only 20..30% CPU and clickhouse-server is using 300..400% CPU (about 3 to 4 cores), but I have many more CPU cores available.

To solve this problem, clickhouse-server could respect max_insert_threads for inserts coming from the client: it would keep accepting more blocks of data from the client while the previous blocks are still being processed.
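
The idea can be illustrated with a small toy model in Python. This is only a sketch of the proposed behaviour, not ClickHouse code: `insert_pipelined`, `parse_block`, and `max_pending_blocks` are hypothetical names introduced for illustration, and only `max_insert_threads` comes from the issue. A receiver hands each incoming block to a bounded queue and immediately moves on to the next one, while several workers process earlier blocks. The parsing step mimics the query above (split on ':', unhex the first field, strip '\r' from the second).

```python
import queue
import threading


def parse_block(lines):
    """Mimic the INSERT query: 'HEXHASH:COUNT\\r' -> (20 raw bytes, int)."""
    rows = []
    for line in lines:
        hex_hash, count = line.split(":")
        rows.append((bytes.fromhex(hex_hash), int(count.replace("\r", ""))))
    return rows


def insert_pipelined(blocks, process_block, max_insert_threads=4, max_pending_blocks=8):
    """Toy model: keep receiving blocks while earlier blocks are still processed.

    Assumption: process_block does not raise; error handling is omitted.
    """
    # Bounded queue: the receiver is slowed down only when workers fall behind.
    pending = queue.Queue(maxsize=max_pending_blocks)
    results = []
    results_lock = threading.Lock()
    stop = object()  # sentinel: no more blocks for this worker

    def worker():
        while True:
            block = pending.get()
            if block is stop:
                return
            out = process_block(block)
            with results_lock:
                results.append(out)

    workers = [threading.Thread(target=worker) for _ in range(max_insert_threads)]
    for w in workers:
        w.start()

    # Receiver: hand off each block and go straight on to the next one.
    for block in blocks:
        pending.put(block)
    for _ in workers:
        pending.put(stop)
    for w in workers:
        w.join()
    return results  # completion order, not arrival order


if __name__ == "__main__":
    demo_blocks = [["00" * 20 + ":" + str(i) + "\r"] for i in range(8)]
    parsed = insert_pipelined(demo_blocks, parse_block, max_insert_threads=4)
    print(len(parsed), "blocks processed")
```

In CPython the global interpreter lock prevents real CPU parallelism for this kind of work, so the sketch shows only the structure (receive and process overlapping, with a bounded number of pending blocks), not the expected speedup on the server.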

Labels

performance, warmup task (The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.)
