Skip to content

Distributed Sends got stuck #18369

@wedi-dev

Description

@wedi-dev

We have big problem in one of our production nodes. The data in Distribution Table is not send to the Data Tables. We had to change weight of shards which consisted of that node and detached that node from LB. We ended with 30GB of data in Distributed Table and it was not send for over 5 days. We investigated that Clickhouse tries to process those files and send them but it ends with "Cannot read all data. Bytes read: 27523. Bytes expected: 35560. (version 20.3.8.53 (official build))"

Then we checked bin files with clickhouse-local using engine=File(Distributed) to open those files and we got same error. There are 1,1mln bin files and 300 of them are broken and freezes Distributed Sends.

Why Clickhouse does not move those files to the broken Directory but gets stucked?

version 20.3.8.53 (official build)

Expected behavior
Broken bin files moved to broken and Distributed Sends continue for valid bin files

Error message and/or stacktrace

clickhouse-local -q "create table table (date Date) engine=File(Distributed, '76914002.bin'); select * from table limit 1" --format Vertical
Code: 33, e.displayText() = DB::Exception: Cannot read all data. Bytes read: 27523. Bytes expected: 35560. (version 20.3.8.53 (official build))

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugConfirmed user-visible misbehaviour in official releasecomp-distributedDistributed table engine & query routing across shards (sharding/load balancing).

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions