-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Distributed Sends got stuck #18369
Description
We have big problem in one of our production nodes. The data in Distribution Table is not send to the Data Tables. We had to change weight of shards which consisted of that node and detached that node from LB. We ended with 30GB of data in Distributed Table and it was not send for over 5 days. We investigated that Clickhouse tries to process those files and send them but it ends with "Cannot read all data. Bytes read: 27523. Bytes expected: 35560. (version 20.3.8.53 (official build))"
Then we checked bin files with clickhouse-local using engine=File(Distributed) to open those files and we got same error. There are 1,1mln bin files and 300 of them are broken and freezes Distributed Sends.
Why Clickhouse does not move those files to the broken Directory but gets stucked?
version 20.3.8.53 (official build)
Expected behavior
Broken bin files moved to broken and Distributed Sends continue for valid bin files
Error message and/or stacktrace
clickhouse-local -q "create table table (date Date) engine=File(Distributed, '76914002.bin'); select * from table limit 1" --format Vertical
Code: 33, e.displayText() = DB::Exception: Cannot read all data. Bytes read: 27523. Bytes expected: 35560. (version 20.3.8.53 (official build))