Fix broken part error while restoring to s3_plain_rewritable#66881
Conversation
There was a problem hiding this comment.
This is the first part of the fix. object_storage.isWriteOnce() returns false for s3_plain_rewritable disks, so it could actually remove restored files immediately after copying them to a destination disk.
Checking object_storage.isPlain() instead is more correct here and it also corresponds to DiskObjectStorageTransaction::writeFile().
There was a problem hiding this comment.
This is the second part of the fix. During RESTORE it copies parts from a backup to temporary directories on destination disks, and then attaches those parts to tables.
Earlies it created those temporary directories as subfolders of the /tmp/ folder on the destination disks. That didn't work good for the s3_plain_rewritable disk type because that kind of disk can't move a folder to any place, it can only rename it (keeping its parent the same).
So I changed that in this PR and now temporary directories are created in the data folder of a destination table, for example
/var/lib/clickhouse/data/db_name/table_name/tmp_restore_all_0_1_1_0-XXXXXXXX
So to attach such folders they have to be only renamed (and not to be moved to another location).
There was a problem hiding this comment.
I'm somewhat worried this change affects all disk types, not just plain_rewritable, though I can't think of any actual pitfalls. The advantage of the /tmp/ folder is that it provides a clear separation between temporary and persistent fs paths. Maybe scope this change to the plain_rewritable disk only?
There was a problem hiding this comment.
Actually the new way of naming temporary directories corresponds to how temporary directories have always been named by the INSERT command:
/var/lib/clickhouse/data/db_name/table_name/tmp_insert_all_0_1_1_0
So I think it could be just more consistent to use the same idea for RESTORE.
There was a problem hiding this comment.
The error message columns.txt is not found was not very helpful while I was investigating this issue. So I decided to improve the error message in case this error ever comes back.
|
This is an automated comment for commit 2e25808 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
use "/var/lib/clickhouse/data/test/table/tmp_restore_all_0_1_1_0-XXXXXXXX" instead of "/tmp/XXXXXXXX/data/test/table/0_1_1_0" (because directories can only be renamed in s3_plain_rewritable, and not moved to another parent directory).
8b200c1 to
eb519c5
Compare
|
|
|
The |
|
LGTM, but the tests have failed :( I'll look at them in the morning. Please also add 'pr-must-backport-cloud' tag. |
Test failures are unrelated:
|
Changelog category:
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix broken part error while restoring to a
s3_plain_rewritabledisk.This PR fixes mysterious error
columns.txt is not foundwhile restoring parts to a disk with types3_plain_rewritable.