
The problem of data part merging while using zero copy replication #31843

@zxealous


I am testing zero-copy replication on HDFS. The replicas create the table with the same storage_policy, so they use the same data on HDFS and the directory is shared. However, I have run into a problem: data part merges fail with the error below.
I understand that the replicas are supposed to use a shared directory on HDFS, but why can't the data parts be merged? Is my setup wrong?

`2021.11.24 20:14:58.443628 [ 48666 ] {} default.hits_v1_1116_1 (2cf38cbb-8471-4d06-acf3-8cbb84711d06): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(DB::StorageReplicatedMergeTree::LogEntryPtr &) const: Code: 84. DB::Exception: Directory /home/disk7/zcy/hdfs-ck/data-01/data/disks/hdfs1/store/2cf/2cf38cbb-8471-4d06-acf3-8cbb84711d06/tmp_merge_201403_6_11_1/ already exists. (DIRECTORY_ALREADY_EXISTS), Stack trace (when copying this message, always include the lines below):

Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, int) @ 0x12dc58ec in ?
DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, int, bool) @ 0x9616ada in ?
DB::MergeTreeDataMergerMutator::mergePartsToTemporaryPart(DB::FutureMergedMutatedPart const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::MergeListElement&, std::__1::shared_ptrDB::RWLockImpl::LockHolderImpl&, long, std::__1::shared_ptr<DB::Context const>, std::__1::unique_ptr<DB::IReservation, std::__1::default_deleteDB::IReservation > const&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > > > const&, DB::MergeTreeData::MergingParams const&, DB::IMergeTreeDataPart const*, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) @ 0x1099e802 in ?
DB::StorageReplicatedMergeTree::tryExecuteMerge(DB::ReplicatedMergeTreeLogEntry const&) @ 0x107483d8 in ?
DB::StorageReplicatedMergeTree::executeLogEntry(DB::ReplicatedMergeTreeLogEntry&) @ 0x1073cae9 in ?
bool std::__1::__function::__policy_invoker<bool (std::__1::shared_ptrDB::ReplicatedMergeTreeLogEntry&)>::__call_impl<std::__1::__function::__default_alloc_func<DB::StorageReplicatedMergeTree::processQueueEntry(std::__1::shared_ptrDB::ReplicatedMergeTreeQueue::SelectedEntry)::$_14, bool (std::__1::shared_ptrDB::ReplicatedMergeTreeLogEntry&)> >(std::__1::__function::__policy_storage const*, std::__1::shared_ptrDB::ReplicatedMergeTreeLogEntry&) @ 0x107c2bbf in ?
DB::ReplicatedMergeTreeQueue::processEntry(std::__1::function<std::__1::shared_ptrzkutil::ZooKeeper ()>, std::__1::shared_ptrDB::ReplicatedMergeTreeLogEntry&, std::__1::function<bool (std::__1::shared_ptrDB::ReplicatedMergeTreeLogEntry&)>) @ 0x10ab85a9 in ?
DB::StorageReplicatedMergeTree::processQueueEntry(std::__1::shared_ptrDB::ReplicatedMergeTreeQueue::SelectedEntry) @ 0x10773ebd in ?
void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::IBackgroundJobExecutor::execute(DB::JobAndPool)::$_0, void ()> >(std::__1::__function::__policy_storage const*) @ 0x108ddcf6 in ?
ThreadPoolImpl::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0x964c28b in ?
ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl::scheduleImpl(std::__1::function<void ()>, int, std::__1::optional)::'lambda0'()>(void&&, void ThreadPoolImpl::scheduleImpl(std::__1::function<void ()>, int, std::__1::optional)::'lambda0'()&&...)::'lambda'()::operator()() @ 0x964d8ff in ?
ThreadPoolImplstd::__1::thread::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x964a9ab in ?
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, void ThreadPoolImplstd::__1::thread::scheduleImpl(std::__1::function<void ()>, int, std::__1::optional)::'lambda0'()> >(void*) @ 0x964cad3 in ?
start_thread @ 0x318b207851 in ?
__clone @ 0x318aee767d in ?
(version 21.10.2.1)`
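For anyone triaging similar logs, here is a minimal sketch that pulls the conflicting temporary directory out of the error message and decodes the part it refers to. It assumes the usual MergeTree part naming of partition, min block, max block, level, with a `tmp_merge_` prefix on the temporary directory. The `TmpMergePart` type and `parse_tmp_merge_error` helper are hypothetical names for illustration, not part of ClickHouse.

```python
import re
from typing import NamedTuple, Optional


class TmpMergePart(NamedTuple):
    """Fields decoded from a tmp_merge_* directory name (hypothetical helper type)."""
    directory: str
    partition_id: str
    min_block: int
    max_block: int
    level: int


# Assumption: the temporary directory is named
# tmp_merge_<partition>_<min_block>_<max_block>_<level>.
_PATTERN = re.compile(
    r"Directory (?P<dir>\S*/tmp_merge_"
    r"(?P<partition>[^_/]+)_(?P<min>\d+)_(?P<max>\d+)_(?P<level>\d+)/?)"
    r" already exists"
)


def parse_tmp_merge_error(line: str) -> Optional[TmpMergePart]:
    """Return the decoded temporary merge part, or None if the line does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return TmpMergePart(
        directory=m["dir"],
        partition_id=m["partition"],
        min_block=int(m["min"]),
        max_block=int(m["max"]),
        level=int(m["level"]),
    )


# The message text is copied from the log above.
message = (
    "DB::Exception: Directory /home/disk7/zcy/hdfs-ck/data-01/data/disks/hdfs1/"
    "store/2cf/2cf38cbb-8471-4d06-acf3-8cbb84711d06/tmp_merge_201403_6_11_1/ "
    "already exists. (DIRECTORY_ALREADY_EXISTS)"
)
part = parse_tmp_merge_error(message)
print(part)
```

Under that naming assumption, the failing entry is a merge in partition 201403 covering blocks 6 through 11 into a level-1 part. The temporary directory for that merge already exists under the table's path on the `hdfs1` disk.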

Labels

comp-object-storage: Object storage connectivity (S3/GCS/Azure) including credentials, retries, multipart, etc.
experimental feature: Bug in the feature that should not be used in production
