Write current batch for distributed send atomically (using .tmp + rename)#7600
Conversation
8642371 to
500d968
Compare
There was a problem hiding this comment.
What will happen if the file already exists? Need a comment that it will truncate the existing file.
Also need a comment about what will hapen with obsolete .tmp files (nothing and user should clean manually).
There was a problem hiding this comment.
Also need a comment about what will hapen with obsolete .tmp files (nothing and user should clean manually).
Should user care about these .tmp files, if they will be simply overwritten?
There was a problem hiding this comment.
Should user care about these .tmp files, if they will be simply overwritten?
No. But as far as I remember, we use auto-increment, so the next file will have different name.
Will be fixed after you merge with master. |
500d968 to
f9f0edf
Compare
There was a problem hiding this comment.
But this is a race condition. Truncate (the default) is better.
There was a problem hiding this comment.
This is simply to put a message into the log
…ame)
Otherwise the following can happen after reboot:
2019.11.01 11:46:12.217143 [ 187 ] {} <Error> dist.Distributed.DirectoryMonitor: Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected \n before: S\'^A\0^]\0\0<BE>4^A\0r<87>\0\0<A2><D7>^D^Y\0<F2>{^E<CD>\0\0Hy\0\0<F2>^_^C\0^_&\0\0<FF><D3>\0\0
<8D><91>\0\0<C0>9\0\0<C0><B0>^A\0^G<AA>\0\0<B5><FE>^A\0<BF><A7>^A\0<9B><CB>^A\0I^R^A\0<B7><AB>^A\0<BC><8F>\0\0˲^B\0Zy\0\0<94><AA>\0\0<98>
<8F>\0\0\f<A5>\0\0^QN\0\0<E3><C6>\0\0<B1>6^B\0ɳ\0\0W<99>\0\0<B9><A2>\0\0:<BB>\0\0)<B1>\0\0#<8B>\0\0aW\0\0<ED>#\0\0<F1>@\0\0ˀ^B\0<D7><FC>\0\0<DF>, Stack trace:
0. 0x559e27222e60 StackTrace::StackTrace() /usr/bin/clickhouse
1. 0x559e27222c45 DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) /usr/bin/clickhouse
2. 0x559e26de4473 ? /usr/bin/clickhouse
3. 0x559e272494b5 DB::assertString(char const*, DB::ReadBuffer&) /usr/bin/clickhouse
4. 0x559e2a5dab45 DB::StorageDistributedDirectoryMonitor::processFilesWithBatching(std::map<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) /usr/bin/clickhouse
5. 0x559e2a5db5fa DB::StorageDistributedDirectoryMonitor::processFiles() /usr/bin/clickhouse
6. 0x559e2a5dba78 DB::StorageDistributedDirectoryMonitor::run() /usr/bin/clickhouse
7. 0x559e2a5ddbbc ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::StorageDistributedDirectoryMonitor::*)(), DB::StorageDistributedDirectoryMonitor*>(void (DB::StorageDistributedDirectoryMonitor::*&&)(), DB::StorageDistributedDirectoryMonitor*&&)::{lambda()ClickHouse#1}::operator()() const /usr/bin/clickhouse
8. 0x559e2726b07c ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) /usr/bin/clickhouse
9. 0x559e2bbc3640 ? /usr/bin/clickhouse
10. 0x7fbd62b3cfb7 start_thread /lib/x86_64-linux-gnu/libpthread-2.29.so
11. 0x7fbd62a692ef __clone /lib/x86_64-linux-gnu/libc-2.29.so
(version 19.17.1.1)
v2: remove fsync, to avoid possible stalls (ClickHouse#7600 (comment))
f9f0edf to
4dfffdd
Compare
…ame)
Otherwise the following can happen after reboot:
2019.11.01 11:46:12.217143 [ 187 ] {} <Error> dist.Distributed.DirectoryMonitor: Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected \n before: S\'^A\0^]\0\0<BE>4^A\0r<87>\0\0<A2><D7>^D^Y\0<F2>{^E<CD>\0\0Hy\0\0<F2>^_^C\0^_&\0\0<FF><D3>\0\0
<8D><91>\0\0<C0>9\0\0<C0><B0>^A\0^G<AA>\0\0<B5><FE>^A\0<BF><A7>^A\0<9B><CB>^A\0I^R^A\0<B7><AB>^A\0<BC><8F>\0\0˲^B\0Zy\0\0<94><AA>\0\0<98>
<8F>\0\0\f<A5>\0\0^QN\0\0<E3><C6>\0\0<B1>6^B\0ɳ\0\0W<99>\0\0<B9><A2>\0\0:<BB>\0\0)<B1>\0\0#<8B>\0\0aW\0\0<ED>#\0\0<F1>@\0\0ˀ^B\0<D7><FC>\0\0<DF>, Stack trace:
0. 0x559e27222e60 StackTrace::StackTrace() /usr/bin/clickhouse
1. 0x559e27222c45 DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) /usr/bin/clickhouse
2. 0x559e26de4473 ? /usr/bin/clickhouse
3. 0x559e272494b5 DB::assertString(char const*, DB::ReadBuffer&) /usr/bin/clickhouse
4. 0x559e2a5dab45 DB::StorageDistributedDirectoryMonitor::processFilesWithBatching(std::map<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) /usr/bin/clickhouse
5. 0x559e2a5db5fa DB::StorageDistributedDirectoryMonitor::processFiles() /usr/bin/clickhouse
6. 0x559e2a5dba78 DB::StorageDistributedDirectoryMonitor::run() /usr/bin/clickhouse
7. 0x559e2a5ddbbc ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::StorageDistributedDirectoryMonitor::*)(), DB::StorageDistributedDirectoryMonitor*>(void (DB::StorageDistributedDirectoryMonitor::*&&)(), DB::StorageDistributedDirectoryMonitor*&&)::{lambda()#1}::operator()() const /usr/bin/clickhouse
8. 0x559e2726b07c ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) /usr/bin/clickhouse
9. 0x559e2bbc3640 ? /usr/bin/clickhouse
10. 0x7fbd62b3cfb7 start_thread /lib/x86_64-linux-gnu/libpthread-2.29.so
11. 0x7fbd62a692ef __clone /lib/x86_64-linux-gnu/libc-2.29.so
(version 19.17.1.1)
v2: remove fsync, to avoid possible stalls (#7600 (comment))
…ame)
Otherwise the following can happen after reboot:
2019.11.01 11:46:12.217143 [ 187 ] {} <Error> dist.Distributed.DirectoryMonitor: Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected \n before: S\'^A\0^]\0\0<BE>4^A\0r<87>\0\0<A2><D7>^D^Y\0<F2>{^E<CD>\0\0Hy\0\0<F2>^_^C\0^_&\0\0<FF><D3>\0\0
<8D><91>\0\0<C0>9\0\0<C0><B0>^A\0^G<AA>\0\0<B5><FE>^A\0<BF><A7>^A\0<9B><CB>^A\0I^R^A\0<B7><AB>^A\0<BC><8F>\0\0˲^B\0Zy\0\0<94><AA>\0\0<98>
<8F>\0\0\f<A5>\0\0^QN\0\0<E3><C6>\0\0<B1>6^B\0ɳ\0\0W<99>\0\0<B9><A2>\0\0:<BB>\0\0)<B1>\0\0#<8B>\0\0aW\0\0<ED>#\0\0<F1>@\0\0ˀ^B\0<D7><FC>\0\0<DF>, Stack trace:
0. 0x559e27222e60 StackTrace::StackTrace() /usr/bin/clickhouse
1. 0x559e27222c45 DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) /usr/bin/clickhouse
2. 0x559e26de4473 ? /usr/bin/clickhouse
3. 0x559e272494b5 DB::assertString(char const*, DB::ReadBuffer&) /usr/bin/clickhouse
4. 0x559e2a5dab45 DB::StorageDistributedDirectoryMonitor::processFilesWithBatching(std::map<unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) /usr/bin/clickhouse
5. 0x559e2a5db5fa DB::StorageDistributedDirectoryMonitor::processFiles() /usr/bin/clickhouse
6. 0x559e2a5dba78 DB::StorageDistributedDirectoryMonitor::run() /usr/bin/clickhouse
7. 0x559e2a5ddbbc ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::StorageDistributedDirectoryMonitor::*)(), DB::StorageDistributedDirectoryMonitor*>(void (DB::StorageDistributedDirectoryMonitor::*&&)(), DB::StorageDistributedDirectoryMonitor*&&)::{lambda()#1}::operator()() const /usr/bin/clickhouse
8. 0x559e2726b07c ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>) /usr/bin/clickhouse
9. 0x559e2bbc3640 ? /usr/bin/clickhouse
10. 0x7fbd62b3cfb7 start_thread /lib/x86_64-linux-gnu/libpthread-2.29.so
11. 0x7fbd62a692ef __clone /lib/x86_64-linux-gnu/libc-2.29.so
(version 19.17.1.1)
v2: remove fsync, to avoid possible stalls (#7600 (comment))
…-corruption Write current batch for distributed send atomically (using .tmp + rename) (cherry picked from commit 1bfade5)
Category (leave one):
Short description (up to few sentences):
Write current batch for distributed send atomically (using .tmp + rename)
Detailed description (optional):
Otherwise the following can happen after reboot: