-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Crash with Kafka engine (format Protobuf) when sending unexpected messages #5957
Copy link
Copy link
Closed
Closed
Copy link
Labels
bugConfirmed user-visible misbehaviour in official releaseConfirmed user-visible misbehaviour in official releasecomp-formatsInput/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).Input/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).crashCrash / segfault / abortCrash / segfault / abortst-need-infoWe need extra data to continue (waiting for response). Either some details or a repro of the issue.We need extra data to continue (waiting for response). Either some details or a repro of the issue.
Description
I have a table configured with Kafka engine and settings kafka_format='Protobuf',kafka_schema='test:FlowDataRecord',kafka_skip_broken_messages=1
and a simple materialized view to insert these data in a storage table.
If I send an unexpected message, for example a Protobuf message for another type than FlowDataRecord, I sometimes get a CH crash.
I'm using official Docker image of 19.9.2.4.
It is very hard to produce a simple test case to reproduce and I can't send my real table structures, so I share the crash logs, I hope that it is enough to find the bug.
Case 1:
I send a good message then a bad message, there are processed together by Kafka engine:
2019.07.10 08:39:39.674082 [ 12 ] {} <Trace> StorageKafka (kafka_queue): Polled batch of 2 messages
2019.07.10 08:39:39.674949 [ 12 ] {} <Trace> StorageKafka (kafka_queue): Re-joining claimed consumer after failure
2019.07.10 08:39:39.759439 [ 12 ] {} <Error> void DB::StorageKafka::streamThread(): Code: 173, e.displayText() = DB::ErrnoException: Allocator: Cannot realloc from 64.00 B to 0.00 B., errno: 0, strerror: Success, Stack trace:
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x7285206]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x39a84d2]
2. /usr/bin/clickhouse-server(DB::throwFromErrno(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int)+0x171) [0x726b271]
3. /usr/bin/clickhouse-server(AllocatorWithHint<false, AllocatorHints::DefaultHint, 67108864ul>::realloc(void*, unsigned long, unsigned long, unsigned long)+0x1a2) [0x3a03112]
4. /usr/bin/clickhouse-server() [0x6d952f4]
5. /usr/bin/clickhouse-server(DB::ProtobufReader::SimpleReader::readStringInto(DB::PODArray<unsigned char, 4096ul, AllocatorWithHint<false, AllocatorHints::DefaultHint, 67108864ul>, 15ul, 16ul>&)+0x8b) [0x6d953db]
6. /usr/bin/clickhouse-server(DB::DataTypeString::deserializeProtobuf(DB::IColumn&, DB::ProtobufReader&, bool, bool&) const+0x58) [0x66426a8]
7. /usr/bin/clickhouse-server(DB::ProtobufRowInputStream::read(std::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, DB::RowReadExtension&)+0x131) [0x6accf41]
8. /usr/bin/clickhouse-server(DB::BlockInputStreamFromRowInputStream::readImpl()+0x168) [0x6d76c98]
9. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
10. /usr/bin/clickhouse-server(DB::KafkaBlockInputStream::readImpl()+0x28) [0x72627a8]
11. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
12. /usr/bin/clickhouse-server(DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::atomic<bool>*)+0x6b) [0x65b004b]
13. /usr/bin/clickhouse-server(DB::StorageKafka::streamToViews()+0x5cd) [0x725ca9d]
14. /usr/bin/clickhouse-server(DB::StorageKafka::streamThread()+0x1ba) [0x725cfea]
15. /usr/bin/clickhouse-server(DB::BackgroundSchedulePoolTaskInfo::execute()+0xfa) [0x653946a]
16. /usr/bin/clickhouse-server(DB::BackgroundSchedulePool::threadFunction()+0x6a) [0x6539b4a]
17. /usr/bin/clickhouse-server() [0x6539bc9]
18. /usr/bin/clickhouse-server(ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)+0x1af) [0x728b43f]
19. /usr/bin/clickhouse-server() [0xb1a1c9f]
20. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fa21c1a26db]
21. /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fa21b72188f]
(version 19.9.2.4 (official build))
2019.07.10 08:39:39.759515 [ 12 ] {} <Trace> StorageKafka (kafka_queue): Execution took 587 ms.
2019.07.10 08:39:40.259919 [ 13 ] {} <Debug> StorageKafka (kafka_queue): Started streaming to 1 attached views
2019.07.10 08:39:45.762889 [ 13 ] {} <Trace> StorageKafka (kafka_queue): Polled batch of 0 messages
2019.07.10 08:39:45.763388 [ 46 ] {} <Error> BaseDaemon: ########################################
2019.07.10 08:39:45.763511 [ 46 ] {} <Error> BaseDaemon: (version 19.9.2.4 (official build)) (from thread 13) Received signal Segmentation fault (11).
2019.07.10 08:39:45.763545 [ 46 ] {} <Error> BaseDaemon: Address: 0x10201a1848d0
2019.07.10 08:39:45.763556 [ 46 ] {} <Error> BaseDaemon: Access: read.
2019.07.10 08:39:45.763566 [ 46 ] {} <Error> BaseDaemon: Address not mapped to object.
2019.07.10 08:39:45.821619 [ 46 ] {} <Error> BaseDaemon: 0. /usr/bin/clickhouse-server(std::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >::~vector()+0x3f) [0x39b5d8f]
2019.07.10 08:39:45.821657 [ 46 ] {} <Error> BaseDaemon: 1. /usr/bin/clickhouse-server() [0x3802109]
2019.07.10 08:39:45.821676 [ 46 ] {} <Error> BaseDaemon: 2. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
2019.07.10 08:39:45.821692 [ 46 ] {} <Error> BaseDaemon: 3. /usr/bin/clickhouse-server(DB::KafkaBlockInputStream::readImpl()+0x28) [0x72627a8]
2019.07.10 08:39:45.821700 [ 46 ] {} <Error> BaseDaemon: 4. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
2019.07.10 08:39:45.821712 [ 46 ] {} <Error> BaseDaemon: 5. /usr/bin/clickhouse-server(DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::atomic<bool>*)+0x6b) [0x65b004b]
2019.07.10 08:39:45.821719 [ 46 ] {} <Error> BaseDaemon: 6. /usr/bin/clickhouse-server(DB::StorageKafka::streamToViews()+0x5cd) [0x725ca9d]
2019.07.10 08:39:45.821726 [ 46 ] {} <Error> BaseDaemon: 7. /usr/bin/clickhouse-server(DB::StorageKafka::streamThread()+0x1ba) [0x725cfea]
2019.07.10 08:39:45.821737 [ 46 ] {} <Error> BaseDaemon: 8. /usr/bin/clickhouse-server(DB::BackgroundSchedulePoolTaskInfo::execute()+0xfa) [0x653946a]
2019.07.10 08:39:45.821743 [ 46 ] {} <Error> BaseDaemon: 9. /usr/bin/clickhouse-server(DB::BackgroundSchedulePool::threadFunction()+0x6a) [0x6539b4a]
2019.07.10 08:39:45.821749 [ 46 ] {} <Error> BaseDaemon: 10. /usr/bin/clickhouse-server() [0x6539bc9]
2019.07.10 08:39:45.821757 [ 46 ] {} <Error> BaseDaemon: 11. /usr/bin/clickhouse-server(ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)+0x1af) [0x728b43f]
2019.07.10 08:39:45.821763 [ 46 ] {} <Error> BaseDaemon: 12. /usr/bin/clickhouse-server() [0xb1a1c9f]
2019.07.10 08:39:45.821769 [ 46 ] {} <Error> BaseDaemon: 13. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fa21c1a26db]
Case 2:
I directly send a bad message:
2019.07.10 08:45:10.365823 [ 17 ] {} <Trace> StorageKafka (kafka_queue): Polled batch of 1 messages
2019.07.10 08:45:10.866202 [ 17 ] {} <Trace> StorageKafka (kafka_queue): Polled batch of 0 messages
2019.07.10 08:45:10.866708 [ 46 ] {} <Error> BaseDaemon: ########################################
2019.07.10 08:45:10.866813 [ 46 ] {} <Error> BaseDaemon: (version 19.9.2.4 (official build)) (from thread 17) Received signal Segmentation fault (11).
2019.07.10 08:45:10.866847 [ 46 ] {} <Error> BaseDaemon: Address: NULL pointer.
2019.07.10 08:45:10.866864 [ 46 ] {} <Error> BaseDaemon: Access: read.
2019.07.10 08:45:10.866878 [ 46 ] {} <Error> BaseDaemon: Unknown si_code.
2019.07.10 08:45:10.934565 [ 46 ] {} <Error> BaseDaemon: 0. /usr/bin/clickhouse-server(std::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >::~vector()+0x3f) [0x39b5d8f]
2019.07.10 08:45:10.934610 [ 46 ] {} <Error> BaseDaemon: 1. /usr/bin/clickhouse-server() [0x3802109]
2019.07.10 08:45:10.934621 [ 46 ] {} <Error> BaseDaemon: 2. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
2019.07.10 08:45:10.934633 [ 46 ] {} <Error> BaseDaemon: 3. /usr/bin/clickhouse-server(DB::KafkaBlockInputStream::readImpl()+0x28) [0x72627a8]
2019.07.10 08:45:10.934644 [ 46 ] {} <Error> BaseDaemon: 4. /usr/bin/clickhouse-server(DB::IBlockInputStream::read()+0x188) [0x6591b08]
2019.07.10 08:45:10.934658 [ 46 ] {} <Error> BaseDaemon: 5. /usr/bin/clickhouse-server(DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::atomic<bool>*)+0x6b) [0x65b004b]
2019.07.10 08:45:10.934668 [ 46 ] {} <Error> BaseDaemon: 6. /usr/bin/clickhouse-server(DB::StorageKafka::streamToViews()+0x5cd) [0x725ca9d]
2019.07.10 08:45:10.934677 [ 46 ] {} <Error> BaseDaemon: 7. /usr/bin/clickhouse-server(DB::StorageKafka::streamThread()+0x1ba) [0x725cfea]
2019.07.10 08:45:10.934687 [ 46 ] {} <Error> BaseDaemon: 8. /usr/bin/clickhouse-server(DB::BackgroundSchedulePoolTaskInfo::execute()+0xfa) [0x653946a]
2019.07.10 08:45:10.934696 [ 46 ] {} <Error> BaseDaemon: 9. /usr/bin/clickhouse-server(DB::BackgroundSchedulePool::threadFunction()+0x6a) [0x6539b4a]
2019.07.10 08:45:10.934705 [ 46 ] {} <Error> BaseDaemon: 10. /usr/bin/clickhouse-server() [0x6539bc9]
2019.07.10 08:45:10.934716 [ 46 ] {} <Error> BaseDaemon: 11. /usr/bin/clickhouse-server(ThreadPoolImpl<std::thread>::worker(std::_List_iterator<std::thread>)+0x1af) [0x728b43f]
2019.07.10 08:45:10.934725 [ 46 ] {} <Error> BaseDaemon: 12. /usr/bin/clickhouse-server() [0xb1a1c9f]
2019.07.10 08:45:10.934733 [ 46 ] {} <Error> BaseDaemon: 13. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f29742406db]
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugConfirmed user-visible misbehaviour in official releaseConfirmed user-visible misbehaviour in official releasecomp-formatsInput/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).Input/output formats (CSV/JSON/Parquet/ORC/Arrow/Protobuf/etc.).crashCrash / segfault / abortCrash / segfault / abortst-need-infoWe need extra data to continue (waiting for response). Either some details or a repro of the issue.We need extra data to continue (waiting for response). Either some details or a repro of the issue.