class AsyncLogMessageQueue
{
    /// Maximum size of the queue, to prevent memory overflow
    static constexpr size_t max_size = 10'000;
Hm, I think this can be too small (not in terms of the number of rows, but in terms of memory: it would be OK to use 100 MiB for the logs queue, while 10K messages is likely only around 10 MiB). Maybe we should make this configurable?
Also, since this can eat memory, let's add some metrics for the number of messages in the queue.
Also, we need an event for messages dropped due to queue overflow.
Yes, all 3 make sense.
- Make the limit configurable.
- Add a metric with the number of pending messages in each queue (although naming might be hard; need to think about how best to handle it).
- Add a profile event for dropped messages, ideally also per queue.
I'll look into it.
Aside from that, there is another improvement I didn't make because I didn't expect it to be necessary: the flushing thread could flush multiple messages at once (up to a certain limit). So instead of taking one message and then flushing, we could take up to 20 messages if available and then flush the stream once. At least for the console and disk channels this should be faster.
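The batched-flush idea above could be sketched roughly as follows. This is a hypothetical illustration, not the actual ClickHouse code; the class and method names (`BatchingQueue`, `popBatch`) are made up for the example. The consumer blocks until at least one message is pending, then drains up to a batch limit under a single lock, so the sink can write the whole batch and flush once.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

/// Hypothetical sketch of a queue whose consumer drains messages in batches,
/// so the flushing thread can flush the output stream once per batch instead
/// of once per message.
class BatchingQueue
{
public:
    void push(std::string message)
    {
        {
            std::lock_guard lock(mutex);
            queue.push_back(std::move(message));
        }
        cv.notify_one();
    }

    /// Blocks until at least one message is available, then drains up to
    /// batch_limit messages in one go under a single lock acquisition.
    std::vector<std::string> popBatch(size_t batch_limit)
    {
        std::unique_lock lock(mutex);
        cv.wait(lock, [this] { return !queue.empty(); });
        std::vector<std::string> batch;
        while (!queue.empty() && batch.size() < batch_limit)
        {
            batch.push_back(std::move(queue.front()));
            queue.pop_front();
        }
        return batch;
    }

private:
    std::mutex mutex;
    std::condition_variable cv;
    std::deque<std::string> queue;
};
```

The flushing thread would then do something like: take a batch with `popBatch(20)`, write every message to the channel, and call the stream's flush once at the end of the batch.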
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Async logs: Limit the max number of entries that are held in the queue
Note that each channel has its own queue, so a message might be dropped for syslog or console (because they are the slowest ones) but still make it to text files or system.text_log. The 10k limit is arbitrary.
I replaced Poco's NotificationQueue with our own simplified queue. I also tested several lock-free queues, with bad results: either they were slower, had TSAN issues, or did not support blocking on pop(), which meant we'd need to busy-wait. I also tried replacing the condition_variable with an std::atomic_bool. It appeared to be slightly faster on the producer side, but it consumed way more CPU in the consumer, so I ended up discarding it.
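The design described here (a simple mutex + condition_variable queue that drops and counts messages once `max_size` is reached, with a blocking `pop()`) could look roughly like the sketch below. This is an illustrative reconstruction under the assumptions stated in the PR text, not the actual implementation; the class name `BoundedLogQueue` and its members are hypothetical.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>

/// Hypothetical sketch of a bounded log-message queue: a plain mutex plus
/// condition_variable, where messages are dropped (and counted) once the
/// queue reaches max_size, and pop() blocks until a message is available.
class BoundedLogQueue
{
public:
    explicit BoundedLogQueue(size_t max_size_) : max_size(max_size_) {}

    /// Returns false (and counts a drop) if the queue is full.
    bool push(std::string message)
    {
        {
            std::lock_guard lock(mutex);
            if (queue.size() >= max_size)
            {
                ++dropped;
                return false;
            }
            queue.push_back(std::move(message));
        }
        cv.notify_one();
        return true;
    }

    /// Blocks until a message is available, then returns it.
    std::string pop()
    {
        std::unique_lock lock(mutex);
        cv.wait(lock, [this] { return !queue.empty(); });
        std::string message = std::move(queue.front());
        queue.pop_front();
        return message;
    }

    /// Number of messages dropped due to overflow (would back a profile event).
    size_t droppedCount() const
    {
        std::lock_guard lock(mutex);
        return dropped;
    }

private:
    const size_t max_size;
    mutable std::mutex mutex;
    std::condition_variable cv;
    std::deque<std::string> queue;
    size_t dropped = 0;
};
```

The blocking `pop()` is what the lock-free candidates lacked: without it the consumer would have to busy-wait, which is exactly the CPU cost observed with the `std::atomic_bool` variant.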
Documentation entry for user-facing changes