perf: switch *_log tables to Memory engine (attempt to reduce cache misses) #31063
|
Another option is |
…isses)

trace_log/query_log from performance tests show (for cases when the prewarm query fails with a timeout, 15 sec) excessive writeTraceInfo() in trace_log and QueryProfilerRuns in query_log, but this is not the root cause of the timeout, only a consequence.

Also, query_log shows that on failures the following profile events have significantly higher values:
- PerfLocalMemoryMisses (6.3x more)
- PerfLocalMemoryReferences (7x more)
- PerfDataTLBMisses (6.9x more)
- PerfInstructionTLBMisses (6.4x more)

While looking at the performance test logs, I noticed that when the prewarm query failed, the other server (left/right) was merging (MergeTree) something in the *_log tables.

But using MergeTree for *_log in performance tests is useless, since the performance test environment uses a ramdrive anyway. So MergeTree merges just add overhead.

Ultimately, I expect this to reduce extra memory references and therefore cache/TLB misses.

CI: https://clickhouse-test-reports.s3.yandex.net/30886/c504e0c08df7a926bb479a1d297f326f5c48a32f/performance_comparison/report.html#fail1

v2: <partition_by remove="remove"/>
|
@mergify update (an attempt to run perf tests on Intel Xeon Gold CPU) |
✅ Branch has been successfully updated
Hey, I reacted but my real name is @Mergifyio |
|
Here are distribution of average
So the problem with prewarm queries (with the profiler) occurs only on the Gold CPU, while without the profiler the Gold CPU is faster. And even though the patch does not change anything, it is still worth applying, I guess, but the description should be changed. |
trace_log/query_log from performance tests show (for cases when the
prewarm query fails with a timeout, 15 sec) excessive writeTraceInfo()
in trace_log and QueryProfilerRuns in query_log, but this is not the
root cause of the timeout, only a consequence.
Also, query_log shows that on failures the following profile events
have significantly higher values:
- PerfLocalMemoryMisses (6.3x more)
- PerfLocalMemoryReferences (7x more)
- PerfDataTLBMisses (6.9x more)
- PerfInstructionTLBMisses (6.4x more)
While looking at the performance test logs, I noticed that when the
prewarm query failed, the other server (left/right) was merging
(MergeTree) something in the *_log tables.
But using MergeTree for *_log in performance tests is useless, since
the performance test environment uses a ramdrive anyway.
So MergeTree merges just add overhead.
Ultimately, I expect this to reduce extra memory references and
therefore cache/TLB misses.
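
For illustration, a minimal sketch of what such an override could look like in a server config fragment. The file name and the choice of tables shown here are assumptions, not taken from the actual diff; the `<engine>` setting and the `remove="remove"` attribute are standard ClickHouse config features, and `partition_by` is removed because a custom `<engine>` cannot be combined with it.

```xml
<!-- Hypothetical override file, e.g. config.d/perf-logs-memory.xml (the name is an assumption). -->
<!-- Older releases use <yandex> instead of <clickhouse> as the root element. -->
<clickhouse>
    <query_log>
        <!-- Replace the default MergeTree engine so that no background merges run. -->
        <engine>ENGINE = Memory</engine>
        <!-- A custom engine cannot be combined with partition_by, so drop the inherited value. -->
        <partition_by remove="remove"/>
    </query_log>
    <trace_log>
        <engine>ENGINE = Memory</engine>
        <partition_by remove="remove"/>
    </trace_log>
</clickhouse>
```

Memory tables keep their rows only in RAM and lose them on server restart, which is acceptable here only because the performance test environment is short-lived.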
CI: https://clickhouse-test-reports.s3.yandex.net/30886/c504e0c08df7a926bb479a1d297f326f5c48a32f/performance_comparison/report.html#fail1
Changelog category (leave one):
P.S. Marked as draft for now since I want to look at profile events for queries.
Cc: @alexey-milovidov