Multithreading after window functions#50771
Conversation
…s to allow parallel stream processing
…ms-window-function
|
This is an automated comment for commit ac07b34 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
…ms-window-function' into feat-preserve-num-streams-window-function
|
also would be good to add perf tests |
|
01568_window_functions_distributed - I suppose we need similar logic to what we have for aggregation, when remote nodes pass buckets to initiator in order through a single stream. |
…between WindowTransforms (and WindowTransform works on single stream anyway).
|
@frinkr a few tests left to fix. take a look pls |
…s to allow parallel stream processing
…between WindowTransforms (and WindowTransform works on single stream anyway).
…m:frinkr/ClickHouse into feat-preserve-num-streams-window-function
|
@nickitat I didn't see how the performance failures (zeros_mt, hasAll, etc.) related to the PR. |
|
|
||
| void WindowStep::transformPipeline(QueryPipelineBuilder & pipeline, const BuildQueryPipelineSettings &) | ||
| { | ||
| auto num_streams = pipeline.getNumThreads(); |
There was a problem hiding this comment.
getNumStreams() return 1 if there is a ORDER BY in the window,
There was a problem hiding this comment.
so, we don't preserve the number of stream, but rather always fan out after the WindowStep
There was a problem hiding this comment.
then I think we should name the setting accordingly
|
pls create a perf test with the queries you've provided in the description |
|
@nickitat a new perf test 'window_functions_downstream_multithreading' was added, but the CI reports run errors. How do I know the error details? I run the perf with command |
|
LGTM. @nickitat How is it going on? |
|
apart from the setting name lgtm |
|
How about |
I'm ok with |
…y_plan_enable_multithreading_after_window_functions
|
@nickitat the setting was renamed. |
|
Ok, let's fix clang-tidy: and it will be ready for merging. |
|
@alexey-milovidov the clang-tidy was fixed. |
* [GLUTEN-1632][CH]Daily Update Clickhouse Version (20231028) * fix build due to ClickHouse/ClickHouse#50771 --------- Co-authored-by: kyligence-git <[email protected]> Co-authored-by: Chang Chen <[email protected]>
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add option
query_plan_preserve_num_streams_after_window_functionsto preserve the number of streams after evaluating window functions to allow parallel stream processing.Detail
The
WindowTransformneeds to merge streams into one, but the new option will split the result fromWindowTransformto multiple streams, which enables the parallel processing of down-streams functions/aggregates.Benchmark
Test data
Test queries
Test Env
Test Result