Parallel DISTINCT plan of multi-stage. #1173
Conversation
pre discussion: #914
we can import some design from
Hi, I wasn't aware that Postgres supports parallel DISTINCT functionality, although it hasn't been cherry-picked into CBDB yet. Thanks for pointing it out. I reviewed the code, and overall, it aligns with my design approach. Postgres uses a Gather node for a two-phase aggregation, performing the first phase on worker nodes. However, in CBDB, we don't have a Gather node; instead, we place the two-phase aggregation at the top-level node in a distributed setting. Additionally, we must consider data distribution, which requires the introduction of Motion nodes. Even if we were to incorporate Postgres's code, it would still necessitate distributed adaptation, though the plan's outcome would remain consistent with our current implementation. While I'm pleased to see similarities between our designs, I believe there's no immediate need to integrate this part of the code.
my-ship-it
left a comment
LGTM
Upstream Postgres does not support parallel DISTINCT processing, since
distinctness across multiple workers cannot be guaranteed. In MPP
databases, however, we can use Motion to redistribute tuples across
multiple workers within a parallel query.
For a DISTINCT query like:
select distinct a from t_distinct_0;
we can create a parallel plan built on a Parallel Scan of the
underlying table. After the Parallel Scan, the tuples are effectively
distributed randomly across workers, even when the distribution key
matches the target expression.
The pre-distinct node uses a Streaming HashAggregate or HashAggregate
to deduplicate tuples in parallel; the partially deduplicated tuples
are then redistributed according to the DISTINCT expressions, and a
second-stage aggregate completes the DISTINCT operation.
                         QUERY PLAN
------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)
   ->  HashAggregate
         Group Key: a
         ->  Redistribute Motion 6:6  (slice2; segments: 6)
               Hash Key: a
               Hash Module: 3
               ->  Streaming HashAggregate
                     Group Key: a
                     ->  Parallel Seq Scan on t_distinct_0
 Optimizer: Postgres query optimizer
(10 rows)
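For reference, a minimal setup along the following lines should reproduce a plan of this shape. The table layout, row counts, and the enable_parallel setting are illustrative assumptions rather than something specified by this PR; the GUC controlling parallel plans may differ in your build.

-- Sketch only: table definition and settings below are assumptions, not part of this PR.
create table t_distinct_0 (a int, b int) distributed by (a);
insert into t_distinct_0 select i % 1000, i from generate_series(1, 100000) i;
analyze t_distinct_0;
set enable_parallel = on;  -- assumed GUC name for enabling parallel plans in CBDB
explain (costs off)
select distinct a from t_distinct_0;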
Parallel Group Aggregation is also supported:
explain(costs off)
select distinct a, b from t_distinct_0;
                        QUERY PLAN
-----------------------------------------------------------
 GroupAggregate
   Group Key: a, b
   ->  Gather Motion 6:1  (slice1; segments: 6)
         Merge Key: a, b
         ->  GroupAggregate
               Group Key: a, b
               ->  Sort
                     Sort Key: a, b
                     ->  Parallel Seq Scan on t_distinct_0
 Optimizer: Postgres query optimizer
(10 rows)
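The sort-based two-stage plan above shows up when hash aggregation is not chosen. As a rough sketch, one way to steer the planner toward it is to disable hash aggregation with the standard enable_hashagg setting; whether this exact shape is produced still depends on costing.

-- Sketch: discourage the hash-based path; the resulting plan shape still depends on costing.
set enable_hashagg = off;
explain (costs off)
select distinct a, b from t_distinct_0;
reset enable_hashagg;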
Authored-by: Zhang Mingli [email protected]
Performance
For DISTINCT, parallel performance is closely tied to how tuples are distributed among the worker processes, particularly in the case of streaming hash aggregates.
Overall, the test cases show that a parallel plan with two workers achieves nearly half the runtime of the non-parallel plan.
The parallel plans (hashagg, streaming-hashagg, hashagg-groupagg) each use 2 workers.
(Timing results were compared for: non-parallel, hashagg, streaming-hashagg, hashagg-groupagg.)
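A simple way to reproduce such a comparison is to time the same query under each configuration in psql, for example with EXPLAIN ANALYZE. The GUC toggled below is an assumption and may not be the exact knob used for the numbers above.

-- Sketch of a timing comparison in psql; the parallel GUC name is an assumption.
\timing on
set enable_parallel = off;   -- assumed GUC: non-parallel baseline
explain analyze select distinct a from t_distinct_0;
set enable_parallel = on;    -- assumed GUC: parallel two-stage plan
explain analyze select distinct a from t_distinct_0;
\timing off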
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel