refactor EliminateDuplicatedExpr optimizer pass to avoid clone#10218
refactor EliminateDuplicatedExpr optimizer pass to avoid clone#10218alamb merged 5 commits intoapache:mainfrom
EliminateDuplicatedExpr optimizer pass to avoid clone#10218Conversation
|
|
||
| // use this structure to avoid initial clone | ||
| #[derive(Eq, Clone, Debug)] | ||
| struct SortExprWrapper { |
There was a problem hiding this comment.
Wrap the Expr in a Wrapper to support specialized comparison
| .iter() | ||
| .map(|e| match e { | ||
| Expr::Sort(ExprSort { expr, .. }) => { | ||
| Expr::Sort(ExprSort::new(expr.clone(), true, false)) |
There was a problem hiding this comment.
avoid the normalized clone here
| Ok(None) | ||
| } else { | ||
| Ok(Some(LogicalPlan::Sort(Sort { | ||
| expr: dedup_expr.into_iter().cloned().collect::<Vec<_>>(), |
There was a problem hiding this comment.
avoid another clone here
| input: sort.input.clone(), | ||
| fetch: sort.fetch, | ||
| }))) | ||
| let mut index_set = IndexSet::new(); // use index_set instead of Hashset to preserve order |
There was a problem hiding this comment.
use index_set to preserve the original order of sort
There was a problem hiding this comment.
This is also quite clever
I think you can avoid a Vec here if you skip normalized_sort_keys and create unique_exprs directly, as you did below.
Something like
let unique_exprs: Vec<Expr> = sort
.expr
.into_iter()
// use SortExpr wrapper to ignore sort options
.map(|e| SortExprWrapper { expr: e })
.collect::<IndexSet<_>>()
.into_iter()
.map(|wrapper| wrapper.expr)
.collect();3c31bec to
09cd01d
Compare
alamb
left a comment
There was a problem hiding this comment.
Thank you @Lordworms -- this is really nice and quite clever
I have a few suggestions on how to make this PR better but I also think we could do it as a follow on too.
| } | ||
| } | ||
|
|
||
| // use this structure to avoid initial clone |
There was a problem hiding this comment.
| // use this structure to avoid initial clone | |
| /// Wrap the Expr in a Wrapper to support specialized comparison. | |
| /// | |
| /// Ignores the sort options for `SortExpr` because if the expression is the same | |
| /// the subsequent exprs are never matched | |
| /// | |
| /// For example, `ORDER BY a ASC a DESC` is the same | |
| // as `ORDER BY a ASC` (the second `a DESC` is never compared) |
|
|
||
| // use this structure to avoid initial clone | ||
| #[derive(Eq, Clone, Debug)] | ||
| struct SortExprWrapper { |
| input: sort.input.clone(), | ||
| fetch: sort.fetch, | ||
| }))) | ||
| let mut index_set = IndexSet::new(); // use index_set instead of Hashset to preserve order |
There was a problem hiding this comment.
This is also quite clever
I think you can avoid a Vec here if you skip normalized_sort_keys and create unique_exprs directly, as you did below.
Something like
let unique_exprs: Vec<Expr> = sort
.expr
.into_iter()
// use SortExpr wrapper to ignore sort options
.map(|e| SortExprWrapper { expr: e })
.collect::<IndexSet<_>>()
.into_iter()
.map(|wrapper| wrapper.expr)
.collect();EliminateDuplicatedExpr optimizer pass to avoid clone
|
Thanks again @Lordworms -- this looks great |
Which issue does this PR close?
part of #9637
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?