Remove recursive const check in simplify_const_expr #20234
|
(if any maintainer can add the |
3d7e56b to 0610595
I do agree with you but please also consider deprecating those functions for a release or two and then remove the |
|
Of course, will push that first thing tomorrow morning.
      .transform_data(|node| unwrap_cast_in_comparison(node, schema))?
      .transform_data(|node| {
    -     simplify_const_expr_with_dummy(node, &batch)
    +     const_evaluator::simplify_const_expr_immediate(node, &batch)
|
I plan to merge this PR when it passes CI.
|
run benchmark sql_planner
|
🤖 |
    /// that only contain literals, the batch content is irrelevant.
    ///
    /// This is the same approach used in the logical expression `ConstEvaluator`.
    pub(crate) fn create_dummy_batch() -> Result<RecordBatch> {
While reviewing this I was wondering why we need a dummy batch at all. I created a follow-on PR to create it once and reuse it (which should make expression simplification faster).
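A minimal sketch of the "create it once and reuse it" idea mentioned in this comment, not the code from the follow-on PR. `DummyBatch`, `build_dummy_batch`, and `shared_dummy_batch` are hypothetical names, the struct is a stand-in for an Arrow `RecordBatch`, and the single-row shape is an assumption made for illustration only.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for an Arrow `RecordBatch`.
/// Only the row count is modelled; the real type carries a schema and arrays.
#[derive(Debug, PartialEq)]
pub struct DummyBatch {
    pub num_rows: usize,
}

/// Builds the placeholder batch. Cheap here, but the real construction
/// would allocate a schema and at least one array on every call.
fn build_dummy_batch() -> DummyBatch {
    // Assumption for this sketch: one row is enough to evaluate literals.
    DummyBatch { num_rows: 1 }
}

/// Returns a shared reference to a batch that is built on first use
/// and reused by every later caller, including callers on other threads.
pub fn shared_dummy_batch() -> &'static DummyBatch {
    static BATCH: OnceLock<DummyBatch> = OnceLock::new();
    BATCH.get_or_init(build_dummy_batch)
}
```

Every call to `shared_dummy_batch()` after the first returns the same allocation, so a simplifier that evaluates many literal-only expressions would pay the construction cost once. Whether a process-wide static or a value passed through the simplifier is the better fit is a design choice this sketch does not settle.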
|
🤖: Benchmark completed
|
Which issue does this PR close?
`simplify_const_expr` does unnecessary recursive work #20134.
Rationale for this change
The check for whether a const expression can be simplified was recursive and expensive: it repeatedly re-checked the expression's children.
I've tried other approaches, such as pre-computing the result for all expressions outside of the loop and using that cache during the traversal, but that yielded only a 5-8% improvement while adding complexity. This approach simplifies the code and seems to be more performant in my benchmarks (change is compared to the current main branch):
What changes are included in this PR?
`simplify_const_expr` now only checks itself and whether all of its children are literals, because it assumes the order of simplification is bottom-up.
Are these changes tested?
Existing test suite
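To illustrate why the shallow check described under "What changes are included in this PR?" is sufficient when simplification runs bottom-up, here is a small self-contained sketch. The `Expr` enum and the `simplify` function are hypothetical and much simpler than DataFusion's physical expressions; they only model literals, columns, and addition.

```rust
/// Hypothetical miniature expression tree; not DataFusion's `PhysicalExpr`.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Simplifies bottom-up: children are simplified first, so by the time a
/// node is inspected, any constant subtree below it is already a single
/// `Literal`. The node therefore only looks at its direct children and
/// never walks the subtree again.
pub fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Add(left, right) => {
            let left = simplify(*left);
            let right = simplify(*right);
            match (left, right) {
                // Shallow check: both direct children are literals, so fold.
                (Expr::Literal(a), Expr::Literal(b)) => {
                    Expr::Literal(a.wrapping_add(b))
                }
                // Otherwise rebuild the node from the simplified children.
                (left, right) => Expr::Add(Box::new(left), Box::new(right)),
            }
        }
        // Literals and columns are leaves and are returned unchanged.
        leaf => leaf,
    }
}
```

In this sketch `(1 + 2) + 3` folds to a single literal, because the inner `1 + 2` is already a literal when the outer node is checked, while `(1 + 2) + x` keeps the outer addition, since the column `x` is not a literal.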
Are there any user-facing changes?
I suggest removing some of the physical expression simplification code from the public API, which I believe reduces the maintenance burden here. These changes also help remove code like the distinct `simplify_const_expr` and `simplify_const_expr_with_dummy`.
- Make all `datafusion-physical-expr::simplifier` sub-modules (`not` and `const_evaluator`) private, including their key functions. They are not used externally, and being able to change their behavior seems more valuable long term. The simplifier is also not currently an extension point as far as I can tell, so there's no value in providing atomic building blocks like them for now.
- Remove `has_column_references` completely; it's trivial to re-implement and isn't used anywhere in the codebase.