Avoid creating new RecordBatches to simplify expressions#20534
Avoid creating new RecordBatches to simplify expressions#20534alamb wants to merge 1 commit intoapache:mainfrom
Conversation
| let col = new_null_array(&DataType::Null, 1); | ||
| Ok(RecordBatch::try_new(dummy_schema, vec![col])?) | ||
| pub(crate) fn create_dummy_batch() -> Result<&'static RecordBatch> { | ||
| static DUMMY_BATCH: std::sync::OnceLock<Result<RecordBatch>> = |
There was a problem hiding this comment.
The idea is to compute the result once and then reuse it each time rather than create a new batch and array each time
|
I don't really expect to be able to measure the improvement here but I'll run benchmarks to check |
|
run benchmark sql_planner |
AdamGS
left a comment
There was a problem hiding this comment.
LGTM :) I think I had this at some point, but I couldn't measure a difference, even though it's obviously strictly less work
|
🤖 |
|
🤖: Benchmark completed Details
|
Yeah, I agree I will be surprised if this shows up in the traces. The first benchmark run seems to show improvements but I suspect it is just noise. I will rerun to see if I can reproduce |
|
run benchmark sql_planner |
|
🤖 |
|
🤖: Benchmark completed Details
|
Which issue does this PR close?
simplify_const_expr#20234Rationale for this change
While reviewing #20234 from @AdamGS I wondered why we were creating new
RecordBatches to simplify expressions. I looked into it a bit and I think we can avoid a bunch of small allocations / deallocationsWhat changes are included in this PR?
Are these changes tested?
Yes by CI. I will also run benchmarks on it
Are there any user-facing changes?
No this is entirely internal