Avoid unnecessary branching in row read/write if schema is null-free #1891
alamb merged 3 commits into apache:master
        &[]
    } else {
        let start = self.base_offset;
        &self.data[start..start + self.null_width]
If `null_width` is always zero, I wonder whether the check for `self.null_free` is needed?
This is for the code path that is not `null_free`. Actually, this method shouldn't be called at all when tuples are null-free.
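To make the exchange above concrete, here is a minimal sketch of a reader that carries a `null_free` flag. The type `RowReaderSketch`, its constructor, and `is_valid_at` are hypothetical names for illustration, and the layout (null bits first, one bit per field) is an assumption rather than the PR's actual code. Only the field names `data`, `base_offset`, `null_width`, and `null_free` come from the diff.

```rust
/// Hypothetical, simplified reader for illustration only; not the DataFusion type.
pub struct RowReaderSketch<'a> {
    data: &'a [u8],
    base_offset: usize,
    null_width: usize,
    null_free: bool,
}

impl<'a> RowReaderSketch<'a> {
    /// Assumed layout: a bit set of ceil(field_count / 8) bytes at the start
    /// of the row, omitted entirely when the schema is null-free.
    pub fn new(data: &'a [u8], base_offset: usize, field_count: usize, null_free: bool) -> Self {
        let null_width = if null_free { 0 } else { (field_count + 7) / 8 };
        Self { data, base_offset, null_width, null_free }
    }

    /// Mirrors the shape of the accessor in the diff: empty slice when null-free.
    pub fn null_bits(&self) -> &[u8] {
        if self.null_free {
            &[]
        } else {
            let start = self.base_offset;
            &self.data[start..start + self.null_width]
        }
    }

    /// A null-free reader answers without touching the bit set at all.
    pub fn is_valid_at(&self, idx: usize) -> bool {
        if self.null_free {
            return true;
        }
        let bits = self.null_bits();
        bits[idx / 8] & (1 << (idx % 8)) != 0
    }
}
```

In this sketch a null-free reader returns from `is_valid_at` before `null_bits` is ever consulted, which is the behavior the reply describes.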
    use arrow::datatypes::{DataType, Schema};
    use arrow::record_batch::RecordBatch;
    use arrow::util::bit_util::{ceil, round_upto_power_of_2, set_bit_raw, unset_bit_raw};
    #[cfg(feature = "jit")]
I think over time it would be good to start encapsulating the JIT'd code more (that is, reduce the number of `#[cfg(feature = "jit")]` annotations), perhaps by defining a common interface for creating JIT and non-JIT versions. As I am interested in getting more involved in this project, I would be happy to try to do so (or do it as part of a larger body of work).
Ah, that would be great! Thanks for the offer.
I'll see what I can do over the next day or two.
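One possible shape for the common interface suggested in this thread is sketched below. Everything here is hypothetical: the trait `RowAssembler`, the two implementing types, and the factory `make_assembler` are invented for illustration and are not part of DataFusion. The JIT variant is a placeholder that performs the same write as the interpreted one. The point is only that the `#[cfg(feature = "jit")]` gates end up concentrated at the type definition and the factory, while callers see a single trait.

```rust
/// Hypothetical common interface; callers never mention the "jit" feature.
pub trait RowAssembler {
    fn name(&self) -> &'static str;
    fn write_i64(&self, buf: &mut Vec<u8>, value: i64);
}

/// Always-available implementation.
pub struct InterpretedAssembler;

impl RowAssembler for InterpretedAssembler {
    fn name(&self) -> &'static str {
        "interpreted"
    }
    fn write_i64(&self, buf: &mut Vec<u8>, value: i64) {
        buf.extend_from_slice(&value.to_le_bytes());
    }
}

/// Placeholder for a JIT-backed implementation (assumption: a real one would
/// call generated code; this stand-in just performs the same write).
#[cfg(feature = "jit")]
pub struct JitAssembler;

#[cfg(feature = "jit")]
impl RowAssembler for JitAssembler {
    fn name(&self) -> &'static str {
        "jit"
    }
    fn write_i64(&self, buf: &mut Vec<u8>, value: i64) {
        buf.extend_from_slice(&value.to_le_bytes());
    }
}

/// The factory is the only other place where the feature gate appears.
#[cfg(feature = "jit")]
pub fn make_assembler() -> Box<dyn RowAssembler> {
    Box::new(JitAssembler)
}

#[cfg(not(feature = "jit"))]
pub fn make_assembler() -> Box<dyn RowAssembler> {
    Box::new(InterpretedAssembler)
}
```

A caller would write `let a = make_assembler(); a.write_i64(&mut buf, 1);` regardless of how the crate was built. Dynamic dispatch through `Box<dyn ...>` is just one option; a type alias selected by `cfg` would avoid the indirection at the cost of a slightly less uniform interface.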
Which issue does this PR close?
Part of #1861
Rationale for this change
When a row is null-free according to its schema, we can omit the null bit set from the row representation and eliminate unnecessary branching during reads and writes, which saves space and improves performance.
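As a rough illustration of the rationale, the sketch below writes rows of `i64` fields in two ways. The function names and the layout (a leading bit set of one bit per field, followed by fixed 8-byte little-endian values) are assumptions made for this example and do not reproduce the actual row format. The nullable writer branches on every field and pays for the bit set; the null-free writer does neither.

```rust
/// Width of the null bit set in bytes (assumed: one bit per field, rounded up).
pub fn null_width(field_count: usize, null_free: bool) -> usize {
    if null_free { 0 } else { (field_count + 7) / 8 }
}

/// Nullable path: one branch per field, plus the bit set at the row start.
pub fn write_nullable_row(values: &[Option<i64>]) -> Vec<u8> {
    let nw = null_width(values.len(), false);
    let mut row = vec![0u8; nw + values.len() * 8];
    for (i, v) in values.iter().enumerate() {
        if let Some(x) = v {
            // Mark the field as valid, then store its value.
            row[i / 8] |= 1 << (i % 8);
            let start = nw + i * 8;
            row[start..start + 8].copy_from_slice(&x.to_le_bytes());
        }
    }
    row
}

/// Null-free path: no bit set and no per-field validity branch.
pub fn write_null_free_row(values: &[i64]) -> Vec<u8> {
    let mut row = Vec::with_capacity(values.len() * 8);
    for x in values {
        row.extend_from_slice(&x.to_le_bytes());
    }
    row
}
```

For three fields, the nullable row in this sketch occupies 25 bytes (1 byte of null bits plus 24 bytes of values), while the null-free row occupies 24.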
Are there any user-facing changes?
No.