ARROW-16562: Avoid slicing array inputs in ExecBatchIterator that would result in one slice#13138
ARROW-16562: Avoid slicing array inputs in ExecBatchIterator that would result in one slice#13138zagto wants to merge 3 commits intoapache:masterfrom
Conversation
|
|
westonpace
left a comment
There was a problem hiding this comment.
Looks like a straightforward improvement. Is this meant to be in a draft state?
| if (args_[i].kind() == Datum::CHUNKED_ARRAY) { | ||
| const ChunkedArray& carr = *args_[i].chunked_array(); | ||
| batch->values[i] = Datum(carr.chunk(chunk_indexes_[i])->data()); | ||
| chunk_positions_[i] += iteration_size; | ||
| } else { | ||
| batch->values[i] = std::move(args_[i]); | ||
| } |
There was a problem hiding this comment.
| if (args_[i].kind() == Datum::CHUNKED_ARRAY) { | |
| const ChunkedArray& carr = *args_[i].chunked_array(); | |
| batch->values[i] = Datum(carr.chunk(chunk_indexes_[i])->data()); | |
| chunk_positions_[i] += iteration_size; | |
| } else { | |
| batch->values[i] = std::move(args_[i]); | |
| } | |
| if (args_[i].kind() == Datum::CHUNKED_ARRAY) { | |
| chunk_positions_[i] += iteration_size; | |
| } | |
| batch->values[i] = std::move(args_[i]); |
I'm not entirely certain this is correct but if iteration_size == length_ I think that means you are guaranteed that any chunked arrays are a single chunk (or at least, there is only one non-zero size chunk) and so you are consuming it all at once. I'm not even sure you need to update chunk_positions_[i].
There was a problem hiding this comment.
I think you're right about not having to update chunk_positions_[i]. But the suggested change wouldn't work. Moving the chunked array Datum directly into batch->values means it ends up getting passed to the kernel, which usually expects an ARRAY type Datum. The important part why we need a branch here is extracting the ArrayData from the chunked array
|
I suggest abandoning this PR since I'm going to remove ExecBatchIterator soon after #13364 is merged |
Makes sense, I'll close it |
No description provided.