Description
Basically, the Unnest exec plan could be made faster if we reduced some copies. Here is the basic idea, in case anyone wants to do that:
// Create an array with the unnested values of the list array, given the list
// array:
//
// [1], null, [2, 3, 4], null, [5, 6]
//
// the result array is:
//
// 1, null, 2, 3, 4, null, 5, 6
//
let unnested_array = unnest_array(list_array)?;
This looks very much the same to me as calling list_array.values() to get access to the underlying values: https://docs.rs/arrow/latest/arrow/array/struct.GenericListArray.html#method.values
In this case the values array would be more like
[1, 2, 3, 4, 5, 6]
And the offsets of the list array would be like (I think):
[0, 1, 1, 4, 4, 6]
With a null mask showing the second and fourth elements are null
So I was thinking you could calculate the take indices directly from the offsets / nulls without having to copy all the values out of the underlying array
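To make the idea concrete, here is a minimal sketch in plain Rust, using slices rather than the real arrow types. The function names `take_indices_from_offsets` and `take` are hypothetical stand-ins, not DataFusion or arrow APIs, and the validity mask is modelled as a `&[bool]`. The sketch also assumes that a valid but empty list contributes no output rows, which may not match what the exec plan actually does.

```rust
/// Hypothetical helper: derive take indices for an unnest directly from a
/// list array's offsets and validity mask, without touching the values.
///
/// `offsets` has one more entry than there are list rows; row `i` covers
/// `offsets[i]..offsets[i + 1]` in the values array. `validity[i]` is
/// `false` when row `i` is null.
fn take_indices_from_offsets(offsets: &[i32], validity: &[bool]) -> Vec<Option<usize>> {
    assert_eq!(
        offsets.len(),
        validity.len() + 1,
        "offsets must have exactly one more entry than validity"
    );

    let mut indices = Vec::new();
    for (row, window) in offsets.windows(2).enumerate() {
        if !validity[row] {
            // A null list becomes a single null output row. Check validity
            // first, since the offsets of a null slot are not meaningful.
            indices.push(None);
            continue;
        }
        let (start, end) = (window[0] as usize, window[1] as usize);
        // Assumption: an empty (but valid) list emits nothing.
        indices.extend((start..end).map(Some));
    }
    indices
}

/// Stand-in for a take kernel: gather `values` by index, with `None`
/// indices producing null outputs.
fn take<T: Copy>(values: &[T], indices: &[Option<usize>]) -> Vec<Option<T>> {
    indices.iter().map(|idx| idx.map(|i| values[i])).collect()
}
```

For the example above, calling `take_indices_from_offsets(&[0, 1, 1, 4, 4, 6], &[true, false, true, false, true])` and then `take` on the values `[1, 2, 3, 4, 5, 6]` gives `1, null, 2, 3, 4, null, 5, 6`. In the real plan the same indices could be merged with the take indices for the other columns, so the list values would only be gathered once.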
Originally posted by @alamb in #6903 (comment)