Support IGNORE NULLS for LEAD window function#9419
Conversation
| 3 | ||
| 24 | ||
| 14 | ||
| 78 50 |
There was a problem hiding this comment.
This only test is failing....
There was a problem hiding this comment.
@mustafasrepo setting batch size to extreme low value will affect the window range?
| # result should be same with above, when lag algorithm work with pruned data. | ||
| # decreasing batch size, causes data to be produced in smaller chunks at the source. | ||
| # Hence sliding window algorithm is used during calculations. | ||
| # LEAD function to be added, there is some bug with incoming data array when LEAD is called |
There was a problem hiding this comment.
I'll address this in following PR, what I found is the incoming data is incomplete for LEAD.
so for set datafusion.execution.batch_size = 8000;
the incoming array is complete
[datafusion/physical-expr/src/window/lead_lag.rs:280:17] &array = StringArray
[
"x",
null,
"b",
null,
]
and for set datafusion.execution.batch_size = 1;
the data is partially missed
[datafusion/physical-expr/src/window/lead_lag.rs:280:17] &array = StringArray
[
"x",
null,
"b"
]
Probably the error is in built_in.rs I'll check it in following PR
mustafasrepo
left a comment
There was a problem hiding this comment.
Thanks @comphead for this PR. It is LGTM!. Regarding failing test, you mentioned. The reason for its failure is that BoundedWindowExecutor uses get_range method to calculate required buffer for the result calculation. For most of the window function that uses window frame, get_range is not used. Because required range can be determined from window frame. However, for some builtin window functions window frame is not used such as LEAD, LAG, ROW_NUMBER. For these ones get_range should be implemented to make sure that required buffer is always present for the result. I think bug is related to the get_range implementation for the LEAD. I will file a subsequent PR to solve this problem.
* IGNORE NULLS support for LEAD * fix * fix * fix
Which issue does this PR close?
Closes #.
Related to #9055
Rationale for this change
Expanding IGNORE NULLS support for LEAD window function
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?