ARROW-4464: [Rust] [DataFusion] Add support for LIMIT#3669
ARROW-4464: [Rust] [DataFusion] Add support for LIMIT#3669ntrinquier wants to merge 6 commits intoapache:masterfrom ntrinquier:arrow-4464
Conversation
andygrove
left a comment
There was a problem hiding this comment.
This is looking good so far ... I've left some feedback that should help you out
|
|
||
| let input_schema = input_rel.as_ref().borrow().schema().clone(); | ||
|
|
||
| let compiled_expr = compile_scalar_expr(&self, expr, &input_schema)?; |
There was a problem hiding this comment.
The LIMIT expression will be a literal e.g. LIMIT 10 so I don't think we need to compile and evaluate the expression but just use pattern matching to extract the number.
match expr {
&Expr::Literal(ref value) => /* extract value */,
_ => /* error - only literals are supported */
}
``
| input: Rc<RefCell<Relation>>, | ||
| expr: RuntimeExpr, | ||
| schema: Arc<Schema>, | ||
| num_consumed_rows: i64, |
| input: Rc<RefCell<Relation>>, | ||
| expr: RuntimeExpr, | ||
| schema: Arc<Schema>, | ||
| num_consumed_rows: i64, |
There was a problem hiding this comment.
num_consumed_rows should be usize
| fn next(&mut self) -> Result<Option<RecordBatch>> { | ||
| match self.input.borrow_mut().next()? { | ||
| Some(batch) => { | ||
| match self.expr.get_func()(&batch)? |
There was a problem hiding this comment.
no need to evaluate the expression, use limit field from struct
| } | ||
|
|
||
| //TODO: move into Arrow array_ops | ||
| fn filter(a: &Array, num_rows_to_read: i64) -> Result<ArrayRef> { |
|
Thanks a lot @andygrove I extracted the limit and cast it with |
Codecov Report
@@ Coverage Diff @@
## master #3669 +/- ##
==========================================
- Coverage 87.85% 87.76% -0.09%
==========================================
Files 689 689
Lines 84012 84023 +11
Branches 1081 1081
==========================================
- Hits 73805 73745 -60
- Misses 10094 10163 +69
- Partials 113 115 +2
Continue to review full report at Codecov.
|
| .collect(); | ||
|
|
||
| let limited_batch: RecordBatch = | ||
| RecordBatch::new(Arc::new(Schema::empty()), limited_columns?); |
There was a problem hiding this comment.
The new batch needs the same schema as the original, not an empty schema
There was a problem hiding this comment.
Once this is fixed, I'm ready to approve. Thanks @ntrinquier !
There was a problem hiding this comment.
Indeed that makes more sense. filter is also creating the new batch with an empty schema, so I guess this would have to be changed there too?
There was a problem hiding this comment.
oops, good catch, yes I'll file a bug for that
There was a problem hiding this comment.
No description provided.