fix: filter pushdown when merge filter#20110
Conversation
|
@haohuaijin please ping me when this is ready for review |
|
cc @adriangb, ready for reviews |
| let unsupported_parent_filters = | ||
| child_pushdown_result.parent_filters.iter().filter_map(|f| { | ||
| matches!(f.all(), PushedDown::No).then_some(Arc::clone(&f.filter)) | ||
| }); | ||
| let mut unsupported_parent_filters: Vec<Arc<dyn PhysicalExpr>> = | ||
| child_pushdown_result | ||
| .parent_filters | ||
| .iter() | ||
| .filter_map(|f| { | ||
| matches!(f.all(), PushedDown::No).then_some(Arc::clone(&f.filter)) | ||
| }) | ||
| .collect(); |
There was a problem hiding this comment.
Note: this is the same code, just added the mut which caused a reformat.
| // If this FilterExec has a projection, the unsupported parent filters | ||
| // are in the output schema (after projection) coordinates. We need to | ||
| // remap them to the input schema coordinates before combining with self filters. | ||
| if self.projection.is_some() { |
There was a problem hiding this comment.
👍🏻 note that this is because filters get evaluated before the projection
|
|
||
| /// related to https://github.com/apache/datafusion/issues/20109 | ||
| #[tokio::test] | ||
| async fn test_filter_with_projection_pushdown() { |
There was a problem hiding this comment.
Any chance you could add an SLT test that reproduces this? It could go in datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt
Co-authored-by: Adrian Garcia Badaracco <[email protected]>
|
Thanks for you reviews @adriangb, i try add a SLT test, but use sql is hard to do. |
|
No worries. Let's merge this once CI passes. |
|
🚀 |
|
Hi @alamb, Is it possible to release version 52.2.0? This issue has affected our online business, and we have temporarily disabled this feature. It would be great if a minor version could be released quickly. |
|
@hengfeiyang we can certainly put this on the docket for the next minor release, but we don’t tend to do more than 1 minor release per major version, so this may have to wait until the next major version if there aren’t many other fixes. Would you be able to point your system at a git commit or fork with this commit cherry picked? That’s what we do |
|
Yes. definitely we can do that, actually we did that in the past if we can't get a release soon. just we don't like to use fork version if we can have official release. Thank you for you reply. we will use fork version before next release. |
I am happy to help support another release (run the voting, etc) if someone else is willing do make the tickets, backport and changelog PRs |
|
(BTW I think I may also have a bug in 52 that I'll file tomorrow which might be good to fix too). I'll post here if I do |
## Which issue does this PR close? - Closes apache#20109 ## Rationale for this change see issue apache#20109 ## What changes are included in this PR? 1. Remap parent filter expressions: When a FilterExec has a projection, remap unsupported parent filter expressions from output schema coordinates to input schema coordinates using `reassign_expr_columns()` before combining them with the current filter's predicates. 2. Preserve projection: When creating the merged FilterExec, preserve the original projection instead of discarding it . ## Are these changes tested? yes, add some test case ## Are there any user-facing changes? --------- Co-authored-by: Adrian Garcia Badaracco <[email protected]>
Update: Filed a ticket to track a next patch release |
## Which issue does this PR close? - Closes apache#20109 ## Rationale for this change see issue apache#20109 ## What changes are included in this PR? 1. Remap parent filter expressions: When a FilterExec has a projection, remap unsupported parent filter expressions from output schema coordinates to input schema coordinates using `reassign_expr_columns()` before combining them with the current filter's predicates. 2. Preserve projection: When creating the merged FilterExec, preserve the original projection instead of discarding it . ## Are these changes tested? yes, add some test case ## Are there any user-facing changes? --------- Co-authored-by: Adrian Garcia Badaracco <[email protected]>
## Which issue does this PR close? - related to #20287 ## Rationale for this change see issue #20109 ## What changes are included in this PR? 1. Remap parent filter expressions: When a FilterExec has a projection, remap unsupported parent filter expressions from output schema coordinates to input schema coordinates using `reassign_expr_columns()` before combining them with the current filter's predicates. 2. Preserve projection: When creating the merged FilterExec, preserve the original projection instead of discarding it . ## Are these changes tested? yes, add some test case ## Are there any user-facing changes? --------- ## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> - Closes #. ## Rationale for this change <!-- Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed. Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes. --> ## What changes are included in this PR? <!-- There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR. --> ## Are these changes tested? <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ## Are there any user-facing changes? <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please add the `api change` label. --> --------- Co-authored-by: Adrian Garcia Badaracco <[email protected]>
Which issue does this PR close?
Rationale for this change
see issue #20109
What changes are included in this PR?
Remap parent filter expressions: When a FilterExec has a projection, remap unsupported parent filter expressions from output schema coordinates to input schema coordinates using
reassign_expr_columns()before combining them with the current filter's predicates.Preserve projection: When creating the merged FilterExec, preserve the original projection instead of discarding it .
Are these changes tested?
yes, add some test case
Are there any user-facing changes?