Task #9: Implement ResolvePredicateFields function #20
Merged
Conversation
- Added PredicateField struct to hold resolved field information
- Implemented ResolvePredicateFields() helper function
- Resolves field references in predicates to ORC column indices
- Uses OrcSchemaManifest for Arrow-to-ORC column mapping
- Traverses nested field paths (structs only)
- Filters to leaf nodes only (containers don't have statistics)
- Type support check (currently int32/int64 only)
- Returns vector of PredicateField entities

Implementation details:
- Uses compute::FieldsInExpression() to extract field refs
- Uses FieldRef.FindOneOrNone() for schema matching
- Traverses OrcSchemaField tree for nested paths
- Validates field indices and struct types
- PredicateField includes: field_ref, arrow_field_index, orc_column_index, data_type, supports_statistics

Verified: Manual code review following Parquet TestRowGroups pattern (lines 945-960)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
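The traversal described above (walk a nested path through structs only, reject containers, then check type support) can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: `SchemaNode`, `ResolvedField`, `ResolvePath`, and `SupportsStatistics` are hypothetical stand-ins for `OrcSchemaField`, `PredicateField`, and the real helper, and the ORC column indices are assumed to have been assigned already by a manifest.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the Arrow or ORC types.
enum class TypeKind { kInt32, kInt64, kString, kStruct };

struct SchemaNode {
  std::string name;
  TypeKind kind;
  int orc_column_index;              // assumed pre-assigned by a manifest
  std::vector<SchemaNode> children;  // non-empty only for structs
};

struct ResolvedField {
  std::vector<int> path;  // child indices from the top level down
  int orc_column_index;
  TypeKind kind;
  bool supports_statistics;
};

// Mirrors the "currently int32/int64 only" restriction from the description.
inline bool SupportsStatistics(TypeKind kind) {
  return kind == TypeKind::kInt32 || kind == TypeKind::kInt64;
}

// Walks `path` through `top_level`. Returns nullopt when the path is empty,
// an index is out of range, the walk would descend through a non-struct,
// or the path ends on a container (containers carry no statistics).
std::optional<ResolvedField> ResolvePath(const std::vector<SchemaNode>& top_level,
                                         const std::vector<int>& path) {
  if (path.empty()) return std::nullopt;
  const std::vector<SchemaNode>* level = &top_level;
  const SchemaNode* node = nullptr;
  for (std::size_t depth = 0; depth < path.size(); ++depth) {
    // Only a struct may be descended into.
    if (node != nullptr && node->kind != TypeKind::kStruct) return std::nullopt;
    const int index = path[depth];
    if (index < 0 || static_cast<std::size_t>(index) >= level->size()) {
      return std::nullopt;
    }
    node = &(*level)[static_cast<std::size_t>(index)];
    level = &node->children;
  }
  assert(node != nullptr);  // guaranteed because path is non-empty
  if (node->kind == TypeKind::kStruct) return std::nullopt;  // leaf nodes only
  return ResolvedField{path, node->orc_column_index, node->kind,
                       SupportsStatistics(node->kind)};
}
```

Note that in this sketch a leaf of an unsupported type (for example a string) still resolves, but with `supports_statistics` set to false, so a caller can decide to skip it rather than treat it as an error.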
cbb330
added a commit
that referenced
this pull request
Feb 20, 2026
cbb330
added a commit
that referenced
this pull request
Feb 20, 2026
Implemented comprehensive test suite for ORC predicate pushdown covering:
- Equality predicates (=, !=)
- Comparison predicates (<, <=, >, >=)
- Compound predicates (AND, OR)
- Special cases (literal true/false)
- Out-of-bounds filters
- Both int32 and int64 types

Tests verify that FilterStripes correctly evaluates predicates against stripe statistics and skips irrelevant stripes. Each test validates that the correct number of rows is returned after filtering.

Uses OrcTestFileGenerator to create test files with controlled value ranges per stripe, enabling precise verification of stripe filtering behavior.

Verified: All predicates tested against 5-stripe files where each stripe contains distinct value ranges ([0-99], [100-199], etc.)

Co-authored-by: Claude Sonnet 4.5 <[email protected]>
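To illustrate the kind of check these tests exercise, here is a minimal sketch of min/max stripe pruning for a single comparison against an integer literal. It is not the `FilterStripes` implementation from this PR: `StripeStats`, `CompareOp`, `MayMatch`, and `SelectStripes` are hypothetical names, and the sketch assumes every stripe has valid min/max statistics and no nulls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-stripe statistics for one integer column.
struct StripeStats {
  std::int64_t min;
  std::int64_t max;
};

enum class CompareOp { kEq, kNe, kLt, kLe, kGt, kGe };

// Returns true when some value in [min, max] could satisfy `value op literal`.
// A false result means the stripe can be skipped safely.
bool MayMatch(const StripeStats& stats, CompareOp op, std::int64_t literal) {
  assert(stats.min <= stats.max);
  switch (op) {
    case CompareOp::kEq: return stats.min <= literal && literal <= stats.max;
    // != can only be ruled out when every value equals the literal.
    case CompareOp::kNe: return !(stats.min == literal && stats.max == literal);
    case CompareOp::kLt: return stats.min < literal;
    case CompareOp::kLe: return stats.min <= literal;
    case CompareOp::kGt: return stats.max > literal;
    case CompareOp::kGe: return stats.max >= literal;
  }
  return true;  // conservative default: keep the stripe
}

// Collects the indices of stripes that cannot be ruled out by statistics.
std::vector<std::size_t> SelectStripes(const std::vector<StripeStats>& stripes,
                                       CompareOp op, std::int64_t literal) {
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < stripes.size(); ++i) {
    if (MayMatch(stripes[i], op, literal)) selected.push_back(i);
  }
  return selected;
}
```

With five stripes covering [0-99] through [400-499], an equality filter on 150 keeps only the second stripe, while an out-of-bounds filter such as `>= 1000` keeps none. Statistics can only prove that a stripe is irrelevant, so any uncertain case keeps the stripe.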
cbb330
added a commit
that referenced
this pull request
Feb 20, 2026
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format? or See also:
cbb330
added a commit
that referenced
this pull request
Feb 24, 2026
cbb330
added a commit
that referenced
this pull request
Feb 24, 2026
cbb330
added a commit
that referenced
this pull request
Feb 24, 2026
Summary
Implements field resolution for predicate pushdown, mapping Arrow fields to ORC columns.
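One reason a mapping step is needed at all: ORC numbers the columns of its type tree in pre-order, with the root struct as column 0, so a top-level Arrow field index generally differs from its ORC column index once nested types are present. The sketch below shows that numbering on a hypothetical `TypeNode` tree; it is an illustration only and does not use `OrcSchemaManifest` or any ORC library API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical type-tree node; not an ORC or Arrow class.
struct TypeNode {
  std::string name;
  std::vector<TypeNode> children;
  int column_id = -1;  // filled in by AssignColumnIds
};

// Assigns ids in pre-order (parent before children, children left to right).
// Returns the next unused id, which equals the total column count when
// called on the root with the default starting id.
int AssignColumnIds(TypeNode& node, int next_id = 0) {
  assert(next_id >= 0);
  node.column_id = next_id++;
  for (TypeNode& child : node.children) {
    next_id = AssignColumnIds(child, next_id);
  }
  return next_id;
}
```

For a root struct with fields `id`, `point {x, y}`, and `tag`, the ids come out as root=0, id=1, point=2, x=3, y=4, tag=5. The third top-level field (Arrow index 2) therefore lands on ORC column 5.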
Changes
Implementation Details
Testing
Manual code review following Parquet reference (lines 945-960)
Task Reference
Completes Task #9 from task_list.json
Depends on: Tasks #3, #7 (both complete)
Enables: Task #10 (DeriveFieldGuarantee)