Improve page index metadata loading in `SerializedFileReader::new_with_options`

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
As @etseidl pointed out in https://github.com/apache/arrow-rs/pull/6466/files#r1778966728:

> This should be a bit more efficient since read_page_indexes will fetch the necessary bytes from the file in a single read, rather than 2 reads per row group.

We can use the new `ParquetMetaDataReader` API to read the page indexes more efficiently (for example, with fewer IOs).
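To illustrate the idea in the quote above, here is a minimal sketch of coalescing the per-row-group index reads into one read. It uses only the standard library and is not the arrow-rs implementation: `IndexRanges`, `covering_range`, and `read_indexes_once` are hypothetical names made up for this example, and the byte ranges stand in for the column index and offset index locations recorded in the file metadata.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Hypothetical stand-in for one row group's page index locations:
/// the byte ranges of its column index and offset index within the file.
pub struct IndexRanges {
    pub column_index: Range<u64>,
    pub offset_index: Range<u64>,
}

/// Smallest single range covering every non-empty index range,
/// or `None` if no row group has any index bytes.
pub fn covering_range(groups: &[IndexRanges]) -> Option<Range<u64>> {
    groups
        .iter()
        .flat_map(|g| [g.column_index.clone(), g.offset_index.clone()])
        .filter(|r| !r.is_empty())
        .fold(None, |acc: Option<Range<u64>>, r| match acc {
            None => Some(r),
            Some(a) => Some(a.start.min(r.start)..a.end.max(r.end)),
        })
}

/// Fetch all index bytes with one seek and one read, then slice the buffer
/// per row group. Returns `(column_index_bytes, offset_index_bytes)` pairs.
pub fn read_indexes_once<R: Read + Seek>(
    reader: &mut R,
    groups: &[IndexRanges],
) -> io::Result<Vec<(Vec<u8>, Vec<u8>)>> {
    let cover = match covering_range(groups) {
        Some(cover) => cover,
        // Nothing to read: every row group gets empty index bytes.
        None => return Ok(groups.iter().map(|_| (Vec::new(), Vec::new())).collect()),
    };

    // The single IO: one contiguous read spanning all index ranges.
    let mut buf = vec![0u8; (cover.end - cover.start) as usize];
    reader.seek(SeekFrom::Start(cover.start))?;
    reader.read_exact(&mut buf)?;

    // Copy each range out of the shared buffer, relative to the cover start.
    let slice = |r: &Range<u64>| -> Vec<u8> {
        if r.is_empty() {
            return Vec::new();
        }
        buf[(r.start - cover.start) as usize..(r.end - cover.start) as usize].to_vec()
    };

    Ok(groups
        .iter()
        .map(|g| (slice(&g.column_index), slice(&g.offset_index)))
        .collect())
}
```

With N row groups, the per-row-group approach issues 2N reads, while this sketch issues one. The trade-off is that any gap between index ranges is fetched too, so it pays off when the ranges sit close together.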

However, when I tried to implement it, we caught what appears to be a subtle bug: the predicates would have been ignored (https://github.com/apache/arrow-rs/pull/6466/files#r1783526090), yet no tests failed. This suggests the interaction between reader predicates and page index loading is not covered by tests.

**Describe the solution you'd like**

I would like to: 
1. Reduce the number of IOs needed to read page indexes in `SerializedFileReader::new_with_options`, and clean up the code to use the new `ParquetMetaDataReader`
2. Add test coverage for using reader predicates together with page index loading

**Describe alternatives you've considered**
Leave the code as is.

**Additional context**

Improve page index metadata loading in `SerializedFileReader::new_with_options` #6491

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Improve page index metadata loading in SerializedFileReader::new_with_options #6491

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Improve page index metadata loading in `SerializedFileReader::new_with_options` #6491