Closed
Labels
parquet: Changes to the parquet crate
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Related to #1775.
While implementing page index skipping (#1792), I found that ChunkReader is currently defined as:
/// The ChunkReader trait generates readers of chunks of a source.
/// For a file system reader, each chunk might contain a clone of File bounded on a given range.
/// For an object store reader, each read can be mapped to a range request.
pub trait ChunkReader: Length + Send + Sync {
    type T: Read + Send;
    /// Get a serially readable slice of the current reader.
    /// This should fail if the slice exceeds the current bounds.
    fn get_read(&self, start: u64, length: usize) -> Result<Self::T>;
}
It assumes that the bytes of a whole column chunk are read as one contiguous range. But consider a layout like this:
* rows col1 col2 col3
* ┌──────┬──────┬──────┐
* 0 │ p0 │ │ │
* ╞══════╡ p0 │ p0 │
* 20 │ p1(X)│------│------│
* ╞══════╪══════╡ │
* 40 │ p2 │ │------│
* ╞══════╡ p1(X)╞══════╡
* 60 │ p3(X)│ │------│
* ╞══════╪══════╡ │
* 80 │ p4 │ │ p1 │
* ╞══════╡ p2 │ │
* 100 │ p5 │ │ │
* └──────┴──────┴──────┘
To read only page 1 and page 3 of col1, we need to skip the other pages, so we should pass two offsets, one per page.
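As a rough illustration of how selected pages could turn into byte ranges, here is a minimal sketch. `PageLoc` and `selected_ranges` are hypothetical names made up for this example and are not part of the parquet crate; the sketch assumes `selected` is sorted ascending and that every index is in bounds.

```rust
/// Simplified stand-in for a page-index entry (hypothetical, not the crate's type).
#[derive(Debug, Clone, Copy)]
pub struct PageLoc {
    /// Byte offset of the page within the file.
    pub offset: u64,
    /// Size of the page on disk, in bytes.
    pub size: usize,
}

/// Returns the (start, length) ranges covering the selected pages.
/// Pages that sit next to each other on disk are merged into one range.
/// Assumes `selected` is sorted ascending; panics on an out-of-bounds index.
pub fn selected_ranges(pages: &[PageLoc], selected: &[usize]) -> Vec<(u64, usize)> {
    let mut ranges: Vec<(u64, usize)> = Vec::new();
    for &idx in selected {
        let page = pages[idx];
        if let Some(last) = ranges.last_mut() {
            // Extend the previous range when this page starts exactly where it ends.
            if last.0 + last.1 as u64 == page.offset {
                last.1 += page.size;
                continue;
            }
        }
        ranges.push((page.offset, page.size));
    }
    ranges
}
```

With five pages laid out back to back, selecting pages 1 and 3 yields two separate ranges, which is exactly the case the current single (start, length) signature cannot express.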
Describe the solution you'd like
Pass multiple starts and lengths:
fn get_read(&self, start: Vec<u64>, length: Vec<usize>) -> Result<Self::T>;
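One possible shape for this, sketched under assumptions: the trait `MultiRangeRead`, the method `get_read_ranges`, and the toy `InMemorySource` below are all hypothetical and only illustrate the idea. The sketch takes a slice of (start, length) pairs instead of two parallel vectors, so the two lists cannot get out of sync, and returns a single reader over the concatenated ranges.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical multi-range variant of `get_read` (not the crate's API):
/// returns one reader over the concatenation of all requested ranges.
pub trait MultiRangeRead {
    type T: Read + Send;
    /// Should fail if any range exceeds the bounds of the source.
    fn get_read_ranges(&self, ranges: &[(u64, usize)]) -> io::Result<Self::T>;
}

/// Toy in-memory source, used only for illustration.
pub struct InMemorySource(pub Vec<u8>);

impl MultiRangeRead for InMemorySource {
    type T = Cursor<Vec<u8>>;

    fn get_read_ranges(&self, ranges: &[(u64, usize)]) -> io::Result<Self::T> {
        let len = self.0.len() as u64;
        let mut out = Vec::with_capacity(ranges.iter().map(|r| r.1).sum());
        for &(start, length) in ranges {
            // Reject ranges that overflow or run past the end of the source.
            let end = start
                .checked_add(length as u64)
                .filter(|&end| end <= len)
                .ok_or_else(|| {
                    io::Error::new(io::ErrorKind::UnexpectedEof, "range exceeds source bounds")
                })?;
            // Both bounds are <= self.0.len(), so the casts cannot truncate.
            out.extend_from_slice(&self.0[start as usize..end as usize]);
        }
        Ok(Cursor::new(out))
    }
}
```

Calling `get_read_ranges(&[(20, 3), (60, 2)])` on such a source would give a reader that yields the three bytes at offset 20 followed by the two bytes at offset 60. For an object store, each pair could instead map to its own range request.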