Split metadata tables into separate modules#872
Conversation
10a3d48 to
a6017ce
Compare
a6017ce to
e8cabed
Compare
e8cabed to
be77803
Compare
| /// - <https://iceberg.apache.org/docs/latest/spark-queries/#querying-with-sql> | ||
| /// - <https://py.iceberg.apache.org/api/#inspecting-tables> | ||
| #[derive(Debug)] | ||
| pub struct MetadataTable(Table); |
There was a problem hiding this comment.
I don't quite understand why we need this data struct here. It seems just a wrapper to provide more api, while just like this:
impl Table {
pub fn snapshots(&self) -> SnapshotsTable {
...
}
pub fn manifests(&self) -> ManifestsTable {
...
}
}
There was a problem hiding this comment.
I don't quite understand why we need this data struct here. It seems just a wrapper to provide more api, while just like this:
Yes, I intend to make the API exposed at Table more organized. For example, users will have:
table.metadata_table().snapshots();
table.metadata_table().manifests();instead of:
// Could be confused with `table.metadata().snapshots()` which returns `Snapshot`.
table.snapshots();
// Verbose and long
table.metadata_snapshots_table();
table.metadata_manifests_table();While I believe we could use better API names, such as table.inspect().snapshots(), the overall structure looks good to me and aligns better with other implementations.
There was a problem hiding this comment.
Sounds reasonable to me.
There was a problem hiding this comment.
The metadata_table to inspect rename is here: #881
| } | ||
|
|
||
| /// Returns the schema of the snapshots table. | ||
| pub fn schema(&self) -> Schema { |
There was a problem hiding this comment.
Same question as #868 . But we could defer this to later issue.
| } | ||
|
|
||
| /// Scans the snapshots table. | ||
| pub fn scan(&self) -> Result<RecordBatch> { |
liurenjie1024
left a comment
There was a problem hiding this comment.
Thanks @rshkv for this pr, LGTM!
|
I'll merge this first as it's a pure refactoring. |
Split metadata tables into separate modules.
Context for this is to address #863 (comment) where the point was made that
metadata_scan.rswill grow unwieldy if we shove all metadata table implementations in there. Especially as we're going to add extra utilities for those metadata tables.The structure in this PR is:
In the future this can expand as described in #863 (comment).