Dataframe v2: new and improved chunk tools #7649
/// WARNING: the returned chunk has the same old [`crate::ChunkId`]! Change it with [`Self::with_id`].
#[must_use]
#[inline]
pub fn components_removed(self) -> Self {
chunk.without_components() reads more intuitively to me but I don't feel strongly
I'm trying (hard) to keep to the seemingly de-facto arrow standard of using past participles (I think that's what they're called?) for methods that take ownership, filter and return a new one.
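To make the naming convention concrete, here is a minimal sketch of the pattern being discussed: a past-participle method that takes `self` by value and returns the modified value, chained with `with_id`. The `Chunk` struct below is a hypothetical stand-in (a numeric ID plus a map of columns), not the real `re_chunk` type, and `new`, `with_component`, `id` and `num_components` are helpers invented for illustration. It also assumes that `components_removed` drops every component column.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a chunk: an ID plus named component columns.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    id: u64,
    components: BTreeMap<String, Vec<Option<i64>>>,
}

impl Chunk {
    /// Creates an empty chunk (illustrative helper, not from the PR).
    pub fn new(id: u64) -> Self {
        Self { id, components: BTreeMap::new() }
    }

    /// Adds a component column, builder-style (illustrative helper).
    #[must_use]
    pub fn with_component(mut self, name: &str, values: Vec<Option<i64>>) -> Self {
        self.components.insert(name.to_owned(), values);
        self
    }

    /// Takes ownership, removes all component columns, and returns the result.
    /// The ID is left unchanged, mirroring the WARNING in the diff above.
    #[must_use]
    #[inline]
    pub fn components_removed(mut self) -> Self {
        self.components.clear();
        self
    }

    /// Replaces the ID, so a filtered chunk can be told apart from its source.
    #[must_use]
    #[inline]
    pub fn with_id(mut self, id: u64) -> Self {
        self.id = id;
        self
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    pub fn num_components(&self) -> usize {
        self.components.len()
    }
}
```

A call site would then read as a chain, for example `chunk.components_removed().with_id(new_id)`, which is where the past-participle form arguably pays off: each step names the state of the value it returns.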
/// Applies a [take] kernel to the [`Chunk`] as a whole.
///
/// In release builds, indices are allowed to have null entries (they will be taken as `null`s).
What are the situations that cause us to query with null indices? Seems like returning a ChunkResult here and always making that an error condition would be preferable.
We don't, but this is technically part of the public Rust API, so I don't want to punish end users trying to do something that is perfectly valid and apparently well accepted in the broader ecosystem (whether it's panics or results, they're both extremely annoying in these filter chains).
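For readers unfamiliar with the behavior under discussion, the following is a rough sketch of "take" semantics with nullable indices, written against plain slices rather than Arrow arrays. The function name `take_nullable` is hypothetical and this is not the kernel used in the PR; it only illustrates that a null index yields a null output slot instead of an error. As an assumption of this sketch, an out-of-bounds index panics, like ordinary slice indexing.

```rust
/// Gathers `values[i]` for every index in `indices`.
///
/// A null index (`None`) produces a null output (`None`) rather than an error,
/// so the function can sit in a chain without `Result` handling.
///
/// Panics if a non-null index is out of bounds (an assumption of this sketch).
pub fn take_nullable<T: Clone>(values: &[Option<T>], indices: &[Option<usize>]) -> Vec<Option<T>> {
    indices
        .iter()
        .map(|idx| match idx {
            // Null index: propagate a null into the output.
            None => None,
            // Valid index: copy the (possibly null) value at that position.
            Some(i) => values[*i].clone(),
        })
        .collect()
}
```

The output always has the same length as `indices`, and a null can come either from a null index or from a null value at a valid index.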
Support clear semantics in the dataframe API.

Tombstones are never visible to end-users, only their effect.

Like every other Dataframe v2 feature PR, and following recommendations from @jleibs, this prioritizes convenience of implementation over everything else, for now. All clear chunks are fetched, post-processed, and re-injected into the view contents during init(), and then the streaming join runs as usual after that.

Static clear semantics can get pretty unhinged, but that's A) not specific to the dataframe API and B) so extremely niche that our time is better spent on real-world problems right now:
- #7650
- #7631

---

- Fixes #7495
- Fixes #7414
- Fixes #7468
- Fixes #7493
- DNM: requires #7649
Bunch of improvements and/or additions to the Chunk toolbox that happened as part of the implementation of the dataframe v2 API.
Checklist
- main build: rerun.io/viewer
- nightly build: rerun.io/viewer
- CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.