Dataframe: track and exclude fully empty columns #7615
Closed
Labels: feat-dataframe-api (Everything related to the dataframe API)
Description
Context:
The Transform3D archetype logs empty arrays for every component in Transform3D. Because of the combinatorial nature of Transform3D, for most users several of these columns are empty across all rows. This has performance implications (see the need for tracking empty transform components in #7300).
Proposal:
- As chunks are inserted into the store, track whether that column contains any data other than NULL or [].
- Add a field to ComponentColumnDescriptor, such as is_empty, indicating whether the column is fully empty. Like is_static, this is context that can help users decide whether they want to include a column in their selection.
  - This means if you ask for the full schema of the recording, you can see which columns are empty.
- Add a new QueryExpression param include_empty_columns.
  - If this is set to false, any column which is fully empty will be treated as if it doesn't exist.
    - If you ask for the schema of the VIEW, you will not see the empty columns unless include_empty_columns is true.
    - This means these empty columns will not participate in row generation, which should be fine in 99.9% of real use-cases.
- If this is set to
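
The proposal above can be sketched in plain Python. This is only an illustrative toy model, not Rerun's implementation: `ColumnDescriptor`, `ToyStore`, `insert_chunk`, `schema`, and the column names are all hypothetical stand-ins, and a "chunk" is modelled as a list of cells where `None` plays the role of NULL.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDescriptor:
    """Hypothetical stand-in for ComponentColumnDescriptor."""
    name: str
    is_static: bool = False
    is_empty: bool = True  # stays True until a non-empty cell is inserted


def cell_has_data(cell) -> bool:
    """A cell counts as data only if it is neither NULL (None) nor []."""
    return cell is not None and len(cell) > 0


@dataclass
class ToyStore:
    columns: dict = field(default_factory=dict)

    def insert_chunk(self, column: str, cells: list) -> None:
        """Register the column and update its emptiness flag at insertion time."""
        desc = self.columns.setdefault(column, ColumnDescriptor(column))
        # Once a column has seen real data, it is never marked empty again.
        if desc.is_empty and any(cell_has_data(c) for c in cells):
            desc.is_empty = False

    def schema(self, include_empty_columns: bool = True) -> list:
        """Return descriptors, optionally hiding fully empty columns."""
        return [
            d for d in self.columns.values()
            if include_empty_columns or not d.is_empty
        ]


store = ToyStore()
store.insert_chunk("Transform3D:translation", [[1.0, 2.0, 3.0], None])
store.insert_chunk("Transform3D:scale", [[], None])
store.insert_chunk("Transform3D:scale", [[]])

full = [d.name for d in store.schema()]
view = [d.name for d in store.schema(include_empty_columns=False)]
```

Here `full` lists both columns, while `view` lists only the translation column, because every cell ever inserted for the scale column was NULL or `[]`. Tracking the flag at insertion time keeps the check incremental, so no scan of existing chunks is needed when the schema is requested.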
Known edge-case:
- If you set include_empty_columns=False but then query for that column via Select, you will get a FULLY NULL column. You will not see rows where that column was logged as empty.
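
The edge case can be illustrated with a small toy query. This is a sketch under simplifying assumptions, not the actual query engine: `query` and the `log` layout are hypothetical, each column maps a timestamp to a cell, and there is no latest-at fill, so a column with nothing logged at a timestamp simply yields `None`.

```python
def query(log, select, include_empty_columns):
    """Toy row generation: returns a list of (time, {column: cell}) tuples."""

    def is_empty(col):
        # A column is fully empty if every logged cell is NULL or [].
        return all(c is None or len(c) == 0 for c in log[col].values())

    # Fully empty columns are treated as nonexistent when excluded.
    participating = [c for c in log if include_empty_columns or not is_empty(c)]

    # Rows are generated only from timestamps of participating columns.
    times = sorted({t for c in participating for t in log[c]})

    rows = []
    for t in times:
        row = {}
        for c in select:
            # An excluded column can still be selected, but it is all NULL.
            row[c] = log[c].get(t) if c in participating else None
        rows.append((t, row))
    return rows


log = {
    "translation": {1: [1.0, 0.0, 0.0], 3: [2.0, 0.0, 0.0]},
    "scale": {2: []},  # logged once, as an empty array
}

with_empty = query(log, ["translation", "scale"], include_empty_columns=True)
without_empty = query(log, ["translation", "scale"], include_empty_columns=False)
```

With `include_empty_columns=True` the result has rows at times 1, 2, and 3, and the row at time 2 carries the `[]` logged for scale. With `include_empty_columns=False` the row at time 2 disappears and the selected scale column is NULL in every remaining row, which is the behaviour the edge case describes.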