-
Notifications
You must be signed in to change notification settings - Fork 4k
Simplify arrow utility by renaming to more descriptive names #10123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * NOTE: ArrowJS automatically formats the columns in schema, i.e. we always get strings. | ||
| */ | ||
| export type Columns = string[][] | ||
| export type ColumnNames = string[][] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are about to update the comments: maybe this comment could be updated with a quick info why this is a multidimensional array. I suppose it is for nested column names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, to support multi-level column headers. Added it to the comment 👍
|
|
||
| /** DataFrame index and data types. */ | ||
| export interface Types { | ||
| /** DataFrame's index and data column types. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super-nit: /** A DataFrame's ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed 👍
| /** DataFrame index and data types. */ | ||
| export interface Types { | ||
| /** DataFrame's index and data column types. */ | ||
| export interface ColumnTypes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this interface seems to only support Pandas types, should it maybe be called PandasColumnTypes similar to the other renamings? And will this be generalized more in a follow-up step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed 👍
And will this be generalized more in a follow-up step?
Yep, there is still quite a bit of reliance on pandas metadata to remove to support purely raw arrow data. That will come with follow-up PRs. Goal is that the Pandas metadata is only used for additional accuracy, but the main information should come from the Arrow metadata. Unfortunately, the reliance on pandas' metadata was even more extreme than I had initially thought.
raethlein
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for making the names clearer!
…it#10123) ## Describe your changes This PR simplifies the arrow handling logic by renaming variables and functions to more descriptive names. This also adds a variety of code comments to make it easier to understand the logic. To simplify reviewing, this PR does not apply any logical changes. ## Testing Plan - No logical or visual changes - no new tests required. --- **Contribution License Agreement** By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.
Describe your changes
This PR simplifies the arrow handling logic by renaming variables and functions to more descriptive names. This also adds a variety of code comments to make it easier to understand the logic.
To simplify reviewing, this PR does not apply any logical changes.
Testing Plan
Contribution License Agreement
By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.