Improve documentation on writing parquet, including multiple threads #7321
I have a few more things to write but ran out of time today.
@etseidl I wonder if I could trouble you for a review of this PR?
etseidl
left a comment
Another nice improvement, thanks @alamb.
parquet/src/lib.rs
Outdated
//! to leverage the wide range of data transforms provided by the [arrow] crate, the
//! ecosystem of [Arrow] compatible systems.
Not sure why you dropped "and by" here.
I think I was exercising my inner copy editor and trying to reduce the number of words used to express this concept. I agree: "and by" was an unintended casualty. I will put it back.
Co-authored-by: Ed Seidl <[email protected]>
Thanks again @etseidl
Which issue does this PR close?
Rationale for this change
I spent a while looking for how to write parquet data with multiple threads, even though I knew the functionality existed.
Also, I always get confused looking at the parquet documentation, as there are similarly named structures in different modules.
What changes are included in this PR?
Let's leave some more links in the docs to make it easier to find APIs related to parallelism.
Also, mention them in the key structures section of the crate-level docs.
Are there any user-facing changes?
Documentation only. There are no functional changes.