ARROW-16420: [Python] pq.write_to_dataset always ignores partitioning#13062
AlenkaF wants to merge 3 commits into apache:master
Conversation
Thanks! Is it possible to add tests for these?
…error for existing_data_behavior check in the write_to_dataset
I added a test that checks for the error. While writing the test I bumped into another error.
jorisvandenbossche
left a comment
Thanks, looks perfect!
I think it is fine to include the other changes here as well, as they are very similar.
lidavidm
left a comment
LGTM.
I wonder for some of these 'conflicting' options, should we raise an error? For instance if the user passes both 'partitioning' and 'partition_cols', or 'metadata_collector' and 'file_visitor'.
Yes, that makes sense. Will do.
@AlenkaF do you want to do that here, or in a follow-up PR? (either way is fine)
Sorry, am a bit distracted by other issues.
Created a JIRA for the follow-up:
Benchmark runs are scheduled for baseline = 1cdedc4 and contender = 0a0d7fe. 0a0d7fe is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Remove the lines that unconditionally set partitioning and file_visitor to None in pq.write_to_dataset. This is a leftover from #12811, where additional pq.write_dataset keywords were exposed.