Revert "fix: create file for empty stream"#16682
Conversation
alamb
left a comment
There was a problem hiding this comment.
Thanks @brunal
Can you please also add a test showing the problem you encountered (aka a test for writing an empty record batch stream to a parquet file) so we don't accidentally reintroduce the same problem?
cc @chenkovsky
|
Looks like there may be a test in |
a4dc97c to
f5a77b1
Compare
|
Indeed! I have just updated the revert commit with this test. |
|
The CI tests seem to have some problems unfortunately |
f5a77b1 to
f93a15c
Compare
This reverts commit 4084894. It adds a test showcasing the functionality that the commit above broke: writing a parquet file from an empty RecordBatch.
f93a15c to
ab5e7a3
Compare
|
Apologies for that; it's now fixed. |
alamb
left a comment
There was a problem hiding this comment.
Thank you @brunal
I pushed a commit to also verify the schema of the batch that is read from the empty parquet file
@chenkovsky can you help me review this PR and maybe help find a fix that works with csv and other formats?
Thank you @brunal and @chenkovsky |
Reverts #16342
After that change, one cannot write an empty RecordBatch with a schema to parquet anymore.
Indeed, the logic added tries to write an empty recordbatch when num_rows == 0.
So if you tried to write an empty recordbatch, you end up writing it twice => it fails.