Skip to content

Conversation

@hermanschaaf
Copy link
Contributor

@hermanschaaf hermanschaaf commented Apr 17, 2023

The way the CSV reader handles null values currently makes it impossible for empty strings to be interpreted as null, even when this is explicitly enabled through the csv.WithNullReader(stringsCanBeNull: true, "", "NULL", "null") option. This change fixes this.

@hermanschaaf hermanschaaf requested a review from zeroshade as a code owner April 17, 2023 15:02
@github-actions
Copy link

@github-actions
Copy link

⚠️ GitHub issue #35190 has been automatically assigned in GitHub to PR creator.

Comment on lines 768 to 773
func (r *Reader) parseBinaryType(field array.Builder, str string) {
// specialize the implementation when we know we cannot have nulls
if str != "" && r.isNull(str) {
if r.isNull(str) {
field.AppendNull()
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we currently have a test that verifies you can still get an empty string if it is not in the list of Null values? / stringsCanBeNull == false ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not, but happy to add one here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added two new test cases for this

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Apr 17, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Apr 17, 2023
kodiakhq bot pushed a commit to cloudquery/plugin-sdk that referenced this pull request Apr 18, 2023
This updates the Arrow dependency to the latest `cqmain` branch commit, which includes apache/arrow#35191 and apache/arrow#35189
@zeroshade zeroshade merged commit 16bd95f into apache:main Apr 18, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Apr 18, 2023
liujiacheng777 pushed a commit to LoongArch-Python/arrow that referenced this pull request May 11, 2023
…he#35191)

The way the CSV reader handles null values currently makes it impossible for empty strings to be interpreted as null, even when this is explicitly enabled through the `csv.WithNullReader(stringsCanBeNull: true, "", "NULL", "null")` option. This change fixes this.
* Closes: apache#35190

Authored-by: Herman Schaaf <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
ArgusLi pushed a commit to Bit-Quill/arrow that referenced this pull request May 15, 2023
…he#35191)

The way the CSV reader handles null values currently makes it impossible for empty strings to be interpreted as null, even when this is explicitly enabled through the `csv.WithNullReader(stringsCanBeNull: true, "", "NULL", "null")` option. This change fixes this.
* Closes: apache#35190

Authored-by: Herman Schaaf <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
@candiduslynx candiduslynx deleted the csv-fix3 branch June 2, 2023 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Go] CSV Reader fails to read empty values as null

2 participants