ReadCsv cannot set empty string as missingValues #441

@zyzhu

Description

Repro steps
sample.csv, in which the c3 column is entirely empty:

row,c1,c2,c3
1,,5,
2,4,6,

The following lines pass the empty string (`""`) as a missing value, then access a row by key and convert it to float.

[<Literal>]
let sample = "C:/FSharp/sample.csv"
let r = Frame.ReadCsv(sample,missingValues=[|""|]).IndexRows<int>("row")
r.Rows.[2].As<float>()

Expected outcome

val r : Frame<int,string> =
  
     c1        c2 c3        
1 -> <missing> 5  <missing> 
2 -> 4         6  <missing> 

val it : Series<string,float> =
  
c1 -> 4         
c2 -> 6         
c3 -> <missing> 

Actual outcome

val r : Frame<int,string> =
     c1        c2 c3 
1 -> <missing> 5  	  
2 -> 4         6     

System.FormatException: Input string was not in a correct format.

Suggestion
The empty string cannot be set as a missing value because of the following line. The c3 column is inferred to be string even though the empty string is listed in missingValues.
https://github.com/fsharp/FSharp.Data/blob/master/src/Csv/CsvInference.fs#L130
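To illustrate the behaviour being asked for, here is a minimal, self-contained sketch of column type inference that skips cells listed in missingValues before deciding on a type. It is not FSharp.Data's actual inference code; the `InferredType` union and the `inferColumn` function are hypothetical names made up for this example, and only float and string columns are modelled.

```fsharp
open System
open System.Globalization

// Hypothetical, simplified column types (not FSharp.Data's real inference types).
type InferredType =
    | Float
    | String
    | AllMissing

/// Infers a column type while ignoring every cell listed in missingValues.
let inferColumn (missingValues: string[]) (cells: string list) : InferredType =
    // A cell is missing if it matches any configured missing value (case-insensitive).
    let isMissing (cell: string) =
        missingValues
        |> Array.exists (fun m ->
            String.Equals(m, cell.Trim(), StringComparison.OrdinalIgnoreCase))
    // A cell is numeric if it parses as a float under the invariant culture.
    let isFloat (cell: string) =
        fst (Double.TryParse(cell, NumberStyles.Float, CultureInfo.InvariantCulture))
    match cells |> List.filter (isMissing >> not) with
    | [] -> AllMissing                                    // nothing left to infer from
    | present when present |> List.forall isFloat -> Float
    | _ -> String

// Columns c1 and c3 from sample.csv, with "" configured as a missing value.
let c1 = inferColumn [| "" |] [ ""; "4" ]   // Float
let c3 = inferColumn [| "" |] [ ""; "" ]    // AllMissing

// With no missing values configured, the empty cells force the column to String,
// which corresponds to the behaviour reported above.
let c3Reported = inferColumn [||] [ ""; "" ]   // String
```

Under this sketch an all-empty column such as c3 is never classified as string, so its cells could be surfaced as missing rather than as empty strings that later fail to convert to float.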

This has to wait until FSharp.Data addresses the following issue: fsprojects/FSharp.Data#1192
