[Repo Assist] Add typeResolver parameter to ReadCsv (implements #454) #649

Merged
dsyme merged 2 commits into master from repo-assist/fix-issue-454-typeresolver-param-80bfe3824a4fc0e4
Mar 18, 2026
@github-actions
Contributor

🤖 Repo Assist — implementing #454 as requested by @dsyme.

## Summary

This PR adds an optional typeResolver parameter to all F#-friendly Frame.ReadCsv and Frame.ReadCsvString overloads, implementing the feature proposed in #454.

## What it does

```fsharp
Frame.ReadCsv("sample.csv", typeResolver = fun col ->
    match col with
    | c when c.EndsWith("Id")   -> Some "guid"
    | c when c.EndsWith("Date") -> Some "date"
    | c when c.StartsWith("Is") -> Some "bool"
    | _                         -> None)
```

The resolver is called with each column header name after the CSV has been loaded. Returning `Some "typeName"` (e.g. `"int"`, `"float"`, `"string"`, `"bool"`, `"date"`, `"guid"`, `"decimal"`, `"int64"`, `"timespan"` — same names accepted by the `schema` parameter) pins that column to the given type. Returning `None` lets Deedle infer the type as normal.
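
To see what a resolver of this shape returns on its own, here is a minimal sketch that does not call Deedle at all. The header names are illustrative assumptions, not taken from the PR; only the `string -> string option` shape comes from the description above.

```fsharp
// Same shape as the typeResolver parameter: string -> string option.
let resolver (col: string) : string option =
    match col with
    | c when c.EndsWith("Id")   -> Some "guid"
    | c when c.EndsWith("Date") -> Some "date"
    | c when c.StartsWith("Is") -> Some "bool"
    | _                         -> None

// Hypothetical header names, chosen only for illustration.
let headers = [ "CustomerId"; "OrderDate"; "IsActive"; "Amount" ]

// Pair each header with the resolver's answer.
// "Amount" maps to None, so its type would be left to inference.
let resolved = headers |> List.map (fun h -> h, resolver h)
```

Note that the match arms are tried in order, so a header such as `IsValidId` would be resolved by the first arm (`"guid"`), not the third.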

## API surface

- Added `?typeResolver: string -> string option` to:
  - `Frame.ReadCsv(path, ...)`
  - `Frame.ReadCsv(path, indexCol, ...)`
  - `Frame.ReadCsv(stream, ...)`
  - `Frame.ReadCsv(reader, ...)`
  - `Frame.ReadCsvString(csvString, ...)`
- The C# `[<Optional>]` overloads are unchanged (F# function types are not idiomatic as optional C# parameters; C# callers can still use the `schema` string parameter).

## Interaction with `schema`

When both `typeResolver` and `schema` are provided, the explicit `schema` takes precedence for any column it names. The implementation deduplicates entries so the combined schema string never exceeds the column count (which would otherwise trigger a parse error in the underlying `InferColumnTypes`).
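
The precedence rule can be sketched as a small pure function. This is not the code in `FrameUtils.fs`; `mergeSchema` is a hypothetical helper, and it assumes the explicit schema is a comma-separated list whose named entries have the form `ColName=Type`.

```fsharp
// Sketch only: a hypothetical helper, not the PR's implementation.
// Assumes named schema entries look like "ColName=Type", separated by commas.
let mergeSchema (headers: string list) (resolver: string -> string option) (schema: string) : string =
    // Entries the caller supplied explicitly.
    let explicitEntries =
        schema.Split([| ',' |], System.StringSplitOptions.RemoveEmptyEntries)
        |> Array.map (fun e -> e.Trim())
        |> Array.toList
    // Column names already covered by the explicit schema.
    let named =
        explicitEntries
        |> List.choose (fun e ->
            match e.IndexOf('=') with
            | -1 -> None
            | i -> Some (e.Substring(0, i).Trim()))
        |> Set.ofList
    // Resolver entries only for columns the explicit schema does not name,
    // so no column ever appears twice in the combined string.
    let resolvedEntries =
        headers
        |> List.filter (fun h -> not (Set.contains h named))
        |> List.choose (fun h -> resolver h |> Option.map (fun t -> sprintf "%s=%s" h t))
    String.concat "," (explicitEntries @ resolvedEntries)

// Hypothetical resolver and headers for illustration.
let sampleResolver (col: string) =
    if col.EndsWith("Id") then Some "guid"
    elif col.EndsWith("Date") then Some "date"
    else None

let combined = mergeSchema [ "OrderId"; "OrderDate"; "Amount" ] sampleResolver "OrderId=int"
// combined = "OrderId=int,OrderDate=date"
```

Here `OrderId` keeps the explicit `int` even though the resolver would have said `guid`, and `Amount` gets no entry at all, leaving it to inference.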

## Implementation

- **`FrameUtils.fs`**: `readCsv` and `readString` gain a final `(string -> string option) option` parameter. Inside `readCsv`, after `CsvFile.Load`, the resolver is applied to produce `ColName=Type` named-override entries, which are merged with the explicit `schema` string and passed to the existing `InferColumnTypes` machinery.
- **`FrameExtensions.fs`**: All F# optional-parameter `ReadCsv`/`ReadCsvString` overloads forward the new parameter; C# overloads pass `None`.

## Test Status

```
dotnet test tests/Deedle.Tests/Deedle.Tests.fsproj -c Release
Passed!  Failed: 0, Passed: 657, Skipped: 0, Total: 657
```

Two new tests added:

- `typeResolver maps column names to types`: basic resolver usage
- `typeResolver explicit schema takes precedence over typeResolver`: conflict resolution

Closes #454

Generated by Repo Assist for issue #454

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

Add an optional 'typeResolver' parameter (type: string -> string option)
to all F#-friendly ReadCsv/ReadCsvString overloads and the underlying
readCsv/readString functions in FrameUtils.

The resolver is called with each column header name. Returning Some "type"
(e.g. Some "int", Some "float", Some "bool") pins that column's type;
returning None lets Deedle infer it as normal.

When both 'typeResolver' and 'schema' are provided, the explicit 'schema'
takes precedence for any column it names, preventing duplicate schema
entries from exceeding the column-count limit in the schema parser.

Implementation: the resolver generates 'ColName=Type' named-override
entries for all applicable columns, filters out any column already
covered by the explicit schema, and passes the combined string to the
existing InferColumnTypes machinery.

Co-authored-by: Copilot <[email protected]>
@dsyme dsyme marked this pull request as ready for review March 18, 2026 16:22
@dsyme dsyme merged commit 10e6774 into master Mar 18, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/fix-issue-454-typeresolver-param-80bfe3824a4fc0e4 branch March 18, 2026 16:28
Development

Successfully merging this pull request may close these issues.

Proposal: ReadCsv schema provider as lambda function

1 participant