[Repo Assist] Fix CSV schema parsing for column names containing parentheses (closes #348) by github-actions[bot] · Pull Request #604 · fslaborg/Deedle

github-actions · 2026-03-09T03:31:34Z

🤖 This is an automated PR from Repo Assist.

Fixes the regex bug in CSV schema parsing reported in #348.

Root Cause

nameAndTypeRegex in CsvInference.fs was incorrectly using RegexOptions.RightToLeft. With that option the engine matches from the end of the string, so the greedy type group expands leftward as far as it can, and the name/type split lands on the leftmost ( instead of the rightmost one.

For a schema item like "Revenue (USD)(float)", the intended parse is:

name = "Revenue (USD)"
type = "float"

But with RightToLeft, the regex matched:

name = "Revenue " (stopped at the inner ( in (USD))
type = "USD)(float" (incorrect!)

Without RightToLeft, the standard greedy .+ in the name group correctly finds the last parenthesized group as the type, because it consumes as many characters as possible before backtracking.
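The difference between the two modes can be reproduced in a few lines of F#. This is a minimal sketch: the pattern below is an assumption chosen to resemble a "name(type)" schema item and is not copied from CsvInference.fs, and the `parse` helper is hypothetical.

```fsharp
open System.Text.RegularExpressions

// Assumed pattern (not taken from CsvInference.fs): a column name
// followed by a parenthesized type at the end of the schema item.
let pattern = @"^(?<name>.+)\((?<type>.+)\)$"

// Hypothetical helper: returns the captured (name, type) pair.
let parse (options: RegexOptions) (item: string) =
    let m = Regex.Match(item, pattern, options)
    m.Groups.["name"].Value, m.Groups.["type"].Value

let item = "Revenue (USD)(float)"

// Default left-to-right matching: the greedy name group backtracks
// only as far as the last "(".
printfn "%A" (parse RegexOptions.None item)        // ("Revenue (USD)", "float")

// RightToLeft: the greedy type group grows leftward instead,
// so the split lands on the first "(".
printfn "%A" (parse RegexOptions.RightToLeft item) // ("Revenue ", "USD)(float")
```

For a schema item with a single pair of parentheses, such as "Count(int)", both modes should give the same split, which would explain why the bug only surfaced for column names that contain parentheses themselves.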

Fix

Remove RegexOptions.RightToLeft from nameAndTypeRegex only (line 44). The other two regexes (typeAndUnitRegex and overrideByNameRegex) correctly retain RightToLeft because they need to handle cases like:

"float<unit<sub>>": the unit string can contain <
"Revenue->newName=float": column names can contain =
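The first case above can be sketched the same way, to show why RightToLeft is the right choice when the delimiter may recur in the trailing part. The pattern is again an assumption modelled on a "type<unit>" item, not the actual typeAndUnitRegex, and `splitTypeAndUnit` is a hypothetical helper.

```fsharp
open System.Text.RegularExpressions

// Assumed pattern (not taken from CsvInference.fs): a type name
// followed by a unit in angle brackets at the end of the item.
let unitPattern = @"^(?<type>.+)<(?<unit>.+)>$"

// Hypothetical helper: returns the captured (type, unit) pair.
let splitTypeAndUnit (options: RegexOptions) (item: string) =
    let m = Regex.Match(item, unitPattern, options)
    m.Groups.["type"].Value, m.Groups.["unit"].Value

let spec = "float<unit<sub>>"

// RightToLeft: the greedy unit group grows leftward, so the split
// lands on the first "<" and the nested "<sub>" stays in the unit.
printfn "%A" (splitTypeAndUnit RegexOptions.RightToLeft spec) // ("float", "unit<sub>")

// Left-to-right: the greedy type group swallows the first "<".
printfn "%A" (splitTypeAndUnit RegexOptions.None spec)        // ("float<unit", "sub>")
```

Here the ambiguity sits in the trailing group rather than the leading one, so the direction that was wrong for nameAndTypeRegex is the one that gives the intended split.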

Test Status

All 465 existing tests pass ✅. A new regression test is added:

[<Test>]
let ``Can read CSV with schema where column name contains parentheses``() =
  let csv = "Revenue (USD),Count\n100.0,1\n200.0,2"
  use reader = new System.IO.StringReader(csv)
  let df = Frame.ReadCsv(reader, schema="Revenue (USD)(float),Count(int)")
  List.ofSeq df.ColumnKeys |> shouldEqual ["Revenue (USD)"; "Count"]
  df.GetColumn<float>("Revenue (USD)") |> Series.values |> List.ofSeq |> shouldEqual [100.0; 200.0]

Trade-offs

None. Greedy left-to-right matching is the correct and intended behavior for nameAndTypeRegex. This is a pure bug fix with no behavior changes for existing valid schemas.

Generated by Repo Assist

To install this agentic workflow, run
gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

…#348)

The nameAndTypeRegex was incorrectly using RegexOptions.RightToLeft, which caused it to find the FIRST '(' scanning right-to-left rather than the LAST '(' from the left. This meant that a schema item like 'Revenue (USD)(float)' was parsed with name='Revenue ' and type='USD)(float' instead of name='Revenue (USD)', type='float'.

The fix removes RightToLeft from nameAndTypeRegex only. The other two regexes (typeAndUnitRegex and overrideByNameRegex) correctly use RightToLeft because they need to handle cases like 'type<unit<sub>>' or 'name->newName=type' where the name itself can contain the delimiter characters.

A regression test is added to cover this case.

Co-authored-by: Copilot <[email protected]>

github-actions Bot added the automation and repo-assist labels Mar 9, 2026

ci: trigger checks

ec3bab4

github-actions Bot added the repo-assist label Mar 9, 2026

Merge branch 'master' into repo-assist/fix-issue-348-csv-schema-parens-4c86d7d41f6a1513

82bd0d4

github-actions Bot mentioned this pull request Mar 9, 2026

[Repo Assist] Monthly Activity 2026-03 #584

Closed

9 tasks

dsyme marked this pull request as ready for review March 9, 2026 04:00

dsyme merged commit 1e1a4c6 into master Mar 9, 2026
2 checks passed

dsyme deleted the repo-assist/fix-issue-348-csv-schema-parens-4c86d7d41f6a1513 branch March 9, 2026 12:45

dsyme merged 3 commits into master from repo-assist/fix-issue-348-csv-schema-parens-4c86d7d41f6a1513
