[Repo Assist] Fix filterRows/filterRowValues/Where losing ColumnTypes when column is all-missing (issue #516)#601

Merged
dsyme merged 6 commits into master from
repo-assist/fix-issue-516-filterrows-preserves-columntypes-18b8652a98367afb
Mar 9, 2026

@github-actions github-actions Bot commented Mar 9, 2026

🤖 This PR was created by Repo Assist, an automated AI assistant.

Fixes the bug reported in #516 where frame.ColumnTypes incorrectly changes to System.Object after filtering rows when the resulting column contains only missing values.

Root Cause

Frame.filterRows, Frame.filterRowValues, and Frame.Where all previously reconstructed the frame from rows (via FrameUtils.fromRowsAndColumnKeys). That function uses a "witness value" to determine the column type — it picks any non-missing value from the column to drive type dispatch:

let someValue =
  nested |> Series.observations |> Seq.tryPick (fun (_, v) ->
    v.TryGetObject(key) |> OptionalValue.asOption)
let someValue = defaultArg someValue (obj())  // ← falls back to obj() when all missing
columnCreator key someValue

When all values in a column are missing after filtering, Seq.tryPick returns None, the fallback becomes obj(), and the resulting column vector has ElementType = typeof<obj>. As a result, frame.ColumnTypes reports System.Object instead of the original type.
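
The fallback can be illustrated without Deedle. The following is a minimal sketch, not the library's code: `pickWitness` and `inferredType` are hypothetical names, and a column is modelled as a plain list of optional boxed values.

```fsharp
// Hypothetical stand-in for a column: boxed cells, with None marking a missing value.
let pickWitness (cells: obj option list) : obj =
    // Same shape as the snippet above: first present value, else a bare obj()
    let someValue = cells |> List.tryPick id
    defaultArg someValue (obj ())

// The "column type" is whatever runtime type the witness happens to have.
let inferredType (cells: obj option list) : System.Type =
    (pickWitness cells).GetType()

let mixed = [ None; Some (box 1.5); None ]
let allMissing : obj option list = [ None; None ]

printfn "%s" (inferredType mixed).FullName       // System.Double
printfn "%s" (inferredType allMissing).FullName  // System.Object
```

Because the type is derived from a runtime value rather than from the column's static element type, a column with no surviving values has nothing to derive it from.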

Fix

Replace the row-reconstruction path in filterRows, filterRowValues, and both Where overloads with the same column-wise approach already used by filterRowsBy and realignRows:

  1. Apply the predicate to collect the filtered row index.
  2. Use IndexBuilder.Reindex to produce a Relocate vector command that maps the new row addresses back to the original ones.
  3. Apply VectorHelpers.transformColumn to each column vector — this preserves the generic type parameter 'T of each column regardless of how many values are missing.
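
The three steps can be sketched in plain F#. This is an illustration under simplifying assumptions, not Deedle's implementation: `Column<'T>`, `keptPositions`, and `relocate` are hypothetical names standing in for the typed column vector, the filtered row index, and the Relocate command.

```fsharp
// Hypothetical typed column: missing cells are None; 'T is part of the static type.
type Column<'T> = { Values: 'T option [] }

// Step 1: collect the original positions of the rows that satisfy the predicate.
let keptPositions (predicate: int -> bool) (rowCount: int) : int [] =
    [| for i in 0 .. rowCount - 1 do
        if predicate i then yield i |]

// Steps 2-3: relocate cells by position. No cell value is inspected,
// so the result is a Column<'T> even when every cell is None.
let relocate (positions: int []) (column: Column<'T>) : Column<'T> =
    { Values = positions |> Array.map (fun i -> column.Values.[i]) }

let prices : Column<float> = { Values = [| Some 1.0; None; None |] }
let kept = keptPositions (fun i -> i > 0) prices.Values.Length
let filtered : Column<float> = relocate kept prices

printfn "%d rows, all missing: %b"
    filtered.Values.Length
    (filtered.Values |> Array.forall Option.isNone)
// prints "2 rows, all missing: true"
```

The type annotation on `filtered` compiles only because `relocate` carries `'T` through unchanged, which is the property the column-wise path relies on.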

Also adds open Deedle.Vectors to FrameExtensions.fs so VectorConstruction.Return is in scope.

Behaviour preserved

  • All values (present or missing) are correctly mapped to their new positions.
  • Empty filter result still yields an empty frame with the same column keys and types.
  • No change in behaviour for columns where at least one value is non-missing.

Test

New regression test Filter rows preserves column types when all column values are missing in tests/Deedle.Tests/Frame.fs verifies that column ElementType remains float after filtering to rows where a column contains only NaN (missing) values.
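
A reproduction along the same lines might look like the script below. It is a hedged sketch rather than the test added in this PR: the frame contents and column names are made up for illustration, and it assumes the Deedle NuGet package can be referenced from an F# script.

```fsharp
#r "nuget: Deedle"
open Deedle

// Illustrative data: column "A" is missing (NaN) in every row except the first.
let df =
    frame [ "A" => series [ 1 => 1.0; 2 => nan; 3 => nan ]
            "B" => series [ 1 => 10.0; 2 => 20.0; 3 => 30.0 ] ]

// Keep only the rows where "A" has no value left.
let filtered =
    df |> Frame.filterRows (fun _ row -> row.GetAs<float>("B") > 10.0)

// Inspect the element type reported for each column after filtering.
filtered.ColumnTypes |> Seq.iter (fun t -> printfn "%s" t.FullName)
```

Per the description above, a build containing this fix should report the float type for both columns, whereas the behaviour reported in #516 was System.Object for the all-missing column.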

Test Status

  • dotnet build src/Deedle/Deedle.fsproj -c Release: 0 errors
  • dotnet test tests/Deedle.Tests/Deedle.Tests.fsproj -c Release: 465 / 465 passed

⚠️ CI docs-generation (fsdocs) will fail on this PR because PR #597, which fixes the XML doc issue in master, has not yet been merged. This is an infrastructure failure unrelated to this change; all tests pass.

Closes #516

Generated by Repo Assist

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

…es missing (#516)

Previously, Frame.filterRows, Frame.filterRowValues and Frame.Where all
reconstructed the frame from rows (via fromRowsAndColumnKeys), which loses
column type information when a column's values are all missing in the
filtered result. The witness-value lookup falls back to obj() when
Seq.tryPick returns None, giving ElementType = typeof<obj>.

Fix: replace the row-reconstruction path with the same column-wise
approach already used by filterRowsBy and realignRows: compute a Relocate
command via IndexBuilder.Reindex, then apply VectorHelpers.transformColumn
to each column vector. This preserves the vector's generic type parameter
T regardless of how many values are missing.

Added open Deedle.Vectors to FrameExtensions.fs so VectorConstruction.Return
is in scope.

Regression test: 'Filter rows preserves column types when all column values
are missing' verifies that column types remain float after filtering to rows
where a column contains only NaN.

Co-authored-by: Copilot <[email protected]>
@dsyme dsyme marked this pull request as ready for review March 9, 2026 04:03
@dsyme dsyme merged commit 033d6b1 into master Mar 9, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/fix-issue-516-filterrows-preserves-columntypes-18b8652a98367afb branch March 9, 2026 12:45
Development

Successfully merging this pull request may close these issues.

frame.ColumnTypes incorrectly changes to System.Object when filtering frame to rows where the column's values are null

1 participant