[Repo Assist] Add Deedle.MicrosoftML – ML.NET IDataView integration package #677

Merged
dsyme merged 2 commits into master from repo-assist/feat-microsoftml-563-3a506bf6ad985c5f on Mar 19, 2026

@github-actions
Contributor

🤖 This PR was created by Repo Assist in response to issue #563.

Summary

This PR adds a new Deedle.MicrosoftML package that bridges Deedle's Frame<'R, string> / Series<'K, 'V> types with ML.NET's IDataView pipeline infrastructure.

What's included

src/Deedle.MicrosoftML/

| File | Purpose |
|---|---|
| `DataView.fs` | `FrameDataView` (custom `IDataView`) + `Frame.toDataView` / `Frame.ofDataView` |
| `Transforms.fs` | `Pipeline.*` helpers: `fitEstimator`, `fitTransform`, `fitTransformOn`, `applyTransformer` |
| `Extensions.fs` | C#-friendly `[<Extension>]` methods: `ToDataView`, `Transform`, `FitTransform` |

tests/Deedle.MicrosoftML.Tests/

17 tests covering:

  • Schema inference for numeric, string, bool columns
  • Cursor iteration + correct values
  • Round-trip Frame → IDataView → Frame
  • Missing value propagation (NaN / empty string)
  • ML.NET NormalizeMinMax transform integration
  • fitEstimator / applyTransformer separation of concerns
  • fitTransformOn (train on one frame, transform another)
  • Extension methods
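
As an illustration of the round-trip case in the list above, the check might look roughly like the sketch below. It relies only on `Frame.toDataView` and `Frame.ofDataView` as they appear in this PR's usage section; the frame contents, the column name `Price`, and the comparison itself are hypothetical and are not taken from the actual test suite.

```fsharp
open Deedle
open Deedle.MicrosoftML
open Microsoft.ML

// Hypothetical single-column frame; the name and values are illustrative only.
let df : Frame<int, string> =
    frame [ "Price" => series [ 0 => 10.0; 1 => 20.0; 2 => 30.0 ] ]

// Frame -> IDataView -> Frame, using the two conversion functions from this PR.
let dv : IDataView = Frame.toDataView df
let back : Frame<int, string> = Frame.ofDataView dv

// Compare the column values before and after the round trip.
let original = df.GetColumn<float>("Price").Values |> List.ofSeq
let roundTripped = back.GetColumn<float>("Price").Values |> List.ofSeq
printfn "values preserved: %b" (original = roundTripped)
```

The sketch compares values only; whether row keys survive the round trip depends on how `Frame.ofDataView` assigns them, which this page does not specify.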

Other changes

  • paket.dependencies – adds nuget Microsoft.ML (resolved to 5.0.0)
  • Deedle.sln – adds both new projects

Usage (F#)

open Deedle
open Deedle.MicrosoftML
open Microsoft.ML

let mlc = MLContext(seed = 1)

// Convert a Deedle frame to IDataView for ML.NET
let dv : IDataView = Frame.toDataView myFrame

// Fit and apply a normaliser, get result back as a Frame
let normalised : Frame<int, string> =
    Pipeline.fitTransform (mlc.Transforms.NormalizeMinMax("Price_norm", "Price")) myFrame

// Convert any ML.NET IDataView output back to Deedle
let result : Frame<int, string> = Frame.ofDataView someDataView
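
For readers who want to see the separate steps, the "train on one frame, transform another" pattern can also be written with ML.NET's own `Fit` and `Transform` calls plus the two conversion functions shown above. This is a minimal sketch, not the PR's implementation of `Pipeline.fitTransformOn` (whose signature is not shown on this page); the `train` and `score` frames and their values are hypothetical.

```fsharp
open Deedle
open Deedle.MicrosoftML
open Microsoft.ML

let mlc = MLContext(seed = 1)

// Hypothetical frames, each with a float "Price" column.
let train : Frame<int, string> =
    frame [ "Price" => series [ 0 => 10.0; 1 => 20.0; 2 => 30.0 ] ]
let score : Frame<int, string> =
    frame [ "Price" => series [ 0 => 15.0; 1 => 25.0 ] ]

let estimator = mlc.Transforms.NormalizeMinMax("Price_norm", "Price")

// Fit on the training frame only (standard ML.NET IEstimator.Fit).
let model = estimator.Fit(Frame.toDataView train)

// Apply the fitted transformer to a different frame, then convert back to Deedle.
let scored : Frame<int, string> =
    model.Transform(Frame.toDataView score) |> Frame.ofDataView
```

Keeping the fitted `model` around means the normalisation learned from `train` can be reused on any later frame with the same input column.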

Usage (C#)

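using Deedle;
using Deedle.MicrosoftML;
using Microsoft.ML;

// `frame` is assumed to be an existing Frame<int, string>.
var mlc    = new MLContext(seed: 1);
var dv     = frame.ToDataView();
var result = frame.FitTransform(mlc.Transforms.NormalizeMinMax("X_norm", "X"));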

## Column type mapping

| Deedle type | ML.NET type |
|---|---|
| `float` | `Double` |
| `float32` | `Single` |
| `int` | `Int32` |
| `int64` | `Int64` |
| `bool` | `Boolean` |
| `string` | `Text` (`ReadOnlyMemory<char>`) |
| Missing float | `NaN` |
| Missing string | `""` |

## Test status

```
✅ 17 / 17 tests pass (Deedle.MicrosoftML.Tests)
685 / 685 tests pass (Deedle.Tests – no regressions)
```
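
The column type mapping table above can be expressed as a small lookup from .NET element types to ML.NET `DataViewType` values. The sketch below uses only types from `Microsoft.ML.Data`; the function name `tryDataViewType` is hypothetical, and this is not necessarily how `FrameDataView` infers its schema.

```fsharp
open System
open Microsoft.ML.Data

// Hypothetical helper: map a column's .NET element type to an ML.NET column type.
// Returns None for element types the table above does not list.
let tryDataViewType (t: Type) : DataViewType option =
    if t = typeof<float> then Some (NumberDataViewType.Double :> DataViewType)
    elif t = typeof<float32> then Some (NumberDataViewType.Single :> DataViewType)
    elif t = typeof<int> then Some (NumberDataViewType.Int32 :> DataViewType)
    elif t = typeof<int64> then Some (NumberDataViewType.Int64 :> DataViewType)
    elif t = typeof<bool> then Some (BooleanDataViewType.Instance :> DataViewType)
    elif t = typeof<string> then Some (TextDataViewType.Instance :> DataViewType)
    else None
```

Returning an option leaves the caller to decide how to treat unsupported column types, for example by skipping the column or raising an error.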

Closes #563
Epic: #676

Generated by Repo Assist for issue #563

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

- Implements FrameDataView: a custom IDataView wrapping a Deedle Frame<'R, string>
  for use directly with ML.NET pipelines. Supports Double, Single, Int32, Int64,
  Boolean, and Text column types; missing values become NaN / empty string.
- Frame.toDataView  - converts a Deedle Frame to an IDataView
- Frame.ofDataView  - converts an IDataView back to a Frame<int, string>
- Pipeline.fitEstimator / fitTransform / fitTransformOn - fit ML.NET estimators
  and/or apply transformers to Deedle frames
- Pipeline.applyTransformer - apply an already-fitted ITransformer to a Frame
- C#-friendly extension methods (ToDataView, Transform, FitTransform) via
  FrameMLExtensions
- 17 tests covering schema inference, cursor iteration, round-trip conversions,
  missing value propagation, and ML.NET normalisation transforms

Closes #563

Co-authored-by: Copilot <[email protected]>
dsyme marked this pull request as ready for review March 19, 2026 23:44
dsyme merged commit d96fb23 into master Mar 19, 2026
2 checks passed
dsyme deleted the repo-assist/feat-microsoftml-563-3a506bf6ad985c5f branch March 19, 2026 23:47
Development

Successfully merging this pull request may close these issues.

Implement IDataView interface
