[Repo Assist] Deedle.MicrosoftML: vector column support (VBuffer<float32> / VBuffer<float>) #680

Merged
dsyme merged 2 commits into master from
repo-assist/fix-issue-676-microsoftml-vector-columns-a455f47d2f9185aa
Mar 21, 2026

@github-actions

🤖 This PR was created by Repo Assist, continuing work on the Deedle.MicrosoftML integration epic (#676).

Summary

Adds vector column support to Deedle.MicrosoftML — the key missing piece that enables real ML.NET feature-engineering pipelines (e.g. Concatenate, FastTree output columns).

What's included

src/Deedle.MicrosoftML/DataView.fs

| Direction | Deedle type | ML.NET type |
|---|---|---|
| toDataView | float32 array column | VectorDataViewType(Single, N) |
| ofDataView | float32 array series | VBuffer<float32> column |
| ofDataView | float array series | VBuffer<float> column |

The VBuffer<float> round-trip handles the common case where scalar float inputs are fed through transforms like Concatenate: ML.NET keeps the Double type in the output vector.

tests/Deedle.MicrosoftML.Tests/

5 new tests (22 total):

  • Schema check: float32 array column → VectorDataViewType(Single, 3)
  • Cursor values: VBuffer<float32> getter reads correct values
  • Round-trip: float32 array column survives toDataView → ofDataView
  • Mixed frame: scalar + vector columns both survive round-trip
  • End-to-end: Concatenate transform output readable as float array
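
For readers unfamiliar with ML.NET cursors, the "Cursor values" check above can be pictured with a minimal sketch that uses only the public ML.NET API. This is not the code from this PR: `VectorRow` and `readVectorColumn` are hypothetical names introduced for illustration, and the sketch assumes a fixed-size `float32` vector column.

```fsharp
open Microsoft.ML
open Microsoft.ML.Data

// Hypothetical row type (not from the PR): one fixed-size float32 vector column.
[<CLIMutable>]
type VectorRow = { [<VectorType(3)>] Features : float32[] }

// Hypothetical helper: read every row of a VBuffer<float32> column into dense arrays.
let readVectorColumn (dv : IDataView) (name : string) : float32[] list =
    let col = dv.Schema.[name]
    use cursor = dv.GetRowCursor([ col ])
    let getter = cursor.GetGetter<VBuffer<float32>>(col)
    let mutable buf = Unchecked.defaultof<VBuffer<float32>>
    let rows = ResizeArray<float32[]>()
    while cursor.MoveNext() do
        getter.Invoke(&buf)                          // fills buf for the current row
        rows.Add(buf.DenseValues() |> Seq.toArray)   // copy out, since buf is reused
    List.ofSeq rows

let mlc = MLContext(seed = 1)
let sample =
    mlc.Data.LoadFromEnumerable<VectorRow>(
        [ { Features = [| 1.0f; 0.2f; 0.8f |] }
          { Features = [| 0.1f; 0.9f; 0.3f |] } ])
let vectors = readVectorColumn sample "Features"
```

Copying each row out with `DenseValues()` matters because the getter reuses the same buffer from row to row. The same pattern with `VBuffer<float>` would presumably cover the Double-typed vectors that Concatenate produces from scalar float inputs.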

Usage

```fsharp
open Deedle
open Deedle.MicrosoftML
open Microsoft.ML

let mlc = MLContext(seed = 1)

// Pass pre-computed feature vectors to ML.NET
let frameWithVectors : Frame<int, string> =
    Frame.ofColumns [
        "Label",    Series.ofValues [ 1.0; 0.0; 1.0 ] :> ISeries<int>
        "Features", Series.ofValues [
            [| 1.0f; 0.2f; 0.8f |]
            [| 0.1f; 0.9f; 0.3f |]
            [| 0.7f; 0.4f; 0.6f |]
        ] :> ISeries<int>
    ]

let dv = Frame.toDataView frameWithVectors
// → Features column has type VectorDataViewType(Single, 3) ✓

// Use Concatenate to build a feature column, then get result back as Deedle Frame
// (myScalarFrame is defined elsewhere: a frame with scalar float columns A, B, C)
let pipe   = mlc.Transforms.Concatenate("Features", [| "A"; "B"; "C" |])
let result = Pipeline.fitTransform pipe myScalarFrame
let feats  = result.GetColumn<float array>("Features")
```

Test status

✅ 22 / 22 tests pass (Deedle.MicrosoftML.Tests)

Closes part of #676 (vector column support from the Future Work list)

Generated by Repo Assist for issue #676

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

…ffer<float>)

Adds read/write support for ML.NET vector columns in the Deedle.MicrosoftML
integration package:

- toDataView: float32 array series → VectorDataViewType(Single, N) columns
- ofDataView: VBuffer<float32> columns → float32 array series
- ofDataView: VBuffer<float> columns → float array series (handles output of
  transforms like Concatenate whose inputs are scalar doubles)

Adds 5 new tests (22 total):
- Schema check for VectorDataViewType
- Cursor reading of VBuffer<float32> values
- Round-trip float32 array vector column
- Mixed scalar + vector frame round-trip
- Concatenate transform → float array column (end-to-end pipeline test)

Epic: #676

Co-authored-by: Copilot <[email protected]>
@dsyme dsyme marked this pull request as ready for review March 20, 2026 12:40
@dsyme dsyme merged commit 371c2be into master Mar 21, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/fix-issue-676-microsoftml-vector-columns-a455f47d2f9185aa branch March 21, 2026 02:24
Development

Successfully merging this pull request may close these issues.

[Repo Assist] Epic: Deedle.MicrosoftML integration package
