[Repo Assist] Deedle.Arrow Part 3: Frame/Series module API, extended tests and documentation#685

Merged
dsyme merged 2 commits into master from
repo-assist/arrow-part3-672-bbf11a2c5704ccb9
Mar 21, 2026

@github-actions

🤖 This PR was created by Repo Assist as Part 3 of the Deedle.Arrow integration.

Closes #672. Parent tracking issue: #671.

## Summary

This PR completes the Deedle.Arrow integration with a module-based API, a comprehensive test suite, and new documentation pages.


## Design change: Frame and Series modules

Following maintainer feedback (via #671 comment), the package now exposes idiomatic F# module APIs. After `open Deedle.Arrow`:

```fsharp
open Deedle
open Deedle.Arrow

// Convert between Deedle and Arrow in-memory
Frame.toRecordBatch : Frame<'R,'C> -> RecordBatch
Frame.ofRecordBatch : RecordBatch  -> Frame<int,string>

// Arrow IPC file format  (.arrow / .feather)
Frame.readArrow  : string -> Frame<int,string>
Frame.writeArrow : string -> Frame<'R,string> -> unit

// Arrow IPC stream format  (.arrows)
Frame.readArrowStream  : Stream -> Frame<int,string>
Frame.writeArrowStream : Stream -> Frame<'R,string> -> unit

// Row-key preservation
Frame.writeArrowWithIndex : string -> Frame<'R,string> -> unit
Frame.readArrowWithIndex  : string -> Frame<string,string>

// Feather v2 aliases
Frame.readFeather  : string -> Frame<int,string>
Frame.writeFeather : string -> Frame<'R,string> -> unit
```

And a Series module:

```fsharp
Series.toArrowArray : Series<'K,'V> -> IArrowArray
Series.ofArrowArray : IArrowArray   -> Series<int,obj>
```

The existing free functions (`readArrow`, `writeArrow`, `frameToRecordBatch`, etc.) are still present for backward compatibility.
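
As a rough illustration of how the module functions listed above might be used together, the script below writes a small frame to an Arrow file, reads it back, and converts a frame and a series in memory. It is a sketch only: the NuGet package id `Deedle.Arrow`, the sample data, and the temp-file name are assumptions, and only the function names and signatures come from this PR description.

```fsharp
// Assumption: both packages are available on NuGet under these ids.
#r "nuget: Deedle"
#r "nuget: Deedle.Arrow"

open System.IO
open Deedle
open Deedle.Arrow

// Illustrative data: two float columns keyed by int row keys.
let prices =
    frame [ "Open"  => series [ 0 => 10.0; 1 => 10.5; 2 => 11.0 ]
            "Close" => series [ 0 => 10.4; 1 => 10.9; 2 => 11.3 ] ]

// Arrow IPC file round trip (hypothetical temp-file path).
let path = Path.Combine(Path.GetTempPath(), "prices.arrow")
Frame.writeArrow path prices
let restored : Frame<int, string> = Frame.readArrow path

// In-memory conversion through a RecordBatch.
let batch = Frame.toRecordBatch prices
let fromBatch : Frame<int, string> = Frame.ofRecordBatch batch

// Series <-> Arrow array; per the listed signature the result is Series<int,obj>.
let closes = prices.GetColumn<float>("Close")
let arrowArray = Series.toArrowArray closes
let roundTripped : Series<int, obj> = Series.ofArrowArray arrowArray
```

Because `Series.ofArrowArray` is listed as returning `Series<int,obj>`, callers would presumably need to convert the values back to a concrete type themselves.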

---

## Extended test suite

All new tests are in `tests/Deedle.Arrow.Tests/Tests.fs`. Added **29 new tests** on top of the existing 24, bringing the total to **53 tests**:

| Category | Tests |
|---|---|
| Frame module API | 5 |
| Series module API | 5 |
| Edge cases | 8 |
| FsCheck property-based | 5 |
| (existing Part 1 + Part 2) | 30 |

Edge cases covered: single row, single column, all-missing column, mixed-type frame, int64 stream, float32 file, empty-string column.

FsCheck properties: `float` round-trip, `int` round-trip, `string` round-trip, Series round-trip, RecordBatch column-name preservation.
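
A property of the kind listed above could look roughly like the following. This is not the test from `Tests.fs`; it is a minimal sketch that assumes the NuGet package id `Deedle.Arrow`, uses a hypothetical column name `"x"`, and skips empty inputs because the behaviour for zero-row frames is not stated here.

```fsharp
// Assumption: these package ids resolve on NuGet.
#r "nuget: Deedle"
#r "nuget: Deedle.Arrow"
#r "nuget: FsCheck"

open Deedle
open Deedle.Arrow
open FsCheck

// Property: an int column survives Frame -> RecordBatch -> Frame unchanged.
let intColumnRoundTrips (values: int[]) =
    if values.Length = 0 then
        true // empty input skipped; zero-row behaviour is not covered by this sketch
    else
        let original = frame [ "x" => Series.ofValues values ]
        let restored = original |> Frame.toRecordBatch |> Frame.ofRecordBatch
        let actual = restored.GetColumn<int>("x") |> Series.values |> Seq.toArray
        actual = values

// Runs the property against randomly generated arrays and prints the outcome.
Check.Quick intColumnRoundTrips
```

Using `int` rather than `float` keeps the sketch free of `NaN`, which Deedle treats as a missing value and which would need separate handling in a float property.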

---

## New documentation pages

### `docs/arrow.fsx` — Apache Arrow and Feather integration
- Reading and writing Arrow / Feather files (`Frame.readArrow`, `Frame.writeArrow`, etc.)
- Arrow IPC stream format
- Converting to/from `RecordBatch`
- Series ↔ Arrow array conversion
- Row-key preservation with `writeArrowWithIndex` / `readArrowWithIndex`
- Full type mapping table
- Python / pyarrow interoperability examples
- NuGet package information
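
Two of the topics above, row-key preservation and the IPC stream format, can be sketched as follows. The sample frame, the file name, and the NuGet package id `Deedle.Arrow` are assumptions for illustration; whether `Frame.writeArrowStream` leaves the stream open is not stated in this PR, so the sketch copies the bytes out rather than rewinding the same stream.

```fsharp
// Assumption: both packages are available on NuGet under these ids.
#r "nuget: Deedle"
#r "nuget: Deedle.Arrow"

open System.IO
open Deedle
open Deedle.Arrow

// Illustrative frame with string row keys.
let byTicker =
    frame [ "Price" => series [ "AAA" => 10.0; "BBB" => 20.0 ] ]

// Row-key preservation: the WithIndex pair reads back Frame<string,string>.
let indexedPath = Path.Combine(Path.GetTempPath(), "by-ticker.arrow")
Frame.writeArrowWithIndex indexedPath byTicker
let withKeys : Frame<string, string> = Frame.readArrowWithIndex indexedPath

// IPC stream format: write into a buffer, then read from a fresh stream
// over the same bytes (ToArray works even if the writer closed the buffer).
let bytes =
    use buffer = new MemoryStream()
    Frame.writeArrowStream buffer byTicker
    buffer.ToArray()

let fromStream : Frame<int, string> =
    use input = new MemoryStream(bytes)
    Frame.readArrowStream input
```

Per the listed signatures, the plain stream reader returns `Frame<int,string>`, so the original string row keys are only expected to survive through the `WithIndex` functions.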

### `docs/joining.fsx` — Joining, merging and appending frames
- Outer / inner / left frame joins
- Lookup joins with `Lookup.ExactOrSmaller` / `ExactOrGreater`
- Series zipping with `Series.zip` / `Series.zipInto`
- Appending frames vertically with `Frame.merge`
- Merging sparse series with `Series.merge`
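
For orientation, the operations named in the list above can be sketched with core Deedle functions as follows. The data is made up, and this is not taken from `docs/joining.fsx`; it only shows the general shape of each call.

```fsharp
#r "nuget: Deedle"

open Deedle

// Two frames with different columns and partially overlapping row keys.
let prices  = frame [ "Price"  => series [ 1 => 10.0; 2 => 11.0; 3 => 12.0 ] ]
let volumes = frame [ "Volume" => series [ 2 => 200.0; 3 => 300.0; 4 => 400.0 ] ]

// Frame joins: inner keeps shared keys, outer keeps all keys, left keeps the left frame's keys.
let inner = Frame.join JoinKind.Inner prices volumes
let outer = Frame.join JoinKind.Outer prices volumes
let left  = Frame.join JoinKind.Left prices volumes

// Lookup join: for each left key, take the exact or nearest smaller key on the right.
let aligned = Frame.joinAlign JoinKind.Left Lookup.ExactOrSmaller prices volumes

// Series zipping: pair values as options, or combine matching keys with a function.
let a = series [ 1 => 1.0; 2 => 2.0 ]
let b = series [ 2 => 10.0; 3 => 20.0 ]
let paired = Series.zip a b
let summed = Series.zipInto (+) a b

// Appending rows: merge frames (or series) whose keys do not overlap.
let morePrices   = frame [ "Price" => series [ 4 => 13.0 ] ]
let stacked      = Frame.merge prices morePrices
let mergedSeries = Series.merge (series [ 1 => 1.0 ]) (series [ 2 => 2.0 ])
```

Joins require the two frames to have distinct column names, whereas `Frame.merge` expects the same columns with non-overlapping cells.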

---

## Remaining documentation (filed separately)

Issue #684 tracks the remaining documentation gaps: missing-value handling, `Deedle.Math`, `Deedle.Excel`, grouping deep-dive, and C# API docs. Contributions are welcome.

---

## Test status

```
Deedle.Arrow.Tests   — Passed! Failed: 0, Passed: 53, Skipped: 0
Deedle.Tests (core)  — Passed! Failed: 0, Passed: 690, Skipped: 0
```

All tests pass with no regressions.

Generated by Repo Assist for issue #672

Generated by Repo Assist for issue #671

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

- Add Frame module to Deedle.Arrow: Frame.readArrow, Frame.writeArrow,
  Frame.readArrowStream, Frame.writeArrowStream, Frame.toRecordBatch,
  Frame.ofRecordBatch, Frame.readFeather, Frame.writeFeather,
  Frame.writeArrowWithIndex, Frame.readArrowWithIndex
- Add Series module to Deedle.Arrow: Series.toArrowArray, Series.ofArrowArray
- Both modules are accessible after 'open Deedle.Arrow' as idiomatic F# module API
- Add FsCheck property-based tests (float, int, string, Series, RecordBatch)
- Add edge case tests: single row, single column, all-missing, mixed-type,
  int64 stream, float32 file, empty-string column
- Add Frame module API tests and Series module API tests
- Add docs/arrow.fsx: full Deedle.Arrow documentation page
- Add docs/joining.fsx: joining, merging and appending frames documentation

Total: 53 Arrow tests pass, 690 core tests pass.

Closes #672

Co-authored-by: Copilot <[email protected]>
@dsyme dsyme marked this pull request as ready for review March 21, 2026 10:24
@dsyme dsyme merged commit 0e03521 into master Mar 21, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/arrow-part3-672-bbf11a2c5704ccb9 branch March 21, 2026 10:25