[Repo Assist] Add Stats.cov and Stats.corr for pairwise series covariance and correlation #614

Merged
dsyme merged 3 commits into master from
repo-assist/improve-stats-cov-corr-546efa4950b0f241
Mar 14, 2026

@github-actions
Contributor

🤖 Repo Assist — automated AI improvement (Task 5: Coding Improvements).

Summary

Implements Stats.cov (sample covariance) and Stats.corr (Pearson correlation coefficient) for pairs of Series, completing the long-standing TODO at the top of Stats.fs:

// TODO: still to do, possibly: median, percentile, corr, cov

median, percentile, and describe were added in previous PRs; this PR adds the remaining two.

API

// Sample covariance of two series (aligned on keys, inner join)
Stats.cov s1 s2 : float

// Pearson correlation coefficient of two series
Stats.corr s1 s2 : float
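
For reference, the textbook definitions these two signatures correspond to can be written as below. This assumes the usual Bessel-corrected (n − 1) divisor that "sample covariance" normally implies; the PR text does not spell out the divisor. Here (x_i, y_i) are the n aligned pairs and x̄, ȳ are their means.

```latex
\operatorname{cov}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{corr}(x, y)
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The n − 1 factors cancel in the correlation, so Pearson's r does not depend on the choice of divisor.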

Both functions:

  • Align the two series on their keys via inner join (same semantics as ZipInner)
  • Skip keys where either series is missing
  • Return NaN when fewer than 2 aligned pairs exist
  • Stats.corr additionally returns NaN when either series has zero variance (constant series)
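
The four rules above can be sketched in plain F# without Deedle. This is only an illustration of the described semantics, not the code from this PR: `alignInner`, `covPairs`, `covSketch` and `corrSketch` are hypothetical names, a series is modelled as a `Map` from key to `float`, a missing value is modelled as `nan`, and an n − 1 divisor is assumed.

```fsharp
// Hypothetical stand-ins, not the Deedle implementation.
// A "series" is a Map<key, float>; a missing value is represented by nan.

/// Inner join on keys, dropping pairs where either value is missing.
let alignInner (s1: Map<'K, float>) (s2: Map<'K, float>) : (float * float)[] =
    s1
    |> Map.toArray
    |> Array.choose (fun (k, x) ->
        match Map.tryFind k s2 with
        | Some y when not (System.Double.IsNaN x || System.Double.IsNaN y) -> Some (x, y)
        | _ -> None)

/// Sample covariance (assumed n - 1 divisor); nan for fewer than 2 pairs.
let covPairs (pairs: (float * float)[]) : float =
    let n = pairs.Length
    if n < 2 then nan
    else
        let mx = pairs |> Array.averageBy fst
        let my = pairs |> Array.averageBy snd
        let sum = pairs |> Array.sumBy (fun (x, y) -> (x - mx) * (y - my))
        sum / float (n - 1)

let covSketch s1 s2 = covPairs (alignInner s1 s2)

let corrSketch s1 s2 =
    let pairs = alignInner s1 s2
    let vx = covPairs (pairs |> Array.map (fun (x, _) -> (x, x)))  // variance of x
    let vy = covPairs (pairs |> Array.map (fun (_, y) -> (y, y)))  // variance of y
    // vx is nan when there are fewer than 2 pairs; a constant series gives 0.0.
    // The zero test is an exact comparison, which is enough for this sketch.
    if System.Double.IsNaN vx || vx = 0.0 || vy = 0.0 then nan
    else covPairs pairs / sqrt (vx * vy)
```

Under these assumptions, keys that appear in only one input (or carry `nan`) simply drop out before any arithmetic happens, so both functions see the same set of pairs.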

Example

```
open Deedle

let prices = series [ "MSFT" => 300.0; "AAPL" => 180.0; "GOOG" => 140.0 ]
let returns = series [ "MSFT" =>  0.05; "AAPL" =>  0.03; "GOOG" =>  0.04 ]

Stats.cov prices returns    // sample covariance
Stats.corr prices returns   // Pearson r
```
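
As a hand check of the example, the arithmetic can be worked through as follows. This assumes the n − 1 divisor for the sample covariance, with x the prices and y the returns; the PR itself does not list the resulting numbers.

```latex
\bar{x} = \tfrac{300 + 180 + 140}{3} = \tfrac{620}{3} \approx 206.667,
\qquad
\bar{y} = \tfrac{0.05 + 0.03 + 0.04}{3} = 0.04

\sum_i (x_i - \bar{x})(y_i - \bar{y})
  = (93.333)(0.01) + (-26.667)(-0.01) + (-66.667)(0) = 1.2

\operatorname{cov}(x, y) = \frac{1.2}{3 - 1} = 0.6

\sum_i (x_i - \bar{x})^2 = \tfrac{124800}{9} \approx 13866.667,
\qquad
\sum_i (y_i - \bar{y})^2 = 0.0002

\operatorname{corr}(x, y) = \frac{1.2}{\sqrt{13866.667 \times 0.0002}} \approx 0.7206
```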

## Design

The implementation follows the same conventions as all other `Stats` members: `inline`, single-pass via `Array.ofSeq`, and operates on the `Series.ZipInner` result so that missing-value semantics are handled consistently with the rest of the library.
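
The PR text does not include the implementation itself. If "single-pass" is read as accumulating the statistics in one traversal of the aligned pairs, one well-known way to do that is a Welford-style running co-moment, sketched below. `covOnePass` is a hypothetical name, the n − 1 divisor is an assumption, and the merged code may be structured differently.

```fsharp
/// One-pass (Welford-style) sample covariance over already-aligned pairs.
/// Illustrative only; not taken from Stats.fs.
let covOnePass (pairs: seq<float * float>) : float =
    let mutable n = 0
    let mutable meanX = 0.0
    let mutable meanY = 0.0
    let mutable c = 0.0                     // running sum of (x - meanX) * (y - meanY)
    for (x, y) in pairs do
        n <- n + 1
        let dx = x - meanX                  // deviation from the previous mean of x
        meanX <- meanX + dx / float n
        meanY <- meanY + (y - meanY) / float n
        c <- c + dx * (y - meanY)           // uses the updated mean of y
    if n < 2 then nan else c / float (n - 1)
```

Compared with a two-pass "compute means, then sum products" approach, this form never needs the data twice and tends to be better behaved numerically when the values are large relative to their spread.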

## Test Status

The build succeeds and all 515 tests pass (508 existing + 7 new). The new tests cover correct values, NaN edge cases, key alignment, perfect correlation, anti-correlation, and zero-variance handling.

```
Passed! Failed: 0, Passed: 515, Skipped: 0, Total: 515
```

Generated by Repo Assist ·

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@30f2254f2a7a944da1224df45d181a3f8faefd0d

Implements Stats.cov (sample covariance) and Stats.corr (Pearson correlation
coefficient) for pairs of Series, completing the long-standing TODO in Stats.fs.

Both functions:
- Align the two input series by key using an inner join
- Skip keys where either series has a missing value
- Return NaN when fewer than 2 aligned pairs are available
- Follow the same style and XML-doc conventions as existing Stats members

Stats.corr additionally returns NaN when either series has zero variance
(constant series), avoiding division by zero.

Also adds 7 unit tests covering: correct values, NaN edge cases, key alignment,
perfect correlation, anti-correlation, and zero-variance detection.

Co-authored-by: Copilot <[email protected]>
@dsyme dsyme marked this pull request as ready for review March 14, 2026 00:31
@dsyme dsyme merged commit 7e9ff01 into master Mar 14, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/improve-stats-cov-corr-546efa4950b0f241 branch March 14, 2026 00:31