FrankenNumPy

Memory-safe, clean-room NumPy reimplementation in Rust.
Zero unsafe code. 2,426 tests. Bit-exact RNG parity. Every feature family green.

The Problem

NumPy is the backbone of scientific Python, but it carries 30 years of C/C++ legacy: buffer overflows in parsers, undefined behavior in edge cases, opaque stride semantics, and no formal compatibility contracts. Rewriting pieces in Cython or C++ perpetuates the same class of memory bugs.

The Solution

FrankenNumPy rebuilds NumPy's semantics from scratch in safe Rust with two non-negotiable goals:

Absolute behavioral compatibility with legacy NumPy. Not a subset, not "inspired by." The full API, edge cases and all.
Rigorous architecture with formal contracts, dual-mode runtime (strict/hardened), and differential conformance against a NumPy oracle.

Why FrankenNumPy?

	NumPy (C)	FrankenNumPy (Rust)
Memory safety	Buffer overflows possible	`#![forbid(unsafe_code)]` on all 9 crates
RNG parity	Reference implementation	Bit-exact match (40 oracle-verified distributions)
NaN semantics	Implicit C behavior	Explicit propagation verified by 20+ oracle tests
Stride calculus	Evolved over decades	Clean-room deterministic engine (SCE)
Runtime modes	Single mode	Strict (max compat) + Hardened (safety guards)
Conformance	Self-referential	Differential oracle against real NumPy
Test coverage	pytest suite	2,358 Rust tests + 8-gate CI topology

Quick Example

use fnp_ufunc::{UFuncArray, DType, BinaryOp};

// Create arrays
let data = UFuncArray::new(vec![5], vec![1.0, 2.0, 3.0, 4.0, 5.0], DType::F64)?;

// Z-score normalization: (data - mean) / std
let mean = data.reduce_mean(None, false)?;
let std = data.reduce_std(None, false, 0)?;
let z = data.elementwise_binary(&mean.broadcast_to(&[5])?, BinaryOp::Sub)?
            .elementwise_binary(&std.broadcast_to(&[5])?, BinaryOp::Div)?;
// z = [-1.414, -0.707, 0.0, 0.707, 1.414]

// Sort, cumsum, percentile chain
let sorted = data.sort(None, None)?;           // [1, 2, 3, 4, 5]
let cumsum = sorted.cumsum(None)?;             // [1, 3, 6, 10, 15]
let median = data.percentile(50.0, None)?;     // 3.0

// Linear algebra
let a = UFuncArray::new(vec![2, 2], vec![3.0, 1.0, 1.0, 2.0], DType::F64)?;
let b = UFuncArray::new(vec![2], vec![9.0, 8.0], DType::F64)?;
let x = a.solve(&b)?;                         // [2.0, 3.0]

use fnp_random::Generator;

// Bit-exact NumPy-compatible RNG
let mut rng = Generator::from_pcg64_dxsm(12345)?;
let normals = rng.standard_normal(1000);        // Identical to NumPy's output
let samples = rng.binomial(10, 0.5, 100);       // BTPE algorithm, same as NumPy
let perm = rng.permutation(&data)?;             // Fisher-Yates via random_interval

Design Philosophy

Parity debt, not feature cuts. Every behavioral gap with NumPy is tracked as debt to be closed, not as an accepted scope reduction.

Stride Calculus Engine (SCE). All shape transformations (broadcast, reshape, transpose, view aliasing) flow through a single deterministic engine. This is the non-negotiable compatibility kernel.

Dual-mode runtime. Strict mode maximizes observable NumPy compatibility. Hardened mode adds safety guards and bounded defensive recovery. Both modes log decisions to an evidence ledger.

Fail-closed by default. Unknown wire formats, unrecognized metadata, and incompatible features cause explicit errors, not silent corruption.

Oracle-verified. Every RNG distribution, every linalg decomposition, and every reduction edge case is tested against NumPy's actual output from the same seed.

API Surface

Over 1,000 public functions across 9 crates covering the full NumPy API:

Category	Functions (highlights)
Array creation	`zeros`, `ones`, `empty`, `full`, `arange`, `linspace`, `logspace`, `geomspace`, `eye`, `identity`, `diag`, `meshgrid`, `fromfunction`, `array`
Shape manipulation	`reshape`, `ravel`, `flatten`, `transpose`, `swapaxes`, `expand_dims`, `squeeze`, `broadcast_to`, `concatenate`, `stack`, `vstack`, `hstack`, `split`, `tile`, `repeat`, `pad`
Unary math	`abs`, `sqrt`, `exp`, `log`, `sin`, `cos`, `tan`, `arcsin`, `arccos`, `arctan`, `sinh`, `cosh`, `tanh`, `floor`, `ceil`, `round`, `sign`, `isnan`, `isinf`, `isfinite` (42 total)
Binary math	`add`, `subtract`, `multiply`, `divide`, `floor_divide`, `remainder`, `power`, `fmod`, `arctan2`, `gcd`, `lcm`, `bitwise_and/or/xor`, comparisons (34 total)
Reductions	`sum`, `prod`, `min`, `max`, `mean`, `var`, `std`, `argmin`, `argmax`, `cumsum`, `cumprod`, `all`, `any`, `count_nonzero`, `nansum`, `nanmin`, `nanmax`, `ptp` (22 total)
Sorting	`sort`, `argsort`, `searchsorted`, `partition`, `argpartition`, `unique`, `unique_all/counts/inverse/values`
Set operations	`union1d`, `intersect1d`, `setdiff1d`, `setxor1d`, `in1d`, `isin`
Linear algebra	`solve`, `det`, `inv`, `eig`, `svd`, `qr`, `cholesky`, `lstsq`, `norm`, `matrix_rank`, `matrix_power`, `multi_dot`, `pinv`, `cond`, `slogdet`, `funm`
Random	39 statistical distributions plus permutation/state helpers via PCG64DXSM: `normal`, `uniform`, `binomial`, `poisson`, `gamma`, `beta`, `hypergeometric`, `multinomial`, `dirichlet`, `vonmises`, `zipf`, etc.
FFT	`fft`, `ifft`, `fft2`, `ifft2`, `fftn`, `ifftn`, `rfft`, `irfft`, `fftfreq`, `rfftfreq`, `fftshift`
Statistics	`histogram`, `percentile`, `quantile`, `median`, `average`, `corrcoef`, `cov`, `bincount`, `digitize`
Polynomials	Chebyshev, Legendre, Hermite, Laguerre families + power series (33 functions)
String arrays	33 `numpy.char` functions
Financial	`fv`, `pv`, `pmt`, `ppmt`, `ipmt`, `nper`, `rate`, `npv`, `irr`, `mirr`
I/O	`load`, `save`, `savez`, `savez_compressed`, `loadtxt`, `savetxt`, `genfromtxt`, `fromfile`, `tofile`
Masked arrays	`MaskedArray` with reshape, transpose, concatenate, comparison ops, `filled`, `compressed`, `anom`
Datetime	`DatetimeArray`, `TimedeltaArray` with arithmetic, `busday_count`, `busday_offset`, `is_busday`
Misc	`einsum`, `tensordot`, `kron`, `dot`, `matmul`, `outer`, `inner`, `vdot`, `convolve`, `correlate`, `gradient`, `diff`, `interp`, `clip`, `where`, `select`, `piecewise`

Installation

# Clone and build
git clone https://github.com/Dicklesworthstone/franken_numpy.git
cd franken_numpy
cargo build --workspace

# Run all tests
cargo test --workspace

# Run with all features
cargo test --workspace --all-features

Requirements: Rust nightly (pinned in rust-toolchain.toml to nightly-2026-02-20).

Architecture

           ┌──────────────────────────────────────────────┐
           │              User API Layer                  │
           │    UFuncArray · MaskedArray · StringArray    │
           └───────────────────────┬──────────────────────┘
                                  │
      ┌───────────┬───────────────┼──────────────┬───────────┐
      ▼           ▼               ▼              ▼           ▼
 ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
 │ fnp-ufunc│ │fnp-linalg│ │fnp-random│ │  fnp-io  │ │ fnp-fft  │
 │ 1421     │ │ 205      │ │ 191      │ │ 154      │ │ (in-ufunc│
 │ tests    │ │ tests    │ │ tests    │ │ tests    │ │  module) │
 └─────┬────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
       │
 ┌─────┴────────────────────────────────────────────────────┐
 │                                                          │
 ▼                                                          ▼
 ┌──────────┐ ┌────────────┐ ┌──────────┐ ┌──────────────┐
 │ fnp-dtype│ │ fnp-ndarray│ │ fnp-iter │ │ fnp-runtime  │
 │ 117 tests│ │ 55 tests   │ │ 103 tests│ │ 54 tests     │
 │ promote  │ │ SCE core   │ │ transfer │ │ strict/      │
 │ cast     │ │ strides    │ │ semantics│ │ hardened     │
 └──────────┘ └────────────┘ └──────────┘ └──────────────┘
                    ▲
                    │
         Stride Calculus Engine (SCE)
         - shape -> strides (C/F)
         - broadcast legality
         - reshape with -1 inference
         - alias-safe view transitions

The Stride Calculus Engine is the non-regression kernel. Every shape transformation (broadcast, reshape, transpose, view creation) is validated through SCE before execution.

How It Works: Deep Dive

Dtype System (`fnp-dtype`)

The type system is the foundation everything else builds on. 18 dtype variants cover the full NumPy type hierarchy:

Bool  I8  I16  I32  I64  U8  U16  U32  U64
F16  F32  F64  Complex64  Complex128
Str  DateTime64  TimeDelta64  Structured

Each dtype maps to a type-safe ArrayStorage variant with native Rust containers. No flattened Vec<u8> reinterpretation. I64 values live in Vec<i64>, Complex128 values live in Vec<(f64, f64)>, and F16 values use the half crate's f16 type. This means integer fidelity is preserved (no silent truncation through f64 for i64 values > 2^53), f32 identity is maintained, and complex numbers are stored as native interleaved pairs.

Promotion table. The promote(lhs, rhs) function implements NumPy's exact promotion rules as a deterministic const fn:

LHS	RHS	Result	Why
Bool	anything	RHS	Bool is identity for promotion
U8	I8	I16	Smallest signed type covering 0..255 and -128..127
U16	I16	I32	Smallest signed type covering both ranges
U64	any signed	F64	No integer type holds both U64.max and negative values
F16	I8/U8	F16	NumPy preserves float16 for 8-bit ints
F16	I16/U16	F32	16-bit integers exceed float16 mantissa
F32	I32/I64	F64	float32 can't represent all 32/64-bit ints
Complex64	I32	Complex128	Mirrors the F32+I32 → F64 widening rule

The U64+signed→F64 promotion is the most counterintuitive rule in the table, but it's what NumPy does: there simply isn't a 128-bit integer type in NumPy's type system.

Cast policy. can_cast(src, dst, casting) supports all five NumPy casting modes: "no", "equiv", "safe", "same_kind", and "unsafe".

Stride Calculus Engine (`fnp-ndarray`)

The SCE owns all shape transformation rules and is the correctness backbone of the entire project:

Shape → strides. Given a shape [d0, d1, ..., dn] and memory order (C or F), compute contiguous strides. C-order: rightmost dimension has stride = item_size, each dimension to the left multiplies by the next dimension's size. F-order: leftmost dimension has stride = item_size.
Broadcast legality. Two shapes are broadcast-compatible when, aligned from the right, each pair of dimensions is either equal or one is 1. The output shape takes the maximum at each position. FrankenNumPy handles mixed-rank broadcasting (different number of dimensions) by left-padding the shorter shape with 1s.
Reshape with -1 inference. At most one dimension can be -1, meaning "infer from the total element count and the other dimensions." The engine divides the total element count by the product of known dimensions and validates that the result is exact (no remainder).
View safety. as_strided() and sliding_window_view() compute the required byte span from the minimum to maximum reachable offset, then verify it fits within the base allocation. Negative strides (for reverse slicing) are fully supported.
Broadcast strides. When a dimension has size 1 but the broadcast output has size > 1, the stride for that dimension is set to 0. This creates a "virtual repeat" without copying data.

Dual-Mode Runtime (`fnp-runtime`)

The runtime implements a Bayesian decision engine that chooses between three actions for each compatibility check:

Action	When Used
Allow	KnownCompatible input in Strict mode, or low-risk KnownCompatible in Hardened mode
FullValidate	Hardened mode with elevated risk, or malformed metadata (NaN/out-of-range inputs)
FailClosed	Unknown semantics or KnownIncompatible in any mode

Bayesian risk scoring. Each decision starts with class-specific prior odds (1% incompatibility for KnownCompatible, 99% for KnownIncompatible, 50% for Unknown). A risk score is computed as the log-likelihood ratio of the evidence against a threshold. The posterior incompatibility probability determines the action via a loss model.

The loss model encodes the asymmetric costs:

allow_if_compatible:       0.0    (correct accept: no cost)
allow_if_incompatible:   100.0    (silent corruption: catastrophic)
full_validate_if_compatible: 4.0  (unnecessary work: small cost)
full_validate_if_incompatible: 2.0 (caught by validation: acceptable)
fail_closed_if_compatible: 125.0  (false rejection: high cost)
fail_closed_if_incompatible: 1.0  (correct rejection: minimal cost)

Every decision is logged to an EvidenceLedger with timestamp, mode, class, evidence terms, and action taken. Override requests are tracked separately in OverrideAuditEvent records with audit references.

Ufunc Dispatch (`fnp-ufunc`)

The ufunc engine is the largest crate (~30,000 lines) and implements the array operation layer:

35 binary operations — Add, Sub, Mul, Div, Power, FloorDivide, Remainder, Fmod, Minimum, Maximum, Fmax, Fmin, Copysign, Heaviside, Nextafter, Arctan2, Hypot, Logaddexp, Logaddexp2, Ldexp, FloatPower, BitwiseAnd/Or/Xor, LeftShift, RightShift, LogicalAnd/Or/Xor, and 6 comparison ops.

42 unary operations — Abs, Negative, Sqrt, Exp, Log, all trig and hyperbolic functions, Floor, Ceil, Round, Cbrt, Expm1, Log1p, Spacing, Signbit, Isnan, Isinf, Isfinite, Invert, and more.

Every binary operation handles broadcasting through the same code path: compute the output shape via SCE's broadcast_shape, map output indices to source indices using broadcast strides (0-stride for size-1 dimensions), and apply the operation elementwise.

NaN semantics are explicit. After fixing 18 NaN-handling bugs found through systematic oracle testing:

reduce_min, reduce_max: propagate NaN (use custom nan_min/nan_max, not f64::min/f64::max)
cummin, cummax: NaN propagates through the accumulator
sort, argsort: NaN sorts to end via nan_last_cmp (not partial_cmp().unwrap_or(Equal))
median, percentile, quantile: NaN early-returns before sorting
ptp: NaN propagates in both UFuncArray and MaskedArray variants

Float error handling. A thread-local FloatErrorState tracks divide-by-zero, overflow, underflow, and invalid operation events. Each mode (Ignore, Warn, Raise, Call, Print, Log) can be configured independently. seterr(), geterr(), and errstate() match NumPy's API.

Generalized ufuncs. GufuncSignature parses signatures like "(n?,k),(k,m?)->(n?,m?)" for operations that act on sub-arrays (e.g., matrix multiply). The parser normalizes signatures, extracts core dimensions, and validates operand shapes against the pattern.

einsum. Three implementations: basic einsum() for direct evaluation, einsum_path() for contraction path optimization, and einsum_optimized() for greedy or brute-force strategy selection.

Random Number Generation (`fnp-random`)

The RNG crate achieves bit-exact parity with NumPy by porting every algorithm from NumPy's C source code.

Bit generators. Three implementations:

Pcg64DxsmRng — the default. State is 128-bit with 128-bit increment. Uses the DXSM output function and cheap multiplier for generation. Seeding goes through SeedSequence which applies Melissa O'Neill's seed_seq design with SplitMix64 mixing.
Mt19937Rng — Mersenne Twister for legacy RandomState compatibility. 624-word state array with Matsumoto-Nishimura twist.
DeterministicRng — simple SplitMix64-based generator for testing.

SeedSequence. Hierarchical seeding with spawn/generate_state contracts. spawn(n) creates child SeedSequences by incorporating a monotonic spawn counter into the entropy pool. generate_state(words) cycles through the pool with hash transformations to produce initialization vectors. This exactly matches NumPy's numpy.random.SeedSequence.

Bounded integers. NumPy's random_bounded_uint64() dispatch:

Range fits in 32 bits → Lemire's method via buffered next_uint32() (each u64 is split into two u32s, low first)
Range exceeds 32 bits → Lemire's method with 128-bit multiplication
Range = 0 → return 0; Range = u32::MAX → raw next_uint32(); Range = u64::MAX → raw next_u64()

Shuffle/permutation. Uses random_interval() (masked bit rejection, not Lemire), matching NumPy's _shuffle_raw code path. For each index from n-1 down to 1, generate a uniform random integer in [0, i] by masking to the smallest bit-width covering i and rejecting values > i.

Linear Algebra (`fnp-linalg`)

93 public functions organized into three tiers:

2x2 fast paths. solve_2x2, det_2x2, inv_2x2, qr_2x2, svd_2x2, eigh_2x2, cholesky_2x2 avoid the overhead of general NxN algorithms for the smallest matrix size.

NxN general algorithms:

QR decomposition — Householder reflections with qr_nxn (reduced) and qr_mxn (rectangular)
SVD — Golub-Kahan bidiagonalization + implicit shifted QR
Eigenvalues — Hessenberg reduction + implicit shifted QR iteration for real Schur form
Symmetric eigenvalues — Tridiagonal reduction + implicit QL shifts (eigvalsh_nxn, eigh_nxn)
LU factorization — Partial pivoting with lu_factor_nxn and lu_solve
Cholesky — Column-wise lower-triangular factorization
Least squares — Normal equations via A^T A for lstsq_nxn

Spectral methods: expm_nxn (matrix exponential via Pade approximation), sqrtm_nxn (matrix square root), logm_nxn (matrix logarithm), funm_nxn (general matrix function via Schur decomposition), polar_nxn (polar decomposition U*P), schur_nxn (Schur triangular form).

Batch operations. 14 batch functions (batch_inv, batch_det, batch_solve, batch_svd, batch_qr, batch_cholesky, batch_eig, batch_eigvalsh, batch_eigh, batch_slogdet, batch_svd_full, batch_matrix_norm, batch_matrix_rank, batch_trace) operate on stacked matrices with leading batch dimensions, matching NumPy's broadcasting semantics for linear algebra.

Complex number support. 15+ functions for complex-valued matrices: complex_solve_nxn, complex_det_nxn, complex_inv_nxn, complex_cholesky_nxn, complex_qr_mxn, complex_matmul, complex_matvec, complex_conjugate_transpose, complex_matrix_norm_frobenius, complex_trace_nxn.

I/O Format Handling (`fnp-io`)

NPY format. Complete implementation of NumPy's .npy binary format:

Magic prefix \x93NUMPY + version (1.0 or 2.0) + header length + Python dict header + padding to 16-byte alignment + raw data payload
24 supported dtype descriptors including little/big-endian variants for all integer and float types, complex types, fixed-width byte strings, Unicode strings, and object type
Header parsing validates all three required fields (descr, fortran_order, shape) and rejects unknown keys
Hardened with MAX_HEADER_BYTES = 65,536 to prevent allocation bombs

NPZ format. ZIP-based container for multiple arrays:

savez (stored) and savez_compressed (DEFLATE) via the flate2 crate
MAX_ARCHIVE_MEMBERS = 4,096 and MAX_ARCHIVE_UNCOMPRESSED_BYTES = 2 GB limits prevent archive bombs
Each member is a complete .npy file within the ZIP

Text I/O. loadtxt, savetxt, and genfromtxt handle delimiter-separated text files with configurable dtypes, missing value handling, and column selection (usecols).

Pickle policy. Object dtype arrays require explicit opt-in via allow_pickle=true, matching NumPy's security posture against arbitrary code execution from untrusted .npy files.

Conformance Infrastructure (`fnp-conformance`)

The conformance crate is the quality backbone with four layers:

Differential harness. Captures NumPy's output for a corpus of input fixtures, then runs the same inputs through FrankenNumPy and compares shapes, dtypes, and values (with configurable tolerance). Covers ufunc, linalg, FFT, polynomial, string, masked array, datetime, RNG, and I/O operations.
Metamorphic testing. Verifies algebraic identities that must hold regardless of input: a + b = b + a, a * 1 = a, sum(a) = sum(sort(a)), etc. 13 identities tested.
Adversarial fuzzing. Tests behavior on hostile inputs: NaN-filled arrays, extreme shapes (0-d, empty, very large), denormalized floats, integer overflow, and malformed NPY headers.
Witness stability. Hardcoded expected values for every RNG distribution ensure that code changes don't silently alter output sequences. When an algorithm is intentionally changed (e.g., porting Lemire's method), the witness values are regenerated from the new implementation.

Security Model

FrankenNumPy's security posture covers more than memory safety:

Zero unsafe Rust. All 9 crates declare #![forbid(unsafe_code)]. The 92,000+ lines of Rust contain zero unsafe blocks.
Fail-closed by default. Unknown wire formats, unrecognized dtype descriptors, and metadata schema violations cause explicit errors, not silent fallbacks.
Bounded resource consumption. NPY header parsing caps at 64 KB. NPZ archives cap at 4,096 members and 2 GB uncompressed. Memmap validation retries cap at 64. These prevent denial-of-service via crafted inputs.
Pickle rejection. Object dtype arrays that could execute arbitrary code during deserialization require explicit opt-in, matching NumPy's allow_pickle security gate.
Adversarial conformance. The security gate (run_security_gate) tests exploit scenarios from a versioned threat matrix mapped to specific parser/IO/shape-validation boundaries.

Algorithm Catalog

Each subsystem uses specific numerical algorithms. This section catalogs them.

FFT

Hybrid approach supporting arbitrary input lengths:

Power-of-two lengths: Cooley-Tukey decimation-in-time (DIT). Recursively splits into even/odd subsequences, applies butterfly operations with twiddle factors exp(-2*pi*i*k/N).
Non-power-of-two lengths: Bluestein's chirp-Z transform. Rewrites the DFT as a convolution, zero-pads to the next power of two, uses Cooley-Tukey for the convolution, then extracts the result.

All transforms (fft, ifft, fft2, ifft2, fftn, ifftn, rfft, irfft) build on these two primitives. Multi-dimensional transforms apply 1-D FFTs sequentially along each axis. fftfreq and rfftfreq compute the frequency bin centers; fftshift/ifftshift rearrange zero-frequency to the center.

Signal Processing

convolve(a, v): Direct O(n*m) convolution, full mode (output length n+m-1). Flips the kernel and slides it across the input.
correlate(a, v): Cross-correlation implemented as convolve(a, v[::-1]).
convolve2d / correlate2d: Full 2-D convolution with output shape (h1+h2-1, w1+w2-1).

Numerical Differentiation and Interpolation

gradient(f, *varargs) computes numerical derivatives with configurable edge handling:

Position	edge_order=1	edge_order=2
Interior	`(f[k+1] - f[k-1]) / 2h`	same (central difference)
First element	`(f[1] - f[0]) / h`	`(-3f[0] + 4f[1] - f[2]) / 2h`
Last element	`(f[n-1] - f[n-2]) / h`	`(3f[n-1] - 4f[n-2] + f[n-3]) / 2h`

Non-uniform spacing uses Lagrange polynomial coefficients (3-point at edges, 2-point for edge_order=1).

diff(a, n) computes n-th discrete difference: diff[i] = a[i+1] - a[i], applied n times. Output length decreases by n.

interp(x, xp, fp) does 1-D piecewise linear interpolation: binary search to find the enclosing interval in xp, then fp[lo] * (1-t) + fp[hi] * t where t = (x - xp[lo]) / (xp[hi] - xp[lo]). Values outside xp are clamped to the boundary values.

Windowing Functions

Five window types for spectral analysis, all returning [1.0] for M <= 1:

Window	Formula
Hamming	`0.54 - 0.46 * cos(2pii / (M-1))`
Hanning	`0.5 - 0.5 * cos(2pii / (M-1))`
Blackman	`0.42 - 0.5 * cos(2pii / (M-1)) + 0.08 * cos(4pii / (M-1))`
Bartlett	`1 -
Kaiser	`I0(beta * sqrt(1 - r^2)) / I0(beta)` (modified Bessel I0 via piecewise rational approx)

Histogram and Binning

histogram(a, bins) supports three automatic bin-count strategies:

Strategy	Formula
`"sturges"`	`ceil(log2(n)) + 1`
`"sqrt"`	`ceil(sqrt(n))`
`"auto"`	`max(sturges, sqrt)`

Bin edges are uniformly spaced: min + i * (max - min) / bins for i in 0..=bins. Element-to-bin assignment uses O(log bins) binary search per element. histogram_bin_edges returns just the edges without counting. histogramdd generalizes to N dimensions.

Padding

11 pad modes matching NumPy's np.pad:

Mode	Behavior
`constant`	Fill with a specified value (default 0)
`edge`	Replicate the edge value
`wrap`	Modular wrapping via `rem_euclid`
`reflect`	Mirror at edge, not duplicating the edge element
`symmetric`	Mirror at edge, duplicating the edge element
`linear_ramp`	Linear interpolation from edge to a ramp endpoint
`maximum`	Fill with the maximum of a window of the array edge
`minimum`	Fill with the minimum of a window
`mean`	Fill with the mean of a window
`median`	Fill with the median of a window
`empty`	Leave padded values uninitialized (filled with 0.0)

Financial Mathematics

Ten time-value-of-money functions using closed-form annuity algebra where possible:

Function	Method
`fv`, `pv`, `pmt`	Closed-form annuity factor `(1+rate)^nper`
`nper`	Logarithmic inversion of annuity formula
`npv`	Discounted cashflow sum `sum(cf[i] / (1+rate)^i)`
`irr`	Newton's method (max 100 iterations, tol 1e-12, initial guess 0.1)
`mirr`	Separates positive/negative cashflows, applies reinvestment/finance rates
`rate`	Newton's method on the annuity equation
`ipmt`, `ppmt`	Interest and principal portions derived from `fv` and `pmt`

Polynomial Systems

Five complete polynomial families, each with evaluation, arithmetic, calculus, and fitting:

Family	Basis	Key Operations
Power series	`x^n`	`polyval`, `polyder`, `polyint`, `polyfit`, `polymul`, `polyadd`, `polysub`, `polydiv`, `polyroots`
Chebyshev	`T_n(x)`	`chebval`, `chebadd`, `chebsub`, `chebmul`, `chebdiv`, `chebder`, `chebint`, `chebroots`, `chebfromroots`, `chebfit`, `cheb2poly`, `poly2cheb`
Legendre	`P_n(x)`	`legval`, `legder`, `legint`, `legfit`
Hermite (physicist)	`H_n(x)`	`hermval`, `hermder`, `hermint`
Hermite (probabilist)	`He_n(x)`	`hermeval`
Laguerre	`L_n(x)`	`lagval`, `lagder`, `lagint`

Chebyshev has the fullest support because its numerical conditioning makes it the recommended basis for most practical problems.

Masked Arrays

MaskedArray wraps a UFuncArray with an optional boolean mask and fill value:

MaskedArray { data: UFuncArray, mask: Option<UFuncArray>, fill_value: f64, hard_mask: bool }

Mask convention: 1.0 = masked (excluded from computation), 0.0 = valid. This matches NumPy's numpy.ma module. Operations that reduce masked arrays (sum, mean, min, max, var, std, median, ptp, argmin, argmax, cumsum, cumprod, count) skip masked values. Comparison and arithmetic operations propagate masks through mask_or. compressed() returns a 1-D array of only the unmasked values. filled(fill_value) replaces masked values with a fill value.

Datetime and Timedelta

DatetimeArray and TimedeltaArray support temporal arithmetic:

Datetime - Datetime = Timedelta
Datetime + Timedelta = Datetime
Timedelta + Timedelta = Timedelta
Scalar multiplication of Timedelta

Business day functions: busday_count(start, end) counts business days between dates, busday_offset(date, offset) adds business days to a date, is_busday(date) checks if a date is a business day. Weekday mask is Monday-Friday.

String Operations

33 numpy.char functions operating elementwise on string arrays: upper, lower, capitalize, title, center, ljust, rjust, zfill, strip, lstrip, rstrip, replace, find, rfind, count, startswith, endswith, isnumeric, isalpha, isdigit, isdecimal, str_len, encode, decode, translate, maketrans, partition, rpartition, split, rsplit, join, expandtabs, swapcase. String add concatenates elementwise; string multiply repeats.

Bit Packing

packbits(axis): Packs 8 boolean elements into 1 byte, MSB first. Output length = ceil(axis_len / 8).
unpackbits(axis, count): Unpacks bytes into boolean bits, MSB to LSB. Output length = axis_len * 8 (or count if specified).

Einsum Optimization

The einsum implementation supports three contraction strategies:

Strategy	Method	Best For
`"greedy"`	At each step, contract the pair with smallest intermediate result	Default, fast for any operand count
`"optimal"`	Dynamic programming over all contraction orderings	Optimal for <= 10 operands
`"auto"`	Selects greedy	Convenience alias

einsum_path returns the contraction order without executing, for inspection. einsum_optimized applies the selected strategy.

Artifact Durability (RaptorQ)

Every conformance artifact (fixture bundles, benchmark baselines, migration manifests) is protected by erasure-coding sidecars:

Encoding. Source data is hashed (SHA-256) and encoded into source + repair symbols using RaptorQ fountain codes. The sidecar records the codec parameters (symbol size, block count, repair overhead) alongside the encoded symbols in base64.

Scrubbing. A scrub report decodes all symbols, computes the SHA-256 of the decoded payload, and verifies it matches the source hash. It then drops one symbol and verifies that recovery from the remaining symbols still produces the correct hash.

Decode proof. An explicit artifact recording which symbol was dropped, how many repair symbols were needed to recover, and whether recovery succeeded. This provides machine-checkable evidence that the artifact can survive single-symbol loss.

The G8 CI gate (run_raptorq_gate.sh) enforces that all required bundles have valid sidecars, scrub reports with status: "ok", and decode proofs with recovery_success: true.

Threat Model

12 threat classes are formally mapped in security_control_checks_v1.yaml, each with assigned conformance suites, compatibility gates, and override audit policies:

Threat Class	What Could Go Wrong	Control
`malformed_shape`	Crafted dimensions cause OOB access or allocation bomb	Shape/stride suite + runtime policy adversarial suite
`unsafe_cast_path`	Silent data corruption through widening/narrowing cast	Dtype promotion suite with drift gate
`malicious_stride_alias`	Overlapping views cause data races or corruption	Shape/stride suite with alias drift gate
`malformed_npy_npz`	Malicious `.npy`/`.npz` file exploits parser bugs	IO adversarial suite + parser fail-closed gate
`unknown_metadata_version`	Future format version silently misinterpreted	Runtime policy suite + compatibility drift hash
`adversarial_fixture`	Hostile test inputs cause crash or panic	Adversarial suites across IO/RNG/linalg with reproducibility gate
`rng_reproducibility_drift`	Code change silently alters RNG output sequences	RNG differential + metamorphic + adversarial suites
`linalg_shape_tolerance_abuse`	Ill-conditioned matrix causes wrong result	Linalg differential + metamorphic suites
`linalg_backend_bridge_tampering`	Backend produces wrong result for well-conditioned input	Linalg adversarial + crash signature suites
`corrupt_durable_artifact`	Bit-rot or tampering in stored conformance artifacts	RaptorQ decode proof hash gate
`policy_override_abuse`	Unauthorized bypass of compatibility gates	Runtime policy adversarial + explicit audited override

Every threat log entry must include: fixture_id, seed, mode, env_fingerprint, artifact_refs, reason_code.

Shared Memory and Views

UFuncArrayView provides NumPy-style shared-memory views with overlap detection:

UFuncArrayView {
    shape: Vec<usize>,
    buffer: Arc<RwLock<Vec<f64>>>,  // shared backing store
    offset: isize,                   // byte offset into buffer
    strides: Vec<isize>,            // per-dimension strides (can be negative)
    writable: bool,
    dtype: DType,
}

Memory overlap detection uses a two-tier approach:

Fast path (may_share_memory): Checks whether the byte-offset spans of two views overlap. This is O(ndim) and conservative (may report false positives for non-contiguous views).
Exact path (shares_memory): First checks Arc pointer equality (different backing buffers never share). If same buffer, computes the actual set of accessed byte offsets (up to 200,000) and checks for intersection. Falls back to the fast path if offset collection exceeds the limit.

This supports safe in-place operations: if two views share memory, operations that read from one and write to the other must use temporary copies to avoid data corruption.

Float Error State Machine

FrankenNumPy replicates NumPy's floating-point error handling system:

┌───────────────┐   seterr(divide='raise')   ┌──────────────┐
│ Default       │ ────────────────────────── │ Custom       │
│ divide=Warn   │                            │ divide=Raise │
│ over=Warn     │   errstate(all='ignore')   │ over=Warn    │
│ under=Ignore  │ ────────────────────────── │ under=Ignore │
│ invalid=Warn  │   (RAII guard restores)    │ invalid=Warn │
└───────────────┘                            └──────────────┘

Six error modes per category: Ignore (suppress), Warn (log and continue), Raise (return error), Call (invoke user callback), Print (stderr), Log (append to event buffer).

Four error categories: divide-by-zero, overflow, underflow, invalid operation.

errstate() returns an RAII guard that automatically restores the previous error configuration when dropped, matching NumPy's context manager semantics:

let _guard = errstate(Some(FloatErrorMode::Ignore), None, None, None, None);
// all float errors ignored in this scope
// previous state restored when _guard drops

Scimath: Complex-Domain Extensions

The scimath module provides 8 functions that extend real-valued math to the complex domain for inputs outside the real function's natural domain:

Function	Real Domain	Extension
`scimath_sqrt(x)`	x >= 0	Returns complex sqrt for x < 0
`scimath_log(x)`	x > 0	Returns complex log for x <= 0
`scimath_log2(x)`	x > 0	Complex base-2 logarithm
`scimath_log10(x)`	x > 0	Complex base-10 logarithm
`scimath_power(x, p)`	x >= 0 (for non-integer p)	Complex result for negative base
`scimath_arccos(x)`	-1 <= x <= 1	Complex arccosine for \|x\| > 1
`scimath_arcsin(x)`	-1 <= x <= 1	Complex arcsine for \|x\| > 1
`scimath_arctanh(x)`	-1 < x < 1	Complex arctanh for \|x\| >= 1

These match numpy.lib.scimath and are useful in signal processing and physics where negative square roots or out-of-range inverse trig values arise naturally.

Complete Distribution List

All 49 random-generation methods available on Generator, grouped by family:

Continuous (28): beta, chisquare, exponential, f/f_distribution, gamma, gumbel, halfnormal, laplace, levy, logistic, lognormal, lomax, maxwell, noncentral_chisquare, noncentral_f, normal, pareto, power, rayleigh, standard_cauchy, standard_exponential, standard_gamma, standard_normal, standard_t, triangular, vonmises, wald, weibull

Discrete (7): binomial, geometric, hypergeometric, logseries, negative_binomial, poisson, zipf

Multivariate (3): dirichlet, multinomial, multivariate_normal

Uniform (3): random (float [0,1)), uniform (float [low,high)), integers (int [low,high))

Permutation (3): shuffle (in-place), permutation (copy), permuted (axis-aware)

Utility (2): bytes, choice/choice_weighted

State (3): spawn, jumped, state/set_state

Array Manipulation Toolkit

FrankenNumPy implements the full set of NumPy's array construction and manipulation functions.

Construction

Function	What it does
`zeros(shape)`	Array filled with 0.0
`ones(shape)`	Array filled with 1.0
`full(shape, val)`	Array filled with arbitrary value
`empty(shape)`	Uninitialized array (filled with 0.0 in practice)
`eye(n, m, k)`	Identity-like matrix with diagonal offset `k`
`identity(n)`	Square identity matrix (delegates to `eye`)
`diag(v, k)`	1-D input: construct diagonal matrix. 2-D input: extract diagonal.
`arange(start, stop, step)`	Evenly spaced values within interval
`linspace(start, stop, num)`	`num` evenly spaced values including endpoints
`logspace(start, stop, num)`	Values spaced evenly on log scale
`geomspace(start, stop, num)`	Values spaced evenly on geometric scale
`meshgrid(x, y, ...)`	Coordinate matrices from coordinate vectors (xy indexing)
`fromfunction(shape, f)`	Apply closure `f(&[usize]) -> f64` to each multi-index

Joining and Splitting

Function	Axis behavior
`concatenate(arrays, axis)`	Join along existing axis. All arrays must match on non-concat dims.
`stack(arrays, axis)`	Join along new axis. All arrays must have identical shape.
`vstack` / `row_stack`	Stack vertically (along axis 0). Promotes 1-D to (1, N).
`hstack`	Stack horizontally. For 1-D: axis 0. For N-D: axis 1.
`dstack`	Stack along axis 2. Promotes to at least 3-D first.
`column_stack`	Stack 1-D arrays as columns of a 2-D array.
`block(grid)`	Assemble from nested grid. Concatenates within rows, then stacks.
`split(ary, n, axis)`	Split into `n` equal sub-arrays along axis
`array_split(ary, n, axis)`	Split allowing unequal sub-arrays

Rearranging

Function	What it does
`transpose(axes)`	Permute dimensions
`moveaxis(src, dst)`	Move one axis to a new position
`rollaxis(axis, start)`	Roll axis backward until before `start`
`swapaxes(a1, a2)`	Swap two axes via permutation
`expand_dims(axis)`	Insert size-1 dimension
`squeeze(axis)`	Remove size-1 dimensions
`flip(axis)`	Reverse elements along axis
`fliplr` / `flipud`	Left-right / up-down reversal (requires ndim >= 2 / >= 1)
`rot90(k)`	Rotate by k*90 degrees on first two axes
`roll(shift, axis)`	Circular shift with wrapping
`tile(reps)`	Repeat array along each axis per `reps`
`repeat(n, axis)`	Repeat each element `n` times
`resize(new_shape)`	Resize with cyclic repetition if new shape is larger

Advanced Indexing

Function	What it does
`take(indices, axis)`	Select elements by integer indices (supports negative)
`put(indices, values)`	Replace flat-indexed elements (cyclic values)
`compress(condition, axis)`	Select elements where boolean condition is true
`extract(condition, arr)`	Flat extraction by boolean mask
`place(mask, vals)`	In-place replacement where mask is true (cyclic values)
`select(condlist, choicelist, default)`	Choose from multiple arrays by first-matching condition
`piecewise(condlist, funclist)`	Piecewise constant function via condition list
`take_along_axis(indices, axis)`	Gather values along axis by index array
`put_along_axis(indices, values, axis)`	Scatter values along axis by index array
`ravel_multi_index(coords, shape)`	Convert N-D coordinates to flat indices (C-order)
`unravel_index(indices, shape)`	Convert flat indices to N-D coordinates

Transfer Semantics (`fnp-iter`)

The iterator crate models NumPy's internal data transfer system, which decides how to move data between arrays during operations:

Transfer classes determine the copy strategy:

Class	When selected	Cost
`Contiguous`	Both src/dst have unit strides and matching alignment	Fastest (memcpy-like)
`Strided`	Arbitrary strides but lossless cast	Medium (per-element stride arithmetic)
`StridedCast`	Arbitrary strides with lossy cast	Slowest (per-element cast + stride)

Overlap detection decides copy direction:

Action	When	Why
`NoCopy`	Source and destination don't overlap	No precaution needed
`ForwardCopy`	Overlap, but forward iteration is safe	Dst starts after src start
`BackwardCopy`	Overlap, forward would corrupt	Must iterate in reverse

FlatIter indexing supports four modes: Single(i) for scalar access, Slice{start, stop, step} for regular ranges, Fancy(Vec<usize>) for arbitrary index arrays, and BoolMask(Vec<bool>) for boolean selection. The count_true_mask optimization processes boolean masks in 8-element chunks for vectorizable counting.

Phase2C Extraction Packets

The conformance system is organized around 9 extraction packets, each covering one domain of NumPy behavior:

Packet	Domain	Key Contracts
FNP-P2C-001	Shape/reshape	Element-count conservation, single -1 dimension, broadcast compatibility
FNP-P2C-002	Dtype/promotion	Promotion matrix determinism, safe-cast policy, dtype lifecycle
FNP-P2C-003	Strided transfer	Transfer-loop selection, cast pipeline, overlap handling, where-mask assignment
FNP-P2C-004	NDIter traversal	Iterator construction, multi-index seek, C/F tracking, external-loop mode
FNP-P2C-005	Ufunc dispatch	Signature parsing, method selection, override precedence, gufunc reduction
FNP-P2C-006	Stride tricks/broadcast	as_strided views, zero-stride propagation, writeability contracts
FNP-P2C-007	RNG contracts	Seed normalization, child-stream derivation, deterministic state, jump-ahead
FNP-P2C-008	Linalg bridge	Solver contracts, factorization modes, spectral operations, backend dispatch
FNP-P2C-009	NPY/NPZ IO	Magic/version validation, header-length bounds, pickle policy, truncated-data detection

Each packet produces 8 artifact files: legacy_anchor_map.md, contract_table.md, fixture_manifest.json, parity_gate.yaml, risk_note.md, parity_report.json, parity_report.raptorq.json, parity_report.decode_proof.json. The packet readiness validator checks all 8 files exist and contain required fields before a packet is marked ready.

RNG State Serialization

The random number generator supports full state capture and restoration for reproducibility:

// Capture state
let payload = generator.to_pickle_payload();

// Restore state
let restored = Generator::from_pickle_payload(payload)?;
// restored produces identical sequence from this point

GeneratorPicklePayload captures the bit-generator state (seed, counter, algorithm tag, schema version) and optionally the SeedSequence snapshot for spawn lineage tracking. The RandomState wrapper provides legacy compatibility with NumPy's older numpy.random.RandomState API.

Seed material accepts multiple forms: None (random), U64(seed), U32Words(vec), SeedSequence, or direct State { seed, counter } for exact state restoration. This matches NumPy's flexible seeding interface where default_rng(12345), default_rng([1, 2, 3]), and default_rng(SeedSequence(42)) all work.

Test Coverage

Crate	Tests	What it covers
`fnp-ufunc`	1,421	Array ops, math, sorting, polynomials, reductions, oracle tests, linalg bridge, FFT, masked cov/corrcoef, datetime parsing, gufunc validation
`fnp-linalg`	205	Decompositions, solvers, norms, 16 NumPy oracle tests, extreme-scale regression
`fnp-random`	191	40 oracle-verified distributions, seeding, reproducibility, large-n binomial/multinomial
`fnp-io`	154	NPY/NPZ read/write, text formats, compression, 7 format oracle tests
`fnp-dtype`	117	Dtype taxonomy, promotion table, cast policy, NumPy byte-width parsing
`fnp-conformance`	117	Differential parity, metamorphic identities, adversarial fuzzing, matmul conformance
`fnp-ndarray`	55	Shape legality, stride calculus, broadcast contracts, multi-axis negative strides
`fnp-iter`	103	Transfer semantics, NDIter traversal/broadcast/overlap, flatiter contracts, ndindex/ndenumerate
`fnp-runtime`	54	Mode split, fail-closed decoding, override-audit gate, Bayesian decision engine, evidence ledger
Total	2,426	All passing in workspace

Oracle Test Strategy

83 oracle tests verify bit-exact parity against NumPy:

40 RNG oracle tests: every distribution produces identical output from the same seed, verified against numpy.random.Generator(PCG64DXSM(12345))
20 ufunc edge-case tests: NaN propagation, empty arrays, Inf arithmetic, boolean dtype promotion, sort ordering
16 linalg oracle tests: det, inv, solve, eig, svd, cholesky, QR, norm, cond, slogdet, lstsq, rank
7 I/O format tests: NPY roundtrip, magic bytes, header dict format, 16-byte alignment

RNG Algorithm Parity

Every algorithm was ported from NumPy's C source code:

Algorithm	NumPy source	Our implementation
Bounded integers	`random_bounded_uint64()`	Lemire's method with 32/64-bit dispatch + buffered `next_uint32`
Binomial	`random_binomial_btpe()` + `_inversion()`	BTPE for large n*p, inversion for small
Hypergeometric	`hypergeometric_hrua()` + `_sample()`	HRUA (Stadlober 1989) + direct via `random_interval`
Poisson	`random_poisson_ptrs()` + `_mult()`	PTRS (Hormann 1993) for lam >= 10, multiplicative for small
Gamma	`random_standard_gamma()`	Marsaglia-Tsang + rejection for shape < 1 + exponential for shape = 1
Zipf	`random_zipf()`	Exact rejection with Umin clamping
Shuffle	`_shuffle_raw()`	Fisher-Yates via `random_interval()` masked rejection

Parity Status

Every feature family is parity_green:

Feature Family	Status
Shape/stride/view semantics	parity_green
Broadcasting legality	parity_green
Dtype promotion/casting	parity_green
Core math (ufunc)	parity_green
Reductions (NaN-propagating)	parity_green
Sorting (NaN-last)	parity_green
Set operations	parity_green
Indexing	parity_green
Polynomials (5 families)	parity_green
Statistics	parity_green
FFT	parity_green
String arrays	parity_green
Masked arrays	parity_green
Datetime/timedelta	parity_green
Linear algebra	parity_green
Random (39 statistical distributions)	parity_green
I/O (npy/npz)	parity_green
Financial	parity_green
Scimath	parity_green

See FEATURE_PARITY.md for the full matrix with evidence links.

CI Gate Topology

Eight ordered gates run from fast to heavy:

G1  fmt + lint           cargo fmt --check && cargo clippy -- -D warnings
G2  unit/property        cargo test --workspace --lib
G3  oracle differential  capture_numpy_oracle → run_ufunc_differential
G4  adversarial/security run_security_policy_gate.sh
G5  test/logging contract run_test_contract_gate.sh
G6  workflow e2e         run_workflow_scenario_gate.sh
G7  performance budget   run_performance_budget_gate.sh
G8  durability/decode    run_raptorq_gate.sh

Run all gates:

scripts/e2e/run_ci_gate_topology.sh

For manual direct rch workspace checks in this repo, prefer disabling Cargo incremental artifacts to avoid noisy remote artifact-retrieval races under target/debug/incremental:

CARGO_INCREMENTAL=0 rch exec -- cargo check --workspace --all-targets
CARGO_INCREMENTAL=0 rch exec -- cargo clippy --workspace --all-targets -- -D warnings

Conformance Pipeline

The oracle capture pipeline runs real NumPy, captures its output, and compares against our implementation:

# Recommended: require a real NumPy oracle and let the runner bootstrap
# a repo-local .venv-numpy314 with uv if needed.
FNP_REQUIRE_REAL_NUMPY_ORACLE=1 \
  cargo run -p fnp-conformance --bin capture_numpy_oracle

# Run differential comparison
cargo run -p fnp-conformance --bin run_ufunc_differential

If you want to manage the interpreter explicitly, this still works:

uv venv --python 3.14 .venv-numpy314
uv pip install --python .venv-numpy314/bin/python numpy
FNP_ORACLE_PYTHON="$(pwd)/.venv-numpy314/bin/python3" \
  cargo run -p fnp-conformance --bin capture_numpy_oracle

Environment Variable	Effect
`FNP_ORACLE_PYTHON`	Path to Python interpreter with NumPy. Explicit non-default paths win over repo-local bootstrap.
`FNP_REQUIRE_REAL_NUMPY_ORACLE=1`	Require a real NumPy oracle. If `FNP_ORACLE_PYTHON` is unset (or left at `python3`), `capture_numpy_oracle` bootstraps and reuses `.venv-numpy314` automatically, preferring `uv`, then standard `venv` + `pip`, and finally a user-site `pip install numpy` fallback when the worker lacks venv tooling.

Repository Layout

franken_numpy/
├── Cargo.toml                         # Workspace root (9 crates)
├── crates/
│   ├── fnp-dtype/                     # Dtype taxonomy, promotion table, cast policy
│   ├── fnp-ndarray/                   # Stride Calculus Engine (SCE)
│   ├── fnp-iter/                      # Transfer semantics, overlap-safe iteration
│   ├── fnp-ufunc/                     # 1,000+ array operations, reductions, einsum
│   ├── fnp-linalg/                    # solve, eig, svd, qr, cholesky, lstsq, etc.
│   ├── fnp-random/                    # 39 statistical distributions + generator helpers, PCG64DXSM, bit-exact parity
│   ├── fnp-io/                        # NPY/NPZ read/write, text I/O, DEFLATE
│   ├── fnp-conformance/               # Oracle capture, differential harness, benchmarks
│   └── fnp-runtime/                   # Strict/hardened mode, evidence ledger
├── legacy_numpy_code/numpy/           # Behavioral oracle (upstream NumPy source)
├── artifacts/                         # Contracts, security maps, logs, proofs
├── scripts/                           # E2E gate scripts
└── docs/                              # Specs and planning documents

Performance

FrankenNumPy prioritizes correctness over speed in this phase, but the architecture is designed for future optimization:

Release profile: opt-level = 3, LTO, single codegen unit, stripped symbols.
Contiguous reduction kernel: Axis reductions on contiguous data avoid per-element index computation. A targeted optimization pass reduced axis-reduction latency by ~56% (p50/p95/p99 deltas of ~90% on contiguous workloads).
Broadcast index mapping: Output-to-source index mapping uses incremental odometer updates instead of full unravel/remap per element.
2x2 fast paths: Linear algebra has specialized 2x2 implementations that bypass general NxN overhead for the most common small matrix case.
Horner's method: Polynomial evaluation and Stirling series use Horner's form for numerical stability and minimal multiplications.
Ziggurat sampling: Normal and exponential random variates use the Ziggurat method (same as NumPy), which accepts ~97% of samples on the first try.

Performance budgets are enforced by the G7 gate, which measures p50/p95/p99 latencies for ufunc and reduction sentinel workloads and rejects regressions.

Limitations

What works and what doesn't:

Not a Python package. FrankenNumPy is a Rust library. There is no pip install or Python FFI bridge yet. You cannot import frankennumpy from Python today.
No BLAS/LAPACK backend. Linear algebra uses pure-Rust implementations (Householder QR, Golub-Kahan SVD, implicit shifted QR for eigenvalues). Competitive with BLAS for small matrices; slower for large ones. Future BLAS linkage is planned.
Complex elementwise arithmetic uses interleaved storage. Complex64/Complex128 dtypes store real/imaginary parts as interleaved floats with a trailing dimension of 2. Elementwise multiply and divide apply true complex arithmetic (a+bi)(c+di) = (ac-bd)+(ad+bc)i, but the interleaved representation adds overhead compared to native complex types.
multivariate_normal uses Cholesky. NumPy defaults to SVD. Adding SVD would require fnp-linalg as a dependency of fnp-random (currently zero-dependency).
multivariate_hypergeometric uses sequential draws. NumPy uses the random_mvhg_marginals algorithm.
frompyfunc and nditer require Python callable protocol (N/A for Rust).
Single-threaded. All operations are single-threaded. The asupersync async runtime integration is optional and used only for conformance pipeline orchestration, not for parallel array computation.
f64 internal representation. UFuncArray stores numeric values as Vec<f64> internally for arithmetic. For i64/u64 values > 2^53, an IntegerSidecar preserves exact integer values through storage round-trips (from_storage / to_storage). Arithmetic on large integers still uses f64 approximation.

FAQ

Is this a drop-in replacement for NumPy? Not yet. It reimplements NumPy's semantics in Rust for correctness verification and eventual use as a Rust-native array library. A Python FFI bridge is planned but not implemented.

How do you verify parity with NumPy? Oracle tests: we run the same operations with the same inputs in both NumPy and FrankenNumPy, comparing outputs to floating-point tolerance. For RNG, the comparison is bit-exact.

Why Rust nightly? We use Rust Edition 2024 features. The toolchain is pinned to a specific nightly date for reproducibility.

Why zero unsafe code? Memory safety is a core value. All 9 crates declare #![forbid(unsafe_code)]. The 92,000+ lines of Rust contain zero unsafe blocks.

How fast is it? Performance is not the primary goal in this phase. Correctness and parity come first. That said, the release profile uses opt-level = 3, LTO, and single codegen unit.

Can I use just the RNG crate? Yes. fnp-random has zero dependencies and produces bit-exact NumPy-compatible random sequences from a given seed.

About Contributions

Please don't take this the wrong way, but I do not accept outside contributions for any of my projects. I simply don't have the mental bandwidth to review anything, and it's my name on the thing, so I'm responsible for any problems it causes; thus, the risk-reward is highly asymmetric from my perspective. I'd also have to worry about other "stakeholders," which seems unwise for tools I mostly make for myself for free. Feel free to submit issues, and even PRs if you want to illustrate a proposed fix, but know I won't merge them directly. Instead, I'll have Claude or Codex review submissions via gh and independently decide whether and how to address them. Bug reports in particular are welcome. Sorry if this offends, but I want to avoid wasted time and hurt feelings. I understand this isn't in sync with the prevailing open-source ethos that seeks community contributions, but it's the only way I can move at this velocity and keep my sanity.

License

MIT License (with OpenAI/Anthropic Rider). See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 302 Commits
.github/workflows		.github/workflows
artifacts		artifacts
crates		crates
legacy_numpy_code/numpy		legacy_numpy_code/numpy
scripts		scripts
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
COMPREHENSIVE_SPEC_FOR_FRANKENNUMPY_V1.md		COMPREHENSIVE_SPEC_FOR_FRANKENNUMPY_V1.md
COMPREHENSIVE_SPEC_FOR_FRANKENSQLITE_V1.md		COMPREHENSIVE_SPEC_FOR_FRANKENSQLITE_V1.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
DESTRUCTIVE_OPS_LOG.md		DESTRUCTIVE_OPS_LOG.md
EXHAUSTIVE_LEGACY_ANALYSIS.md		EXHAUSTIVE_LEGACY_ANALYSIS.md
EXISTING_NUMPY_STRUCTURE.md		EXISTING_NUMPY_STRUCTURE.md
FEATURE_PARITY.md		FEATURE_PARITY.md
LICENSE		LICENSE
PARAMETER_PARITY_TODO.md		PARAMETER_PARITY_TODO.md
PHASE2C_EXTRACTION_PACKET.md		PHASE2C_EXTRACTION_PACKET.md
PLAN_TO_PORT_NUMPY_TO_RUST.md		PLAN_TO_PORT_NUMPY_TO_RUST.md
PROPOSED_ARCHITECTURE.md		PROPOSED_ARCHITECTURE.md
README.md		README.md
TODO_GRANULAR_EXECUTION.md		TODO_GRANULAR_EXECUTION.md
UPGRADE_LOG.md		UPGRADE_LOG.md
franken_numpy_illustration.webp		franken_numpy_illustration.webp
get_true_numpy_values.py		get_true_numpy_values.py
gh_og_share_image.png		gh_og_share_image.png
patch_conformance.py		patch_conformance.py
patch_continuous.py		patch_continuous.py
patch_remaining.py		patch_remaining.py
rust-toolchain.toml		rust-toolchain.toml
test_parity.py		test_parity.py
update_conformance.py		update_conformance.py
update_exact.py		update_exact.py

Folders and files

Latest commit

History

Repository files navigation

FrankenNumPy

The Problem

The Solution

Why FrankenNumPy?

Quick Example

Design Philosophy

API Surface

Installation

Architecture

How It Works: Deep Dive

Dtype System (fnp-dtype)

Stride Calculus Engine (fnp-ndarray)

Dual-Mode Runtime (fnp-runtime)

Ufunc Dispatch (fnp-ufunc)

Random Number Generation (fnp-random)

Linear Algebra (fnp-linalg)

I/O Format Handling (fnp-io)

Conformance Infrastructure (fnp-conformance)

Security Model

Algorithm Catalog

FFT

Signal Processing

Numerical Differentiation and Interpolation

Windowing Functions

Histogram and Binning

Padding

Financial Mathematics

Polynomial Systems

Masked Arrays

Datetime and Timedelta

String Operations

Bit Packing

Einsum Optimization

Artifact Durability (RaptorQ)

Threat Model

Shared Memory and Views

Float Error State Machine

Scimath: Complex-Domain Extensions

Complete Distribution List

Array Manipulation Toolkit

Construction

Joining and Splitting

Rearranging

Advanced Indexing

Transfer Semantics (fnp-iter)

Phase2C Extraction Packets

RNG State Serialization

Test Coverage

Oracle Test Strategy

RNG Algorithm Parity

Parity Status

CI Gate Topology

Conformance Pipeline

Repository Layout

Performance

Limitations

FAQ

About Contributions

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Dtype System (`fnp-dtype`)

Stride Calculus Engine (`fnp-ndarray`)

Dual-Mode Runtime (`fnp-runtime`)

Ufunc Dispatch (`fnp-ufunc`)

Random Number Generation (`fnp-random`)

Linear Algebra (`fnp-linalg`)

I/O Format Handling (`fnp-io`)

Conformance Infrastructure (`fnp-conformance`)

Transfer Semantics (`fnp-iter`)

Packages