Fast univariate online time series models. Zero dependencies. Runs in Pyodide.
```
pip install skaters
```

```python
from skaters import skater

f = skater(k=3)
state = None
for y in observations:
    dists, state = f(y, state)

dists[0].mean             # point forecast
dists[0].std              # uncertainty
dists[0].quantile(0.975)  # 97.5th percentile
dists[0].logpdf(y)        # log-likelihood
dists[0].cdf(y)           # CDF at y
```

Every skater returns `list[Dist]` — a weighted Gaussian mixture for each horizon.
Every named function builds a Bayesian ensemble over the same full candidate population. The names represent different search strategies — different priors, learning rates, and complexity penalties — not different models.
```python
from skaters import holt, hosking, laplace, samuelson, wald, dantzig

f = holt(k=1)       # expect trends (Holt 1957)
f = hosking(k=1)    # expect long memory (Hosking 1981)
f = laplace(k=1)    # no opinion — let the data decide
f = samuelson(k=1)  # there's a drift, find it carefully (Samuelson 1965)
f = wald(k=1)       # minimax caution (Wald)
f = dantzig(k=1)    # optimize under compute constraints (Dantzig 1947)
```

| Policy | Named after | Prior | Learning rate η | Penalty λ | Best for |
|---|---|---|---|---|---|
| `holt` | Holt 1957 | Differencing + Holt linear | 0.50 | 0.02 | Trending data |
| `hosking` | Hosking 1981 | Fractional differencing | 0.50 | 0.01 | Long memory |
| `laplace` | Laplace | Uniform | 0.80 | 0.005 | General purpose (recommended default) |
| `samuelson` | Samuelson 1965 | Drift + Holt | 0.40 | 0.01 | Persistent drift (GDP, prices) |
| `wald` | Wald | Depth 0 | 0.15 | 0.08 | Adversarial, non-stationary |
| `dantzig` | Dantzig 1947 | Adaptive search | 0.30 | 0.01 | Adaptive (grows pool online) |
Or tune directly:
```python
from skaters import skater

f = skater(k=3, aggressiveness=0.9)  # fast adapter
f = skater(k=3, aggressiveness=0.1)  # conservative
```

Everything is transforms all the way down, with a distributional leaf at the bottom:
The leaf estimates a running Gaussian — a mean and variance — for the fully transformed series.

Every node returns `list[Dist]`. There is no separate "point forecast" vs "uncertainty" — both are aspects of the same distribution object.

Every "model" is really a transform. An EMA doesn't "predict" — it subtracts a running level on the way in, and the inverse adds that level back to the leaf's distributions on the way out.

A `Dist` is a weighted mixture of Gaussians, implemented with nothing beyond the standard library (`math.erf`, `math.exp`).
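For intuition about what such a distributional leaf does, here is a hypothetical, stdlib-only sketch (not the library's internals): a Gaussian whose mean and variance are updated online via Welford's algorithm, with a `cdf` built from `math.erf`.

```python
import math

class GaussianLeaf:
    """Running Gaussian (Welford's online mean/variance) with an erf-based CDF.

    A sketch of the idea only; the real leaf lives inside the library.
    """

    def __init__(self):
        self.n, self.mean, self._m2 = 0, 0.0, 0.0

    def update(self, y):
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (y - self.mean)

    @property
    def std(self):
        # Population std of what has been seen; 1.0 before enough data.
        return math.sqrt(self._m2 / self.n) if self.n > 1 else 1.0

    def cdf(self, x):
        z = (x - self.mean) / (self.std * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

leaf = GaussianLeaf()
for y in [1.0, 2.0, 3.0, 4.0]:
    leaf.update(y)
leaf.mean                # 2.5
round(leaf.cdf(2.5), 3)  # 0.5 (CDF evaluated at the mean)
```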
```python
from skaters import Dist

d = Dist.gaussian(5.0, 2.0)
d.mean             # 5.0
d.std              # 2.0
d.pdf(5.0)         # density at x
d.cdf(3.0)         # P(X <= 3)
d.logpdf(5.0)      # log-likelihood
d.quantile(0.975)  # inverse CDF

# Exact mixture combination (for ensembles)
mix = Dist.combine([d1, d2, d3], weights=[0.5, 0.3, 0.2])

# Propagate through transform inverses
d.shift(10.0)      # translate: mu -> mu + 10
d.scale(2.0)       # scale: mu -> 2*mu, sigma -> 2*sigma
d.affine(2.0, 3.0) # x -> 2x + 3

# Bound component growth
d.prune(max_components=10)
```

Transforms are online bijective maps. Each has a `forward` (scalar in, scalar out) and an `inverse_k` that propagates distributions back through the transform.
| Transform | Forward | Inverse | Use case |
|---|---|---|---|
| `ema_transform()` | Subtract running EMA level | Shift by level | Remove level |
| `difference()` | $y'_t = y_t - y_{t-1}$ | Cumsum with variance propagation | Random walk |
| `drift()` | Difference minus estimated drift | Cumsum plus drift | Random walk + drift |
| `holt_linear()` | Subtract fitted level + trend | Add level + trend back | Level + trend (Holt 1957) |
| `ar()` | $y'_t = y_t - \sum_j \hat\phi_j y_{t-j}$ | AR reconstruction with variance propagation | Autoregression (online RLS) |
| `grouped_ar()` | Same, grouped coefficients | Same | Long-lag AR with fewer parameters |
| `fractional_difference()` | Fractionally differenced series | Inverse fractional differencing | Long memory |
| `standardize()` | Subtract running mean, divide by running std | Affine back | Remove scale |
| `garch()` | Divide by conditional volatility | Scale back | Volatility clustering |
| `seasonal_difference()` | $y'_t = y_t - y_{t-s}$ | Shift by lagged value | Periodicity |
| `power_transform()` | Power map of $y_t$ | Delta method | Tail compression |
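To make one row concrete, here is an illustrative, self-contained sketch of what "cumsum with variance propagation" means for `difference()` (hypothetical code, not the library's implementation): horizon-k forecasts of increments are summed back from the last observation, so means accumulate and, assuming independent increments, variances accumulate too.

```python
def invert_difference(increment_dists, last_y):
    """Map per-step increment forecasts, given as (mean, var) pairs, back to
    the original scale by cumulative summation from the last observed value."""
    out, mu, var = [], last_y, 0.0
    for m, v in increment_dists:
        mu += m    # means accumulate
        var += v   # variances accumulate (independence assumed)
        out.append((mu, var))
    return out

# Three one-step increment forecasts, each N(2.0, 1.0), anchored at y = 15:
invert_difference([(2.0, 1.0)] * 3, last_y=15.0)
# [(17.0, 1.0), (19.0, 2.0), (21.0, 3.0)] — random-walk variance grows linearly
```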
Transforms compose via conjugation. Given a transform and an inner skater, `conjugate` applies the transform's `forward` to each observation before it reaches the inner skater, then maps the inner skater's distributions back through `inverse_k`.
The pipe `|` notation reads left-to-right (outermost transform first):

```python
from skaters import conjugate, ema, difference, standardize

# diff removes trend, EMA predicts the differenced series
f = conjugate(ema(alpha=0.1, k=3), difference(), k=3)

# Chain: standardize, then difference, then EMA
f = conjugate(
    conjugate(ema(alpha=0.1, k=3), difference(), k=3),
    standardize(),
    k=3,
)
# canonical name: std|diff|ema_t|leaf
```

Weights component models by recent predictive precision (inverse variance):
```python
from skaters import precision_weighted_ensemble, ema

f = precision_weighted_ensemble([
    ema(alpha=0.05, k=1),
    ema(alpha=0.2, k=1),
], k=1)
```

Each model $m$ receives weight $w_m \propto \pi_m \exp(\eta L_m - \lambda d_m)$, where $\pi_m$ is its prior, $L_m$ its cumulative predictive log-likelihood, $d_m$ its transform-chain depth, $\eta$ the learning rate, and $\lambda$ the complexity penalty.
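Tempered-posterior weighting of this kind is plain arithmetic; a hypothetical stdlib-only sketch (the function name and inputs are illustrative, not library API):

```python
import math

def ensemble_weights(priors, loglik, depths, eta=0.5, lam=0.02):
    """Tempered-posterior weights: w_m proportional to pi_m * exp(eta*L_m - lam*d_m).

    Illustrative only: `priors` are prior weights, `loglik` cumulative
    predictive log-likelihoods, `depths` transform-chain depths.
    """
    scores = [math.log(p) + eta * L - lam * d
              for p, L, d in zip(priors, loglik, depths)]
    top = max(scores)  # subtract the max for numerical stability
    exp_scores = [math.exp(s - top) for s in scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]

w = ensemble_weights([0.5, 0.5], loglik=[-10.0, -12.0], depths=[1, 2])
# the better-scoring, shallower model gets the larger weight
```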
```python
from skaters import bayesian_ensemble, ema

f = bayesian_ensemble(
    [ema(alpha=0.05, k=1), ema(alpha=0.2, k=1)],
    k=1,
    learning_rate=0.5,       # eta: prevents over-concentrating
    complexity_penalty=0.02, # lambda: penalizes deeper chains
    depths=[1, 1],
)
```

Grows the candidate population online: expand top performers with new transforms, replay recent history to warm-start, prune losers.
```python
from skaters import search

f = search(
    k=1,
    expand_interval=100, # expand top performers every 100 obs
    max_depth=3,         # maximum transform chain depth
    replay_buffer=500,   # warm-start new candidates on recent history
    max_pool=30,         # cap active candidates
)
```

Serialize and rebuild any pipeline:
```python
from skaters import (
    build, spec_name, to_json, from_json,
    ema_spec, conjugate_spec, ensemble_spec, diff_spec,
)

spec = ensemble_spec(
    conjugate_spec(ema_spec(0.1, k=1), diff_spec()),
    ema_spec(0.3, k=1),
    k=1,
)
spec_name(spec)          # "ensemble(diff|ema(0.1),ema(0.3))"
j = to_json(spec)        # JSON string
f = build(from_json(j))  # live skater
```

Write your own transform: any pair of functions works, where `forward` is scalar in, scalar out (plus state) and `inverse_k` maps `list[Dist]` to `list[Dist]`:
```python
def my_transform():
    def forward(y, state):
        if state is None:
            return 0.0, {"anchor": y}
        transformed = y - state["anchor"]
        return transformed, {"anchor": y}

    def inverse_k(dists, state):
        return [d.shift(state["anchor"]) for d in dists]

    return forward, inverse_k
```
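To see the contract in action, here is a self-contained round trip (illustrative only: `anchor_transform` mirrors the custom transform above, and `_D` is a minimal stand-in for `Dist` exposing just the `shift` that `inverse_k` calls):

```python
def anchor_transform():
    # Same shape as the custom transform above: scalar forward + inverse_k.
    def forward(y, state):
        if state is None:
            return 0.0, {"anchor": y}
        return y - state["anchor"], {"anchor": y}

    def inverse_k(dists, state):
        return [d.shift(state["anchor"]) for d in dists]

    return forward, inverse_k

class _D:
    # Minimal stand-in for Dist, exposing only the shift() that inverse_k uses.
    def __init__(self, mean):
        self.mean = mean

    def shift(self, c):
        return _D(self.mean + c)

fwd, inv = anchor_transform()
state = None
for y in [3.0, 5.0, 9.0]:
    t, state = fwd(y, state)  # transformed increments: 0.0, 2.0, 4.0

# An inner model forecasts the next increment as 2.0; inverse_k maps it
# back to the original scale: 9.0 + 2.0 = 11.0.
forecast = inv([_D(2.0)], state)[0]
forecast.mean  # 11.0
```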
- Online only — $O(1)$ per observation, no batch recomputation
- Distributional — every prediction is a `Dist`, not a point estimate
- Composable — transforms chain, ensembles nest, everything returns `Dist`
- Pure Python — zero dependencies, only `math.erf` and `math.exp`
- Pyodide compatible — works in the browser via WebAssembly
This package distills ideas from timemachines, which provided a common skater interface for dozens of time series packages. This is a from-scratch rewrite focused on speed, distributional predictions, and browser compatibility.