Comparing changes

* perf: remove dual transition storage (State.transitions eliminated)

  Remove transitions []StateID and transitionCount from State struct.
  Transitions are now stored exclusively in the DFACache.flatTrans flat table.

  - Remove State.AddTransition(), Transition(), Stride(), TransitionCount()
  - Remove Builder.move() (unused after DetectAcceleration simplification)
  - Simplify DetectAcceleration/DetectAccelerationFromCached to return nil
  - Add DetectAccelerationFromFlat() reading from flat table
  - Simplify tryDetectAccelerationWithCache (flatTrans-only path)
  - Remove 3 redundant AddTransition calls from determinize
  - Update tests: add TestDetectAccelerationFromFlat, remove State transition tests

  Memory: ~222MB -> ~150MB (eliminates redundant per-state transition slices)

* perf: Rust-aligned BT visited limit for UseNFA (72% less memory)

  Add NewBoundedBacktrackerSmall() with 128K entries (256KB) visited capacity,
  matching Rust regex's default visited_capacity. The UseNFA path now creates
  the BT with the small limit. When the haystack exceeds BT capacity, it falls
  back to PikeVM (correct for leftmost-first). The UseBoundedBacktracker
  strategy retains the 32M limit for POSIX longest-match.

  LangArena LogParser (7MB log, 13 patterns, 10 iterations):
  - Total alloc: 89MB -> 25MB (-72%)
  - RSS (Sys): 353MB -> 41MB (-88%)
  - errors pattern: 66MB -> 2.4MB (-96%)
  - Speed: no regression (113-126ms per iter)

* perf: byte-based DFA cache limit (2MB default like Rust)

  Replace MaxStates (count) with CacheCapacityBytes (bytes).
  Default: 2MB, matching Rust regex's hybrid_cache_capacity.

  - Add DFACache.MemoryUsage() (mirrors Rust Cache::memory_usage)
  - Insert checks MemoryUsage() >= capacityBytes instead of state count
  - Config: CacheCapacityBytes (new), MaxStates (deprecated, backward compat)
  - Self-adjusting: fewer states for large stride, more for small
  - effectiveCapacityBytes() bridges legacy MaxStates to bytes (~100B/state)

* wip: SlotTable-based capture search (greedy loop capture bug)

  SearchWithSlotTableCapturesAt now uses SlotTable instead of legacy COW.
  Works for simple patterns like (foo)(bar), but greedy repetitions (a+)(b+)
  lose group start positions during loop iterations.

  Root cause: addSearchThread CopySlots overwrites capture slots on each loop
  iteration. Need stack-based epsilon closure with RestoreCapture frames
  (Rust approach) to preserve capture context through loops.

  TODO: Convert recursive addSearchThread to stack-based with save/restore

  Status: 2 NFA unit test failures, all meta tests pass (meta still on COW)

* wip: stack-based epsilon closure with RestoreCapture

  Converted addSearchThread and addSearchThreadToNext from recursive to
  stack-based with captureFrame (Explore + RestoreCapture frames). Mirrors
  the Rust pikevm.rs FollowEpsilon::RestoreCapture pattern.

  Still failing: greedy loop captures (a+)(b+). The per-state SlotTable
  overwrites the group start on each loop iteration (the state is visited
  again in the next generation). Per-thread COW preserves all variants.

  Root issue: per-state storage loses capture history across byte transitions
  in greedy loops. Need either per-thread indexing or generation-aware slot
  preservation.

  Status: 2 NFA unit tests fail, all meta tests pass

* feat: dual SlotTable capture tracking (zero-alloc FindSubmatch)

  Implement Rust-style dual SlotTable (curr/next) for capture propagation
  across byte transitions. Stack-based epsilon closure with RestoreCapture
  frames preserves capture context through greedy loops.

  Key changes:
  - Add NextSlotTable + captureStack + currSlots to PikeVMState
  - addSearchThread: stack-based with captureFrame (Explore + RestoreCapture)
  - addSearchThreadToNext: loads from curr SlotTable, writes to next
  - Swap SlotTable/NextSlotTable after each byte (Rust mem::swap pattern)
  - Don't clear Visited before seed (prevents SlotTable row overwrite)
  - Wire meta FindSubmatch to use SlotTable path
  - Fix empty match capture groups (buildCapturesFromSlots)

  FindAllSubmatch (5 patterns, 50K matches, 800KB input):
  - Alloc: 554MB -> 26MB (-95%)
  - Mallocs: 12.5M -> 440K (-96%)
  - Time: 1.48s -> 0.45s (3.3x faster)

* docs: update CHANGELOG, OPTIMIZATIONS, add ARCHITECTURE.md for v0.12.19

  - CHANGELOG: add SlotTable capture tracking entry
  - OPTIMIZATIONS: add #10 Dual SlotTable (95% less memory), update version
  - ARCHITECTURE.md: new file documenting engine architecture, memory model,
    thread safety, and Rust alignment
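
The flat transition table and the byte-based cache limit from the commits above can be illustrated with a minimal sketch. It is not the library's code: `flatCache`, `insert`, `next`, and `statesUntilFull` are hypothetical names, memory is approximated as 4 bytes per table entry, and the rule that a non-zero byte capacity wins over the legacy state count is an assumption. Only the `MemoryUsage() >= capacityBytes` check, the 2MB default, and the ~100B/state bridge come from the commit messages.

```go
package main

import "fmt"

// StateID indexes a DFA state. All names below are illustrative only.
type StateID uint32

const stateIDSize = 4 // bytes per uint32 table entry

// flatCache is a hypothetical stand-in for a DFA cache that keeps every
// transition in one flat slice, with stride entries per state.
type flatCache struct {
	flatTrans     []StateID
	stride        int
	capacityBytes int
}

// memoryUsage approximates the bytes held by the transition table.
func (c *flatCache) memoryUsage() int {
	return len(c.flatTrans) * stateIDSize
}

// next looks up a transition at row s, column class.
func (c *flatCache) next(s StateID, class int) StateID {
	return c.flatTrans[int(s)*c.stride+class]
}

// insert appends one zeroed row for a new state. It reports false when the
// byte budget is already used up, instead of comparing a state count.
func (c *flatCache) insert() (StateID, bool) {
	if c.memoryUsage() >= c.capacityBytes {
		return 0, false
	}
	id := StateID(len(c.flatTrans) / c.stride)
	c.flatTrans = append(c.flatTrans, make([]StateID, c.stride)...)
	return id, true
}

// effectiveCapacityBytes bridges a legacy state-count limit to bytes at
// roughly 100 bytes per state. The precedence order is an assumption.
func effectiveCapacityBytes(capacityBytes, maxStates int) int {
	if capacityBytes > 0 {
		return capacityBytes
	}
	if maxStates > 0 {
		return maxStates * 100
	}
	return 2 << 20 // 2MB default
}

// statesUntilFull counts how many states fit before insert refuses.
func statesUntilFull(stride, capacityBytes int) int {
	c := &flatCache{stride: stride, capacityBytes: capacityBytes}
	n := 0
	for {
		if _, ok := c.insert(); !ok {
			return n
		}
		n++
	}
}

func main() {
	fmt.Println(statesUntilFull(8, 2<<20))        // 65536
	fmt.Println(statesUntilFull(256, 2<<20))      // 2048
	fmt.Println(effectiveCapacityBytes(0, 10000)) // 1000000
}
```

The two `statesUntilFull` calls show the self-adjusting behavior mentioned in the commit: under the same byte budget, a small stride admits many more states than a large one.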


Commits on Mar 24, 2026
