mgillr/crdt-merge

title: crdt-merge
colorFrom: gray
colorTo: gray
sdk: gradio
sdk_version: 5.50.0
python_version: 3.12
app_file: app.py
pinned: true
license: other
license_name: BUSL-1.1
license_link: https://github.com/mgillr/crdt-merge/blob/main/LICENSE
tags: crdt, merge, model-merging, distributed, convergence, neural-network
short_description: Mathematically guaranteed convergent model and data merge

crdt-merge

The first merge library where every operation is mathematically guaranteed to converge.
Tabular data. Neural network weights. Distributed agents. One unified CRDT layer.

PyPI version · Downloads · Python 3.9+ · Tests · CRDT Compliance · License: BSL 1.1

🤗 Live Demo · 🤗 Data Merge · 🤗 Federation

pip install crdt-merge

Documentation · Quick Start · API Reference · Architecture · Changelog


Every merge algorithm you use is broken. This one isn't.

Every standard merge strategy (weight averaging, SLERP, TIES, DARE, Fisher) fails at least one of the three algebraic laws required for distributed convergence: commutativity, associativity, and idempotence. This isn't an implementation bug. It's mathematically provable. crdt-merge is the fix: a patent-pending two-layer architecture that makes any merge strategy, including inherently stochastic and non-commutative ones, fully CRDT-compliant.

The result: 26 strategies, all guaranteed to produce identical output regardless of merge order, grouping, or duplication. No coordination. No locking. No central arbiter.
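To make the failure concrete, here is a minimal sketch showing that pairwise averaging depends on grouping, and that a merge over a set of contributions does not. It is a generic illustration rather than crdt-merge's implementation: `pairwise_average`, `merge_state`, and `resolve` are hypothetical names, and the split into a set-union state plus a deterministic resolve step is only an assumption about what a two-layer design can look like.

```python
from fractions import Fraction

def pairwise_average(a, b):
    """Naive merge: average two scalar 'weights'."""
    return (a + b) / 2

# Exact rationals keep floating-point noise out of the comparison.
x, y, z = Fraction(0), Fraction(4), Fraction(8)

left = pairwise_average(pairwise_average(x, y), z)   # ((0 + 4)/2 + 8)/2 = 5
right = pairwise_average(x, pairwise_average(y, z))  # (0 + (4 + 8)/2)/2 = 3
associative = left == right                          # False: grouping changes the result

# Hypothetical two-layer sketch.
# Layer 1: the replicated state is a set of (node_id, value) contributions,
# merged by set union, which is commutative, associative, and idempotent.
def merge_state(s1, s2):
    return s1 | s2

# Layer 2: the strategy runs once over the full, deterministically ordered set.
def resolve(state):
    values = [value for _, value in sorted(state)]
    return sum(values) / len(values)

a = frozenset({("a", x)})
b = frozenset({("b", y)})
c = frozenset({("c", z)})

grouped_left = merge_state(merge_state(a, b), c)
grouped_right = merge_state(a, merge_state(b, c))
same_state = grouped_left == grouped_right  # True
merged_value = resolve(grouped_left)        # (0 + 4 + 8) / 3 = 4
```

Because the strategy only ever sees the complete set of contributions, reordering, regrouping, or re-delivering a contribution cannot change what it computes.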

Patent Pending — UK Application No. 2607132.4

How it works — full architecture, mathematical proofs, and the 7 architectures we evaluated →


What you can do with it

Federated Model Merging Without a Parameter Server

100 hospitals train locally, merge globally — no coordinator, no single point of failure. Any node can produce the final model. Late arrivals are absorbed automatically. Guide →

Convergent Multi-Agent AI

Agents share and merge beliefs via CRDT state — no orchestrator picks winners. Works offline, across partitions, at any scale. Guide →

Privacy-Preserving Merge

Merge encrypted data without decryption. Four AEAD backends. The merging party never sees plaintext. Guide →

The Right to Forget in Trained Models

GDPR erasure in milliseconds — surgically remove a contributor's influence without retraining. Guide →

MergeQL — Query Language for Distributed Knowledge

SQL-like syntax for CRDT-correct multi-source merges with full provenance. Guide →

Provenance-Complete AI

Per-field, per-decision audit trail — SHA-256 hash-chained, tamper-evident, EU AI Act ready. Guide →
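As a rough illustration of what "hash-chained, tamper-evident" means, the sketch below links each audit entry to the hash of the previous one using SHA-256 from the standard library. The record fields, `GENESIS`, `append`, and `verify` are hypothetical and are not crdt-merge's provenance API.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, record):
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append(chain, record):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})

def verify(chain):
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True

log = []
append(log, {"field": "email", "winner": "node-a", "reason": "latest timestamp"})
append(log, {"field": "phone", "winner": "node-b", "reason": "latest timestamp"})

ok_before = verify(log)                # chain is intact
log[0]["record"]["winner"] = "node-b"  # tamper with an earlier decision
ok_after = verify(log)                 # stored hash no longer matches the record
```

Editing any earlier record invalidates its stored hash, and recomputing that hash would break the `prev` link of every later entry.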

LoRA Adapter Merging

Mixed-rank adapters merged with per-module strategy selection and SVD rank harmonization. Guide →

Continual Learning Without Catastrophic Forgetting

Absorb new tasks as post-training merges — no data replay, no model growth, full knowledge retention. Guide →

Gossip Protocol Sync

Serverless state convergence via digest-based anti-entropy. You provide the transport. Guide →
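The general idea of digest-based anti-entropy can be sketched in a few lines: peers exchange compact digests, then send only the entries the other side is missing or holds in a different version. This is a generic sketch, not the library's gossip API. `digest`, `entries_to_send`, `apply_entries`, and `sync` are hypothetical names, entries are assumed to be `(version, value)` tuples resolved by taking the larger tuple, and a direct function call stands in for the transport you would supply.

```python
import hashlib

def digest(store):
    """Map each key to a hash of its (version, value) entry."""
    return {k: hashlib.sha256(repr(v).encode("utf-8")).hexdigest()
            for k, v in store.items()}

def entries_to_send(local, remote_digest):
    """Entries the remote peer lacks or holds with different content."""
    mine = digest(local)
    return {k: local[k] for k in local if remote_digest.get(k) != mine[k]}

def apply_entries(local, incoming):
    """Keep the larger (version, value) tuple for each key."""
    for key, entry in incoming.items():
        if key not in local or entry > local[key]:
            local[key] = entry

def sync(a, b):
    """One anti-entropy round between two peers."""
    digest_a, digest_b = digest(a), digest(b)
    apply_entries(b, entries_to_send(a, digest_b))
    apply_entries(a, entries_to_send(b, digest_a))

node_a = {"x": (1, "a1"), "y": (2, "a2")}
node_b = {"y": (3, "b3"), "z": (1, "b1")}
sync(node_a, node_b)
converged = node_a == node_b
```

Entries whose digests already match are never retransmitted, so a round between peers that are already in sync costs only the digest exchange.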

Probabilistic CRDTs at Planetary Scale

HyperLogLog, Bloom filters, Count-Min Sketch — all natively CRDT-mergeable across 500+ nodes. Guide →
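Bloom filters are a convenient example of why these sketches merge cleanly: the merge is a bitwise OR, which is commutative, associative, and idempotent. The class below is a minimal standalone sketch with hypothetical names and parameters, not the crdt-merge implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=256, num_hashes=3, bits=0):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bits  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def merge(self, other):
        if (self.size_bits, self.num_hashes) != (other.size_bits, other.num_hashes):
            raise ValueError("filters must share size and hash count")
        return BloomFilter(self.size_bits, self.num_hashes, self.bits | other.bits)

node_a, node_b = BloomFilter(), BloomFilter()
node_a.add("alice")
node_b.add("bob")
merged = node_a.merge(node_b)
```

Membership tests on `merged` can return false positives, as with any Bloom filter, but never false negatives for items added on either node.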

Delta Sync & Merkle Verification

Ship only what changed. Prove convergence in O(log n). Guide →
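One common way to check that two replicas hold the same state is to compare Merkle roots. The sketch below builds a root over sorted key/value pairs; the leaf encoding and the `merkle_root` helper are assumptions made for illustration, not the library's wire format.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(state: dict) -> str:
    """Merkle root over the sorted (key, value) pairs of a replica's state."""
    level = [_h(repr((k, state[k])).encode("utf-8")) for k in sorted(state)]
    if not level:
        return _h(b"").hex()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

replica_a = {"x": 1, "y": 2, "z": 3}
replica_b = {"z": 3, "x": 1, "y": 2}   # same content, different insertion order
replica_c = {"x": 1, "y": 99, "z": 3}  # one divergent value

in_sync = merkle_root(replica_a) == merkle_root(replica_b)
diverged = merkle_root(replica_a) != merkle_root(replica_c)
```

When roots differ, peers can descend the tree and compare child hashes level by level, which is where the logarithmic cost of locating a divergent entry comes from.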

Runtime CRDT Verification

Property-based proof that any merge function satisfies all three laws. Catches violations at import time. Guide →
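A sampling-based check of the three laws can be sketched with the standard library alone. `check_crdt_laws`, `LawViolation`, and the sample generators are hypothetical names rather than the library's `verified_merge` machinery, and passing random trials is evidence of compliance, not a formal proof.

```python
import random

class LawViolation(Exception):
    """Raised when a sampled input breaks one of the three laws."""

def check_crdt_laws(merge, sample, trials=200, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(trials):
        a, b, c = sample(rng), sample(rng), sample(rng)
        if merge(a, b) != merge(b, a):
            raise LawViolation("commutativity")
        if merge(merge(a, b), c) != merge(a, merge(b, c)):
            raise LawViolation("associativity")
        if merge(a, a) != a:
            raise LawViolation("idempotence")

def random_set(rng):
    return frozenset(rng.sample(range(10), rng.randint(0, 5)))

def random_int(rng):
    return rng.randint(0, 100)

# Set union satisfies all three laws, so this returns without raising.
check_crdt_laws(lambda a, b: a | b, random_set)

# Pairwise averaging is commutative and idempotent but not associative.
try:
    check_crdt_laws(lambda a, b: (a + b) / 2, random_int)
    failed_law = None
except LawViolation as exc:
    failed_law = str(exc)
```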

Agentic Memory at Scale

O(1) dedup across 1M+ agent memories. Budget-aware context merge. Crash recovery from peers. Guide →


Quick Start

# Tabular
from crdt_merge import CRDTDataFrame
merged = CRDTDataFrame(df_a, node_id="a").merge(CRDTDataFrame(df_b, node_id="b"))

# Model weights
from crdt_merge.model import CRDTMergeState
state = CRDTMergeState()
state.add("model-a", weights_a)
state.add("model-b", weights_b)
merged = state.merge(strategy="slerp")  # order of add() never matters

# Verify any merge function
from crdt_merge import verified_merge

@verified_merge
def my_merge(a, b):
    return your_logic(a, b)  # raises CRDTViolationError if laws are broken

Full API reference →


Installation

pip install crdt-merge              # Core: zero dependencies
pip install "crdt-merge[fast]"      # DuckDB + Polars (38.8× on A100)
pip install "crdt-merge[model]"     # PyTorch model weights
pip install "crdt-merge[crypto]"    # AEAD encryption backends
pip install "crdt-merge[all]"       # Everything

Zero required dependencies. Python 3.9–3.12. Linux, macOS, Windows.

All install options →


By the Numbers

| Metric | Value |
|---|---|
| Test suite | 3,041 tests, 0 failures |
| CRDT compliance tests | 1,200 / 1,200 |
| Merge strategies | 26 |
| CRDT overhead | < 0.5 ms per merge |
| Model speedup vs. naive | 38.8× |
| Encryption backends | 4 |
| Architectures evaluated | 7 → 1 winner |

Cross-Language Ports

| Language | Package | Status |
|---|---|---|
| Python (reference) | crdt-merge v0.9.4 | Full feature set |
| Rust | crdt-merge v0.2.0 | Core CRDTs + merge |
| TypeScript | crdt-merge v0.2.0 | Core CRDTs + merge |
| Java | crdt-merge v0.2.0 | Source complete |

License

BSL 1.1 → automatically converts to Apache 2.0 on 29 March 2028.

Free for research, personal use, and most production use. Source-available. Not free for competing commercial merge-as-a-service.

The PATENTS file includes a defensive patent grant (UK Application 2607132.4). See LICENSE, CLA.

Copyright 2026 Ryan Gillespie. Commercial licensing: [email protected] · [email protected]

