musecl-memory

Portable persistence for AI agent memory, using git.

Your agent learned 500 things last month. Then the disk died. Now it knows nothing.

AI agents accumulate knowledge -- corrections, preferences, patterns, domain expertise. But that knowledge is trapped on one machine. If the disk fills, the VM recycles, or you move to new hardware, all accumulated intelligence is gone.

musecl-memory fixes this by applying a pattern every developer already knows:

package.json syncs. node_modules rebuilds locally.

Memory source files (Markdown, JSON) sync via git. Vector indexes rebuild locally from those files. The repo stays small, portable, and free of binary bloat.

Traditional Software                    AI Agent Memory
======================                  ======================
package.json         <--->              MEMORY.md (source)
package-lock.json    <--->              patterns.json (source)
node_modules/        <--->              vectordb/, *.sqlite (derived)
npm install          <--->              agent memory index --force
git push/pull        <--->              sync.sh push/pull

Why This Pattern?

Current approaches have a portability gap:

Approach	Portable?	Private?	Free?	Auditable?
Cloud memory APIs	Yes	No	No	No
Local-only memory systems	No	Yes	Yes	Partially
Protocol-based memory services	No	Yes	Yes	Yes
musecl-memory	Yes	Yes	Yes	Yes

Architecture

Single Machine

  Git Repository (private)        Local Clone            Agent Workspaces
 +---------------------+    +---------------------+    +---------------------+
 |                     |    |                     |    |                     |
 |  agent-a/           |    |  ~/memory-sync/     |    |  ~/agent-a/memory/  |
 |    MEMORY.md        |<-->|    (git push/pull)  |<-->|  ~/agent-b/memory/  |
 |  agent-b/           |    |                     |    |                     |
 |    MEMORY.md        |    +---------------------+    +---------------------+
 |  .gitignore         |                                        |
 +---------------------+                                   index (rebuild)
                                                                v
                                                       +---------------------+
                                                       |  Vector Indexes     |
                                                       |  (rebuilt locally,  |
                                                       |   NOT synced)       |
                                                       +---------------------+

Multiple Machines

+------------------+          +------------------+          +------------------+
|                  |  push    |                  |   pull   |                  |
|  Agent Memory    |--------->|  Git Repository  |--------->|  Agent Memory    |
|  (Machine A)     |          |  (git remote)    |          |  (Machine B)     |
|                  |<---------|                  |<---------|                  |
|  .md .json files |   pull   |  .md .json files |   push   |  .md .json files |
+------------------+          +------------------+          +------------------+
        |                                                           |
   index (local)                                               index (local)
        v                                                           v
+------------------+                                        +------------------+
|  Vector Index    |                                        |  Vector Index    |
|  (NOT synced)    |                                        |  (NOT synced)    |
+------------------+                                        +------------------+

Each machine rebuilds its own indexes from the shared source files. No binary conflicts, no embedding model version mismatches.

How It Works

Push (local -> git)

  [Agent Workspace]                    [Sync Repo]               [Git Remote]
        |                                  |                        |
        |-- content hash compare -------->|                        |
        |   (skip unchanged)              |                        |
        |-- copy changed files ---------->|                        |
        |                                 |-- sanitize (secrets)   |
        |                                 |-- git add + commit     |
        |                                 |-- git push ---------->|

Compare workspace files against sync repo by content hash (not timestamp)
Copy only changed .md and .json files
Scan for secrets -- abort if API keys or tokens are found
Commit and push

Pull (git -> local)

  [Git Remote]               [Sync Repo]                    [Agent Workspace]
     |                           |                                  |
     |-- git pull (ff-only) --->|                                  |
     |                          |-- copy files ------------------>|
     |                          |   (chmod 600)                   |
     |                          |                                  |-- rebuild index

git pull --ff-only -- fails loudly if diverged (protects local changes)
Copy files to agent workspaces with 600 permissions
Rebuild vector indexes for agents that received changes

Status

Bidirectional drift check:

Files in workspace but not in sync repo (new local knowledge, needs push)
Files in sync repo but not in workspace (missing restore, needs pull)

What Gets Synced

Synced (git-tracked)	Not Synced (gitignored)
`*.md` -- Memory documents	`*.sqlite` -- Vector indexes
`*.json` -- Structured data	`vectordb/` -- Embedding stores
`sync.sh` -- The sync script	`*.jsonl` -- Session transcripts
`.gitignore`	`.key`, `.pem`, `*.env` -- Secrets

Design Principles

Source files are the truth

Memory is plain text -- human-readable, diffable, version-controlled. Binary indexes are derived artifacts, rebuilt locally, never committed.

Per-agent isolation

Each agent gets its own directory. Security knowledge doesn't pollute architecture patterns. This maps naturally to how multi-agent systems partition responsibilities.

sync-repo/
  agent-alpha/
    MEMORY.md
    patterns.md
    daily-log.json
  agent-beta/
    MEMORY.md
    domain-notes.md
  agent-gamma/
    MEMORY.md

Secret sanitization

Memory files can inadvertently capture API keys from agent conversations. The sync script scans for known patterns before every commit and aborts if any are found.

Fail-safe operations

Pulls use --ff-only to prevent silent merge conflicts. Content is compared by hash, not timestamp, so changes are detected reliably across machines. Drift checks run in both directions to catch files that exist on one side but not the other.

Reference Implementation

A complete sync script with push, pull, and status: sync.sh

Customize AGENTS and AGENTS_DIR for your setup, then:

./sync.sh push      # Copy local memory to git and push
./sync.sh pull      # Pull from git and restore to workspaces
./sync.sh status    # Show per-agent drift

Scaling Up

The pattern starts simple and grows with your needs.

Level 1: Automated Sync

# Cron: push every 6 hours
0 */6 * * * cd ~/memory-sync && bash sync.sh push

Level 2: Multi-Machine

For agents running on multiple machines simultaneously:

Strategy	Tradeoff
Branch-per-machine	Clean history, periodic merge needed
Append-only logs	No conflicts (datestamped files), larger repo
Last-writer-wins	Simplest, works if sessions don't overlap

Level 3: Encrypted Sync

For sensitive memory content:

# Encrypt before push
gpg --symmetric --cipher-algo AES256 -o "$dest.gpg" "$file"
# Decrypt after pull
gpg --decrypt -o "$dest" "$file.gpg"

Level 4: Cloud Hybrid

The pattern extends to cloud without losing its core properties. Git stays the source of truth; cloud services are consumers of it.

                    +---------------------+
                    |  Cloud Services     |
                    |  (optional layer)   |
                    |                     |
                    |  - Object storage   |
                    |  - Managed vector DB|
                    |  - Webhook triggers |
                    +---------------------+
                              ^
                              | deploy (CI/CD)
                              |
                    +---------------------+
                    |  Git Repository     |  <-- still the source of truth
                    |  (private)          |
                    +---------------------+
                          |       ^
                     pull |       | push
                          v       |
                    +---------------------+
                    |  Local Workspaces   |
                    +---------------------+

Cloud extension patterns:

CI/CD index rebuild -- Push triggers a pipeline that rebuilds vector indexes and deploys them to a managed database. Any machine queries the cloud index instead of rebuilding locally.
Object storage mirror -- Mirror the repo for serverless agents that need read-only access without git.
Webhook notifications -- Push triggers a webhook that tells running agents to refresh their context, eliminating sync lag.
Shared memory API -- Wrap the repo in a lightweight HTTP endpoint. Agents query memories without needing local file access.

When to Use What

Scenario	Recommendation
Single machine, personal use	Git only
2-3 machines, same person	Git pull on each machine
Team of agents across infra	Cloud hybrid
Serverless / ephemeral agents	Cloud hybrid
Air-gapped / offline	Git bundles for transfer

Cloud adds convenience and scale. It doesn't improve privacy, auditability, or cost -- it trades those. Start local, add cloud when you outgrow it.

Enhanced Experience with OpenClaw

OpenClaw is an open-source AI agent framework that natively manages per-agent memory workspaces, vector indexing, and multi-agent orchestration. It provides the local runtime that musecl-memory complements.

musecl-memory                        OpenClaw
===============                        ========
Portable persistence (git)     +       Local agent runtime
Push/pull memory files         +       Auto-capture/auto-recall
Secret sanitization            +       Per-agent workspace isolation
Cross-machine sync             +       Vector indexing and search

OpenClaw manages the local memory lifecycle -- auto-capturing knowledge from conversations and auto-recalling relevant context each turn. musecl-memory adds portability: syncing those files to git so they survive machine changes, disk failures, or infrastructure migrations.

# OpenClaw agents generate memory files during sessions
# (auto-capture writes to ~/agent-workspace/memory/)

# Sync to git
sync.sh push

# On a new machine, restore everything
sync.sh pull
openclaw memory index --force --agent alpha

# Agent immediately has all accumulated knowledge

Without OpenClaw	With OpenClaw
Manual memory file management	Auto-capture from conversations
Manual index rebuilds	Built-in `openclaw memory index`
Single agent per workspace	Multi-agent with isolated workspaces
File-based recall only	Semantic vector search across memories
Push/pull only	Auto-recall injects context per turn

The pattern works with any agent framework. OpenClaw removes the manual steps.

FAQ

Why git and not object storage or file sync services? Git gives you versioning, diffing, branching, and collaboration for free. You can see exactly what your agent learned and when. Try doing git log --oneline on a synced folder.

Why not a cloud memory API? Cloud APIs are great for managed infrastructure. This pattern is for people who want to own their agent's knowledge, keep it private, and pay nothing for storage.

Won't the repo get huge? Memory files are text. A prolific agent generates 1-5 MB per month. Git handles this effortlessly. Binary indexes (which can be large) are gitignored.

What about vector embeddings? Embeddings are derived from source text. Rebuild locally after pull, just like npm install after clone. This keeps the repo small and avoids embedding model version conflicts.

Is it safe to sync agent memories via git? The sync script scans for known secret patterns (API keys, tokens) before every commit and aborts if any are found. Memory files are set to 600 permissions (owner-only). Use a private repository. For highly sensitive content, add GPG encryption before push (see Level 3: Encrypted Sync). Never use a public repo for agent memories -- they can contain learned context from private conversations.

Can multiple agents share memories? Each agent has its own directory by design. To share knowledge, an orchestrator reads from multiple directories. This prevents knowledge contamination while allowing explicit sharing.

When to Sync

Trigger	Action	Why
After a long agent session	Push	Capture knowledge before it's lost
Starting on a new machine	Pull	Restore accumulated intelligence
Before system maintenance	Push	Backup before disk operations
After adding a new agent	Push	Include seed memories
Periodically (cron)	Push	Catch forgotten manual syncs

License

This work is licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).

You are free to share and adapt this material for non-commercial purposes, as long as you give appropriate credit and distribute any derivatives under the same license.

Commercial use -- including use in paid products, SaaS offerings, corporate tooling, or commercial agent platforms -- requires written permission. Contact for licensing inquiries.

-m

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
examples		examples
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
sync.sh		sync.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

musecl-memory

Why This Pattern?

Architecture

Single Machine

Multiple Machines

How It Works

Push (local -> git)

Pull (git -> local)

Status

What Gets Synced

Design Principles

Source files are the truth

Per-agent isolation

Secret sanitization

Fail-safe operations

Reference Implementation

Scaling Up

Level 1: Automated Sync

Level 2: Multi-Machine

Level 3: Encrypted Sync

Level 4: Cloud Hybrid

When to Use What

Enhanced Experience with OpenClaw

FAQ

When to Sync

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

musecl-memory

Why This Pattern?

Architecture

Single Machine

Multiple Machines

How It Works

Push (local -> git)

Pull (git -> local)

Status

What Gets Synced

Design Principles

Source files are the truth

Per-agent isolation

Secret sanitization

Fail-safe operations

Reference Implementation

Scaling Up

Level 1: Automated Sync

Level 2: Multi-Machine

Level 3: Encrypted Sync

Level 4: Cloud Hybrid

When to Use What

Enhanced Experience with OpenClaw

FAQ

When to Sync

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages