Portable persistence for AI agent memory, using git.
Your agent learned 500 things last month. Then the disk died. Now it knows nothing.
AI agents accumulate knowledge -- corrections, preferences, patterns, domain expertise. But that knowledge is trapped on one machine. If the disk fills, the VM recycles, or you move to new hardware, all accumulated intelligence is gone.
musecl-memory fixes this by applying a pattern every developer already knows:
`package.json` syncs. `node_modules` rebuilds locally.
Memory source files (Markdown, JSON) sync via git. Vector indexes rebuild locally from those files. The repo stays small, portable, and free of binary bloat.
Traditional Software AI Agent Memory
====================== ======================
package.json <---> MEMORY.md (source)
package-lock.json <---> patterns.json (source)
node_modules/ <---> vectordb/, *.sqlite (derived)
npm install <---> agent memory index --force
git push/pull <---> sync.sh push/pull
Current approaches have a portability gap:
| Approach | Portable? | Private? | Free? | Auditable? |
|---|---|---|---|---|
| Cloud memory APIs | Yes | No | No | No |
| Local-only memory systems | No | Yes | Yes | Partially |
| Protocol-based memory services | No | Yes | Yes | Yes |
| musecl-memory | Yes | Yes | Yes | Yes |
Git Repository (private) Local Clone Agent Workspaces
+---------------------+ +---------------------+ +---------------------+
| | | | | |
| agent-a/ | | ~/memory-sync/ | | ~/agent-a/memory/ |
| MEMORY.md |<-->| (git push/pull) |<-->| ~/agent-b/memory/ |
| agent-b/ | | | | |
| MEMORY.md | +---------------------+ +---------------------+
| .gitignore | |
+---------------------+ index (rebuild)
v
+---------------------+
| Vector Indexes |
| (rebuilt locally, |
| NOT synced) |
+---------------------+
+------------------+ +------------------+ +------------------+
| | push | | pull | |
| Agent Memory |--------->| Git Repository |--------->| Agent Memory |
| (Machine A) | | (git remote) | | (Machine B) |
| |<---------| |<---------| |
| .md .json files | pull | .md .json files | push | .md .json files |
+------------------+ +------------------+ +------------------+
| |
index (local) index (local)
v v
+------------------+ +------------------+
| Vector Index | | Vector Index |
| (NOT synced) | | (NOT synced) |
+------------------+ +------------------+
Each machine rebuilds its own indexes from the shared source files. No binary conflicts, no embedding model version mismatches.
[Agent Workspace] [Sync Repo] [Git Remote]
| | |
|-- content hash compare -------->| |
| (skip unchanged) | |
|-- copy changed files ---------->| |
| |-- sanitize (secrets) |
| |-- git add + commit |
| |-- git push ---------->|
- Compare workspace files against sync repo by content hash (not timestamp)
- Copy only changed `.md` and `.json` files
- Scan for secrets -- abort if API keys or tokens are found
- Commit and push
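The hash comparison and selective copy in the push steps can be sketched roughly as follows. This is a minimal illustration, not the shipped `sync.sh`: the function names `file_hash` and `copy_changed` are hypothetical, and SHA-256 is an assumed choice of hash.

```shell
# Hypothetical helper: print the SHA-256 of a file, or nothing if it is missing.
file_hash() {
    [ -f "$1" ] || return 0
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d ' ' -f 1
    else
        shasum -a 256 "$1" | cut -d ' ' -f 1
    fi
}

# Copy .md and .json files from a workspace ($1) into the sync repo ($2),
# skipping files whose content hash already matches. Prints each copied name.
# The body runs in a subshell so its variables stay local.
copy_changed() (
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*.md "$src"/*.json; do
        [ -f "$f" ] || continue      # an unmatched glob stays literal; skip it
        name=$(basename "$f")
        if [ "$(file_hash "$f")" != "$(file_hash "$dest/$name")" ]; then
            cp "$f" "$dest/$name"
            echo "$name"
        fi
    done
)
```

Comparing hashes rather than modification times means a file restored from backup or touched by an editor without changes is not copied again.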
[Git Remote] [Sync Repo] [Agent Workspace]
| | |
|-- git pull (ff-only) --->| |
| |-- copy files ------------------>|
| | (chmod 600) |
| | |-- rebuild index
- `git pull --ff-only` -- fails loudly if diverged (protects local changes)
- Copy files to agent workspaces with 600 permissions
- Rebuild vector indexes for agents that received changes
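A rough sketch of the restore side of a pull, under stated assumptions: `restore_files` and `changed_agents` are hypothetical names, the workspace path in the wiring comment is illustrative, and the index rebuild is left as a placeholder because the command depends on your agent framework.

```shell
# Copy memory files from an agent directory in the sync repo ($1) into the
# agent workspace ($2), readable and writable by the owner only.
restore_files() (
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*.md "$src"/*.json; do
        [ -f "$f" ] || continue
        cp "$f" "$dest/"
        chmod 600 "$dest/$(basename "$f")"
    done
)

# Read changed paths on stdin (for example from `git diff --name-only OLD NEW`)
# and print each top-level agent directory once. Root-level files are ignored.
changed_agents() {
    grep '/' | cut -d / -f 1 | sort -u
}

# Hypothetical wiring, run inside the sync repo clone:
#   old=$(git rev-parse HEAD)
#   git pull --ff-only || exit 1     # stop if histories have diverged
#   for agent in $(git diff --name-only "$old" HEAD | changed_agents); do
#       restore_files "$agent" "$HOME/$agent/memory"
#       # rebuild that agent's index here with your framework's command
#   done
```

Deriving the agent list from the diff keeps index rebuilds limited to agents whose files actually changed.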
Bidirectional drift check:
- Files in workspace but not in sync repo (new local knowledge, needs push)
- Files in sync repo but not in workspace (missing restore, needs pull)
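The two-way check above can be approximated with `comm` on sorted file listings. This is a sketch only: `list_memory_files` and `drift_check` are hypothetical names, and it compares file presence, not content.

```shell
# List .md/.json file names in a directory, sorted, one per line.
list_memory_files() (
    cd "$1" 2>/dev/null || exit 0
    for f in *.md *.json; do
        [ -f "$f" ] && echo "$f"
    done | sort
)

# Compare a workspace ($1) with its directory in the sync repo ($2).
drift_check() (
    ws_list=$(mktemp)
    repo_list=$(mktemp)
    list_memory_files "$1" > "$ws_list"
    list_memory_files "$2" > "$repo_list"
    # comm -23: lines only in the first file; comm -13: lines only in the second
    comm -23 "$ws_list" "$repo_list" | sed 's/^/needs push: /'
    comm -13 "$ws_list" "$repo_list" | sed 's/^/needs pull: /'
    rm -f "$ws_list" "$repo_list"
)
```

Calling `drift_check ~/agent-a/memory ~/memory-sync/agent-a` would print one line per file that exists on only one side.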
| Synced (git-tracked) | Not Synced (gitignored) |
|---|---|
| `*.md` -- Memory documents | `*.sqlite` -- Vector indexes |
| `*.json` -- Structured data | `vectordb/` -- Embedding stores |
| `sync.sh` -- The sync script | `*.jsonl` -- Session transcripts |
| `.gitignore` | `*.key`, `*.pem`, `*.env` -- Secrets |
Memory is plain text -- human-readable, diffable, version-controlled. Binary indexes are derived artifacts, rebuilt locally, never committed.
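One way to express the "not synced" column as a `.gitignore` is sketched below. The helper name `write_gitignore` is hypothetical; the patterns are taken from the table, and your setup may need more.

```shell
# Write a .gitignore into the sync repo ($1) that keeps derived and
# sensitive files out of git.
write_gitignore() {
    cat > "$1/.gitignore" <<'EOF'
# Derived artifacts: rebuilt locally, never committed
*.sqlite
vectordb/
# Session transcripts
*.jsonl
# Secrets
*.key
*.pem
*.env
EOF
}
```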
Each agent gets its own directory. Security knowledge doesn't pollute architecture patterns. This maps naturally to how multi-agent systems partition responsibilities.
sync-repo/
agent-alpha/
MEMORY.md
patterns.md
daily-log.json
agent-beta/
MEMORY.md
domain-notes.md
agent-gamma/
MEMORY.md
Memory files can inadvertently capture API keys from agent conversations. The sync script scans for known patterns before every commit and aborts if any are found.
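A pre-commit scan along these lines might look like the sketch below. The patterns are illustrative assumptions (AWS access key IDs, GitHub personal access tokens, `sk-` style API keys, PEM private key headers), not the list the sync script actually ships with, and `scan_secrets` is a hypothetical name.

```shell
# Illustrative patterns only; extend them for the providers you use.
SECRET_PATTERNS='AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|sk-[A-Za-z0-9_-]{20,}|-----BEGIN [A-Z ]*PRIVATE KEY-----'

# Scan every text file under a directory ($1). Returns non-zero and lists the
# offending files on stderr if any pattern matches.
scan_secrets() (
    # -r recurse, -l list matching files, -I skip binaries, -E extended regex
    hits=$(grep -rlIE -e "$SECRET_PATTERNS" "$1" 2>/dev/null || true)
    if [ -n "$hits" ]; then
        printf 'Possible secrets found; aborting:\n%s\n' "$hits" >&2
        false
    fi
)

# Hypothetical use before committing:
#   scan_secrets "$SYNC_REPO" || exit 1
#   git add -A && git commit -m "memory sync"
```

Pattern matching only catches known key formats, so it complements a private repository rather than replacing one.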
Pulls use `--ff-only`, so git refuses to pull when local and remote histories have diverged instead of silently creating a merge. Content is compared by hash, not timestamp, so changes are detected reliably even when file modification times differ across machines. Drift checks run in both directions to catch files that exist on one side but not the other.
A complete sync script with push, pull, and status: sync.sh
Customize AGENTS and AGENTS_DIR for your setup, then:
./sync.sh push # Copy local memory to git and push
./sync.sh pull # Pull from git and restore to workspaces
./sync.sh status # Show per-agent drift

The pattern starts simple and grows with your needs.
# Cron: push every 6 hours
0 */6 * * * cd ~/memory-sync && bash sync.sh push

For agents running on multiple machines simultaneously:
| Strategy | Tradeoff |
|---|---|
| Branch-per-machine | Clean history, periodic merge needed |
| Append-only logs | No conflicts (datestamped files), larger repo |
| Last-writer-wins | Simplest, works if sessions don't overlap |
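As an illustration of the append-only strategy, each machine can write to its own datestamped file so that two machines never edit the same path. The naming scheme and the `append_entry` helper below are assumptions made for this sketch.

```shell
# Append one entry to a per-day, per-machine log file, for example
# log-2026-01-31-hostA.md, and print the file path.
# $1 = agent memory directory, $2 = entry text, $3 = machine name (optional)
append_entry() (
    dir=$1
    text=$2
    machine=${3:-$(uname -n)}
    day=$(date -u +%Y-%m-%d)
    mkdir -p "$dir"
    file="$dir/log-$day-$machine.md"
    printf '%s\n' "- $text" >> "$file"
    echo "$file"
)
```

Because no two machines share a file name, concurrent pushes do not touch the same file; the cost is a growing number of small files.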
For sensitive memory content:
# Encrypt before push
gpg --symmetric --cipher-algo AES256 -o "$dest.gpg" "$file"
# Decrypt after pull
gpg --decrypt -o "$dest" "$file.gpg"

The pattern extends to cloud without losing its core properties. Git stays the source of truth; cloud services are consumers of it.
+---------------------+
| Cloud Services |
| (optional layer) |
| |
| - Object storage |
| - Managed vector DB|
| - Webhook triggers |
+---------------------+
^
| deploy (CI/CD)
|
+---------------------+
| Git Repository | <-- still the source of truth
| (private) |
+---------------------+
| ^
pull | | push
v |
+---------------------+
| Local Workspaces |
+---------------------+
Cloud extension patterns:
- CI/CD index rebuild -- Push triggers a pipeline that rebuilds vector indexes and deploys them to a managed database. Any machine queries the cloud index instead of rebuilding locally.
- Object storage mirror -- Mirror the repo for serverless agents that need read-only access without git.
- Webhook notifications -- Push triggers a webhook that tells running agents to refresh their context, eliminating sync lag.
- Shared memory API -- Wrap the repo in a lightweight HTTP endpoint. Agents query memories without needing local file access.
| Scenario | Recommendation |
|---|---|
| Single machine, personal use | Git only |
| 2-3 machines, same person | Git pull on each machine |
| Team of agents across infra | Cloud hybrid |
| Serverless / ephemeral agents | Cloud hybrid |
| Air-gapped / offline | Git bundles for transfer |
Cloud adds convenience and scale. It doesn't improve privacy, auditability, or cost -- it trades some of each for that convenience. Start local, add cloud when you outgrow it.
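For the air-gapped row in the table, `git bundle` packs a repository into a single file that can travel on removable media. The wrapper names below are hypothetical; the underlying `git bundle create` and `git clone` commands are standard git.

```shell
# Pack every ref of the sync repo ($1) into a single file ($2).
# Use an absolute path for $2, since git resolves it relative to the repo.
export_bundle() {
    git -C "$1" bundle create "$2" --all
}

# Recreate a working clone ($2) from a bundle file ($1).
import_bundle() {
    git clone "$1" "$2"
}

# Hypothetical use:
#   export_bundle ~/memory-sync /media/usb/memory.bundle
#   import_bundle /media/usb/memory.bundle ~/memory-sync   # on the offline machine
```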
OpenClaw is an open-source AI agent framework that natively manages per-agent memory workspaces, vector indexing, and multi-agent orchestration. It provides the local runtime that musecl-memory complements.
musecl-memory OpenClaw
=============== ========
Portable persistence (git) + Local agent runtime
Push/pull memory files + Auto-capture/auto-recall
Secret sanitization + Per-agent workspace isolation
Cross-machine sync + Vector indexing and search
OpenClaw manages the local memory lifecycle -- auto-capturing knowledge from conversations and auto-recalling relevant context each turn. musecl-memory adds portability: syncing those files to git so they survive machine changes, disk failures, or infrastructure migrations.
# OpenClaw agents generate memory files during sessions
# (auto-capture writes to ~/agent-workspace/memory/)
# Sync to git
./sync.sh push
# On a new machine, restore everything
./sync.sh pull
openclaw memory index --force --agent alpha
# Agent immediately has all accumulated knowledge

| Without OpenClaw | With OpenClaw |
|---|---|
| Manual memory file management | Auto-capture from conversations |
| Manual index rebuilds | Built-in openclaw memory index |
| Single agent per workspace | Multi-agent with isolated workspaces |
| File-based recall only | Semantic vector search across memories |
| Push/pull only | Auto-recall injects context per turn |
The pattern works with any agent framework. OpenClaw removes the manual steps.
Why git and not object storage or file sync services?
Git gives you versioning, diffing, branching, and collaboration for free. You can see exactly what your agent learned and when. Try running `git log --oneline` on a folder managed by a file sync service.
Why not a cloud memory API?
Cloud APIs are great for managed infrastructure. This pattern is for people who want to own their agent's knowledge, keep it private, and pay nothing for storage.
Won't the repo get huge?
Memory files are text. A prolific agent generates 1-5 MB per month. Git handles this effortlessly. Binary indexes (which can be large) are gitignored.
What about vector embeddings?
Embeddings are derived from source text. Rebuild locally after pull, just like npm install after clone. This keeps the repo small and avoids embedding model version conflicts.
Is it safe to sync agent memories via git?
The sync script scans for known secret patterns (API keys, tokens) before every commit and aborts if any are found. Memory files are set to 600 permissions (owner-only). Use a private repository. For highly sensitive content, add GPG encryption before push (see Level 3: Encrypted Sync). Never use a public repo for agent memories -- they can contain learned context from private conversations.
Can multiple agents share memories?
Each agent has its own directory by design. To share knowledge, an orchestrator reads from multiple directories. This prevents knowledge contamination while allowing explicit sharing.
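A minimal sketch of explicit sharing, assuming the orchestrator only needs each agent's `MEMORY.md`. The `gather_memories` name and the header format are hypothetical.

```shell
# Print the MEMORY.md of each named agent under the sync repo ($1), with a
# header per agent so the reader can tell where each passage came from.
gather_memories() (
    repo=$1
    shift
    for agent in "$@"; do
        f="$repo/$agent/MEMORY.md"
        [ -f "$f" ] || continue
        printf '## from %s\n' "$agent"
        cat "$f"
    done
)

# Hypothetical use: build shared context from two of three agents
#   gather_memories ~/memory-sync agent-alpha agent-beta > shared-context.md
```

Agents are named explicitly on the command line, so nothing is shared unless the orchestrator asks for it.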
| Trigger | Action | Why |
|---|---|---|
| After a long agent session | Push | Capture knowledge before it's lost |
| Starting on a new machine | Pull | Restore accumulated intelligence |
| Before system maintenance | Push | Backup before disk operations |
| After adding a new agent | Push | Include seed memories |
| Periodically (cron) | Push | Catch forgotten manual syncs |
Copyright 2026 -m. All rights reserved.
This work is licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).
You are free to share and adapt this material for non-commercial purposes, as long as you give appropriate credit and distribute any derivatives under the same license.
Commercial use -- including use in paid products, SaaS offerings, corporate tooling, or commercial agent platforms -- requires written permission. Contact for licensing inquiries.
-m