musecl/musecl-memory

musecl-memory

Portable persistence for AI agent memory, using git.

Your agent learned 500 things last month. Then the disk died. Now it knows nothing.

AI agents accumulate knowledge -- corrections, preferences, patterns, domain expertise. But that knowledge is trapped on one machine. If the disk fills, the VM recycles, or you move to new hardware, all accumulated intelligence is gone.

musecl-memory fixes this by applying a pattern every developer already knows:

package.json syncs. node_modules rebuilds locally.

Memory source files (Markdown, JSON) sync via git. Vector indexes rebuild locally from those files. The repo stays small, portable, and free of binary bloat.

Traditional Software                    AI Agent Memory
======================                  ======================
package.json         <--->              MEMORY.md (source)
package-lock.json    <--->              patterns.json (source)
node_modules/        <--->              vectordb/, *.sqlite (derived)
npm install          <--->              agent memory index --force
git push/pull        <--->              sync.sh push/pull

Why This Pattern?

Current approaches have a portability gap:

Approach                          Portable?   Private?   Free?   Auditable?
==============================    =========   ========   =====   ==========
Cloud memory APIs                 Yes         No         No      No
Local-only memory systems         No          Yes        Yes     Partially
Protocol-based memory services    No          Yes        Yes     Yes
musecl-memory                     Yes         Yes        Yes     Yes

Architecture

Single Machine

  Git Repository (private)        Local Clone            Agent Workspaces
 +---------------------+    +---------------------+    +---------------------+
 |                     |    |                     |    |                     |
 |  agent-a/           |    |  ~/memory-sync/     |    |  ~/agent-a/memory/  |
 |    MEMORY.md        |<-->|    (git push/pull)  |<-->|  ~/agent-b/memory/  |
 |  agent-b/           |    |                     |    |                     |
 |    MEMORY.md        |    +---------------------+    +---------------------+
 |  .gitignore         |                                        |
 +---------------------+                                   index (rebuild)
                                                                v
                                                       +---------------------+
                                                       |  Vector Indexes     |
                                                       |  (rebuilt locally,  |
                                                       |   NOT synced)       |
                                                       +---------------------+

Multiple Machines

+------------------+          +------------------+          +------------------+
|                  |  push    |                  |   pull   |                  |
|  Agent Memory    |--------->|  Git Repository  |--------->|  Agent Memory    |
|  (Machine A)     |          |  (git remote)    |          |  (Machine B)     |
|                  |<---------|                  |<---------|                  |
|  .md .json files |   pull   |  .md .json files |   push   |  .md .json files |
+------------------+          +------------------+          +------------------+
        |                                                           |
   index (local)                                               index (local)
        v                                                           v
+------------------+                                        +------------------+
|  Vector Index    |                                        |  Vector Index    |
|  (NOT synced)    |                                        |  (NOT synced)    |
+------------------+                                        +------------------+

Each machine rebuilds its own indexes from the shared source files. No binary conflicts, no embedding model version mismatches.


How It Works

Push (local -> git)

  [Agent Workspace]                    [Sync Repo]               [Git Remote]
        |                                  |                        |
        |-- content hash compare -------->|                        |
        |   (skip unchanged)              |                        |
        |-- copy changed files ---------->|                        |
        |                                 |-- sanitize (secrets)   |
        |                                 |-- git add + commit     |
        |                                 |-- git push ---------->|

  1. Compare workspace files against sync repo by content hash (not timestamp)
  2. Copy only changed .md and .json files
  3. Scan for secrets -- abort if API keys or tokens are found
  4. Commit and push
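
Steps 1 and 2 above can be sketched as follows. This is a minimal illustration, not the code in sync.sh: the helper names file_hash and copy_changed are hypothetical, and it assumes sha256sum (GNU coreutils) is on PATH.

```shell
#!/usr/bin/env bash
# Sketch of push steps 1-2: copy only .md/.json files whose content differs.
# file_hash and copy_changed are hypothetical names, not taken from sync.sh.

# file_hash FILE -- print the SHA-256 of FILE, or nothing if it is missing.
file_hash() {
  [ -f "$1" ] && sha256sum "$1" | cut -d' ' -f1
}

# copy_changed SRC_DIR DEST_DIR -- copy changed files, print each copied path.
copy_changed() {
  local src="$1" dest="$2" rel
  ( cd "$src" && find . -type f \( -name '*.md' -o -name '*.json' \) ) |
  while IFS= read -r rel; do
    rel="${rel#./}"
    # A missing destination file hashes to "", so new files count as changed.
    if [ "$(file_hash "$src/$rel")" != "$(file_hash "$dest/$rel")" ]; then
      mkdir -p "$(dirname "$dest/$rel")"
      cp "$src/$rel" "$dest/$rel"
      printf '%s\n' "$rel"
    fi
  done
}
```

Comparing hashes rather than modification times means a file that was touched but not edited is skipped, and a file restored on another machine with a fresh timestamp is not copied again.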

Pull (git -> local)

  [Git Remote]               [Sync Repo]                    [Agent Workspace]
     |                           |                                  |
     |-- git pull (ff-only) --->|                                  |
     |                          |-- copy files ------------------>|
     |                          |   (chmod 600)                   |
     |                          |                                  |-- rebuild index

  1. git pull --ff-only -- fails loudly if diverged (protects local changes)
  2. Copy files to agent workspaces with 600 permissions
  3. Rebuild vector indexes for agents that received changes
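
A rough sketch of pull steps 1 and 2, under the assumption that the sync repo is an ordinary git clone. The function names pull_repo and restore_files are hypothetical; step 3 is left out because the index command depends on the agent framework.

```shell
#!/usr/bin/env bash
# Sketch of pull steps 1-2. Function names are hypothetical, not from sync.sh.

# pull_repo REPO_DIR -- fast-forward only; exits non-zero if histories diverged.
pull_repo() {
  git -C "$1" pull --ff-only
}

# restore_files REPO_AGENT_DIR WORKSPACE_DIR -- copy .md/.json with mode 600.
restore_files() {
  local src="$1" dest="$2" rel
  ( cd "$src" && find . -type f \( -name '*.md' -o -name '*.json' \) ) |
  while IFS= read -r rel; do
    rel="${rel#./}"
    mkdir -p "$(dirname "$dest/$rel")"
    # install copies the file and sets owner-only permissions in one step.
    install -m 600 "$src/$rel" "$dest/$rel"
  done
}
```

A typical call would be `pull_repo ~/memory-sync && restore_files ~/memory-sync/agent-a ~/agent-a/memory`, using the paths from the architecture diagram.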

Status

Bidirectional drift check:

  • Files in workspace but not in sync repo (new local knowledge, needs push)
  • Files in sync repo but not in workspace (missing restore, needs pull)
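
The two-way check above could look roughly like this. It is a sketch with hypothetical helper names (list_files, drift) and it only detects files present on one side, as described; it does not compare content.

```shell
#!/usr/bin/env bash
# Sketch of a bidirectional presence check. Names are hypothetical.

# list_files DIR -- sorted relative paths of .md/.json files under DIR.
list_files() {
  ( cd "$1" && find . -type f \( -name '*.md' -o -name '*.json' \) |
      sed 's|^\./||' | LC_ALL=C sort )
}

# drift WORKSPACE_DIR REPO_DIR -- one line per file that exists on one side only.
drift() {
  local ws_list repo_list
  ws_list="$(mktemp)"
  repo_list="$(mktemp)"
  list_files "$1" > "$ws_list"
  list_files "$2" > "$repo_list"
  # comm -23: lines only in the first file; comm -13: lines only in the second.
  LC_ALL=C comm -23 "$ws_list" "$repo_list" | sed 's/^/needs push: /'
  LC_ALL=C comm -13 "$ws_list" "$repo_list" | sed 's/^/needs pull: /'
  rm -f "$ws_list" "$repo_list"
}
```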

What Gets Synced

Synced (git-tracked)           Not Synced (gitignored)
==========================     ==============================
*.md -- Memory documents       *.sqlite -- Vector indexes
*.json -- Structured data      vectordb/ -- Embedding stores
sync.sh -- The sync script     *.jsonl -- Session transcripts
.gitignore                     *.key, *.pem, *.env -- Secrets
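
One way to set up the "Not Synced" column is a .gitignore written once when the sync repo is created. The function name write_gitignore is hypothetical; the patterns are the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch: write the ignore rules from the table into REPO_DIR/.gitignore.
# write_gitignore is a hypothetical helper name.

write_gitignore() {
  cat > "$1/.gitignore" <<'EOF'
# Derived artifacts: rebuilt locally, never committed
*.sqlite
vectordb/
# Session transcripts
*.jsonl
# Secrets
*.key
*.pem
*.env
EOF
}
```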

Design Principles

Source files are the truth

Memory is plain text -- human-readable, diffable, version-controlled. Binary indexes are derived artifacts, rebuilt locally, never committed.

Per-agent isolation

Each agent gets its own directory. Security knowledge doesn't pollute architecture patterns. This maps naturally to how multi-agent systems partition responsibilities.

sync-repo/
  agent-alpha/
    MEMORY.md
    patterns.md
    daily-log.json
  agent-beta/
    MEMORY.md
    domain-notes.md
  agent-gamma/
    MEMORY.md

Secret sanitization

Memory files can inadvertently capture API keys from agent conversations. The sync script scans for known patterns before every commit and aborts if any are found.
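
A pre-commit scan of this kind might be sketched as below. The pattern list is an assumption for illustration (AWS access key IDs, GitHub personal access tokens, "sk-" style API keys, PEM private key headers) and is not the list used by sync.sh; it also assumes a grep that supports -r and --include, as GNU and BSD grep do.

```shell
#!/usr/bin/env bash
# Sketch of a secret scan. SECRET_PATTERNS and scan_secrets are hypothetical;
# extend the pattern list for the providers you actually use.

SECRET_PATTERNS='AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|sk-[A-Za-z0-9_-]{20,}|-----BEGIN [A-Z ]*PRIVATE KEY-----'

# scan_secrets DIR -- return 1 and list offending files if any pattern matches.
scan_secrets() {
  local hits
  hits="$(grep -rlE --include='*.md' --include='*.json' \
            -e "$SECRET_PATTERNS" "$1" || true)"
  if [ -n "$hits" ]; then
    printf 'possible secret in:\n%s\n' "$hits" >&2
    return 1
  fi
  return 0
}
```

Calling `scan_secrets "$SYNC_REPO" || exit 1` before `git commit` gives the abort behavior described above. Pattern matching only catches known formats, so it reduces the risk of a leak without eliminating it.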

Fail-safe operations

Pulls use --ff-only, so a diverged history fails loudly instead of being merged silently. Content is compared by hash, not timestamp, so changes are detected reliably across machines. Drift checks run in both directions to catch files that exist on one side but not the other.


Reference Implementation

A complete sync script with push, pull, and status: sync.sh

Customize AGENTS and AGENTS_DIR for your setup, then:

./sync.sh push      # Copy local memory to git and push
./sync.sh pull      # Pull from git and restore to workspaces
./sync.sh status    # Show per-agent drift

Scaling Up

The pattern starts simple and grows with your needs.

Level 1: Automated Sync

# Cron: push every 6 hours
0 */6 * * * cd ~/memory-sync && bash sync.sh push

Level 2: Multi-Machine

For agents running on multiple machines simultaneously:

Strategy              Tradeoff
==================    =============================================
Branch-per-machine    Clean history, periodic merge needed
Append-only logs      No conflicts (datestamped files), larger repo
Last-writer-wins      Simplest, works if sessions don't overlap
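
As an illustration of the append-only strategy, each machine can write to its own datestamped file so that no two machines ever edit the same path. The naming scheme and the helpers log_path and append_entry are assumptions, not part of sync.sh.

```shell
#!/usr/bin/env bash
# Sketch of per-machine, per-day log files. Names and layout are hypothetical.

# log_path AGENT -- e.g. agent-alpha/log-2026-01-31-myhost.md (UTC date).
log_path() {
  printf '%s/log-%s-%s.md\n' "$1" "$(date -u +%Y-%m-%d)" "$(uname -n)"
}

# append_entry REPO_DIR AGENT TEXT -- append one line to today's log.
append_entry() {
  local path
  path="$1/$(log_path "$2")"
  mkdir -p "$(dirname "$path")"
  printf '%s\n' "$3" >> "$path"
}
```

The repo grows by one file per machine per day, which is the "larger repo" side of the tradeoff.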

Level 3: Encrypted Sync

For sensitive memory content:

# Encrypt before push
gpg --symmetric --cipher-algo AES256 -o "$dest.gpg" "$file"
# Decrypt after pull
gpg --decrypt -o "$dest" "$file.gpg"

Level 4: Cloud Hybrid

The pattern extends to cloud without losing its core properties. Git stays the source of truth; cloud services are consumers of it.

                    +---------------------+
                    |  Cloud Services     |
                    |  (optional layer)   |
                    |                     |
                    |  - Object storage   |
                    |  - Managed vector DB|
                    |  - Webhook triggers |
                    +---------------------+
                              ^
                              | deploy (CI/CD)
                              |
                    +---------------------+
                    |  Git Repository     |  <-- still the source of truth
                    |  (private)          |
                    +---------------------+
                          |       ^
                     pull |       | push
                          v       |
                    +---------------------+
                    |  Local Workspaces   |
                    +---------------------+

Cloud extension patterns:

  • CI/CD index rebuild -- Push triggers a pipeline that rebuilds vector indexes and deploys them to a managed database. Any machine queries the cloud index instead of rebuilding locally.
  • Object storage mirror -- Mirror the repo for serverless agents that need read-only access without git.
  • Webhook notifications -- Push triggers a webhook that tells running agents to refresh their context, eliminating sync lag.
  • Shared memory API -- Wrap the repo in a lightweight HTTP endpoint. Agents query memories without needing local file access.

When to Use What

Scenario                         Recommendation
=============================    ========================
Single machine, personal use     Git only
2-3 machines, same person        Git pull on each machine
Team of agents across infra      Cloud hybrid
Serverless / ephemeral agents    Cloud hybrid
Air-gapped / offline             Git bundles for transfer
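
For the air-gapped row, git bundle packs a repository into a single file that can be carried on removable media. A minimal sketch, with hypothetical wrapper names export_bundle and import_bundle:

```shell
#!/usr/bin/env bash
# Sketch of offline transfer with git bundle. Wrapper names are hypothetical.

# export_bundle REPO_DIR BUNDLE_FILE -- pack all refs into one file.
# BUNDLE_FILE should be an absolute path, since git runs inside REPO_DIR.
export_bundle() {
  git -C "$1" bundle create "$2" --all
}

# import_bundle BUNDLE_FILE DEST_DIR -- create a fresh clone from the bundle.
import_bundle() {
  git clone --quiet "$1" "$2"
}
```

On the offline machine, `import_bundle /media/usb/memory.bundle ~/memory-sync` (paths are examples) yields a normal clone that the pull step can restore from.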

Cloud adds convenience and scale. It doesn't improve privacy, auditability, or cost -- it trades those. Start local, add cloud when you outgrow it.


Enhanced Experience with OpenClaw

OpenClaw is an open-source AI agent framework that natively manages per-agent memory workspaces, vector indexing, and multi-agent orchestration. It provides the local runtime that musecl-memory complements.

musecl-memory                        OpenClaw
===============                        ========
Portable persistence (git)     +       Local agent runtime
Push/pull memory files         +       Auto-capture/auto-recall
Secret sanitization            +       Per-agent workspace isolation
Cross-machine sync             +       Vector indexing and search

OpenClaw manages the local memory lifecycle -- auto-capturing knowledge from conversations and auto-recalling relevant context each turn. musecl-memory adds portability: syncing those files to git so they survive machine changes, disk failures, or infrastructure migrations.

# OpenClaw agents generate memory files during sessions
# (auto-capture writes to ~/agent-workspace/memory/)

# Sync to git
sync.sh push

# On a new machine, restore everything
sync.sh pull
openclaw memory index --force --agent alpha

# Agent immediately has all accumulated knowledge

Without OpenClaw                 With OpenClaw
=============================    ======================================
Manual memory file management    Auto-capture from conversations
Manual index rebuilds            Built-in openclaw memory index
Single agent per workspace       Multi-agent with isolated workspaces
File-based recall only           Semantic vector search across memories
Push/pull only                   Auto-recall injects context per turn

The pattern works with any agent framework. OpenClaw removes the manual steps.


FAQ

Why git and not object storage or file sync services? Git gives you versioning, diffing, branching, and collaboration for free. You can see exactly what your agent learned and when. Try doing git log --oneline on a synced folder.

Why not a cloud memory API? Cloud APIs are great for managed infrastructure. This pattern is for people who want to own their agent's knowledge, keep it private, and pay nothing for storage.

Won't the repo get huge? Memory files are text. A prolific agent generates 1-5 MB per month. Git handles this effortlessly. Binary indexes (which can be large) are gitignored.

What about vector embeddings? Embeddings are derived from source text. Rebuild locally after pull, just like npm install after clone. This keeps the repo small and avoids embedding model version conflicts.

Is it safe to sync agent memories via git? The sync script scans for known secret patterns (API keys, tokens) before every commit and aborts if any are found. Memory files are set to 600 permissions (owner-only). Use a private repository. For highly sensitive content, add GPG encryption before push (see Level 3: Encrypted Sync). Never use a public repo for agent memories -- they can contain learned context from private conversations.

Can multiple agents share memories? Each agent has its own directory by design. To share knowledge, an orchestrator reads from multiple directories. This prevents knowledge contamination while allowing explicit sharing.
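
Explicit sharing of this kind can be as simple as an orchestrator concatenating the files it chooses to read. The sketch below assumes the sync-repo layout shown under "Per-agent isolation"; gather_memories is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: read MEMORY.md from several agent directories without merging them
# on disk. gather_memories is a hypothetical name.

# gather_memories REPO_DIR AGENT... -- print each agent's MEMORY.md under a header.
gather_memories() {
  local repo="$1" agent
  shift
  for agent in "$@"; do
    # Agents without a MEMORY.md are skipped silently.
    if [ -f "$repo/$agent/MEMORY.md" ]; then
      printf '## %s\n' "$agent"
      cat "$repo/$agent/MEMORY.md"
    fi
  done
}
```

Because the function only reads, each agent's directory stays untouched and the choice of which agents to combine is made per call.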


When to Sync

Trigger                       Action    Why
==========================    ======    ==================================
After a long agent session    Push      Capture knowledge before it's lost
Starting on a new machine     Pull      Restore accumulated intelligence
Before system maintenance     Push      Backup before disk operations
After adding a new agent      Push      Include seed memories
Periodically (cron)           Push      Catch forgotten manual syncs

License

Copyright 2026 -m. All rights reserved.

This work is licensed under CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).

You are free to share and adapt this material for non-commercial purposes, as long as you give appropriate credit and distribute any derivatives under the same license.

Commercial use -- including use in paid products, SaaS offerings, corporate tooling, or commercial agent platforms -- requires written permission. Contact for licensing inquiries.


-m
