Bug: mine --mode convos silently skips claude.ai exports due to sender/role field mismatch and 10MB file size limit
Environment
- MemPalace version: 3.1
- Python: 3.13
- Platform: Home Assistant add-on (Linux/Docker)
- Export source: claude.ai privacy export (Settings → Export Data)
Bug 1: sender vs role field mismatch in normalize.py
_try_claude_ai_json checks item.get("role") to identify message authors, but the claude.ai privacy export format uses "sender" instead of "role".
Actual claude.ai export structure:
{
  "uuid": "...",
  "sender": "human",
  "content": [...],
  "text": "..."
}
_try_claude_ai_json looks for:
role = item.get("role", "")
if role in ("user", "human") and text:
Since the export has no "role" key, role always falls back to the empty-string default, so the check never matches. Zero messages are extracted and zero drawers are filed, with no warning.
Suggested fix (normalize.py):
role = item.get("role", item.get("sender", ""))
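A slightly more defensive variant of the same idea is sketched below. The helper name message_author is hypothetical and not part of MemPalace; it only assumes that each message is a dict carrying either a "role" or a "sender" key, as described above.

```python
def message_author(item: dict) -> str:
    """Return the author label of one exported message, or "" if none is present.

    Hypothetical helper for illustration; not an existing MemPalace function.
    """
    # Prefer "role", then fall back to "sender" (the key used by the claude.ai export).
    # Using `or` also covers a "role" key that is present but None or empty.
    author = item.get("role") or item.get("sender") or ""
    # Normalise case and whitespace so a check like `in ("user", "human")` still matches.
    return str(author).strip().lower()
```

With this, the existing check could become `if message_author(item) in ("user", "human") and text:` without changing behaviour for exports that already use "role".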
Bug 2: 10MB file size limit silently skips large exports
convo_miner.py defines:
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10 MB
A standard claude.ai export with ~130 conversations produces a conversations.json of ~38MB — nearly 4x the limit. The file is silently skipped in scan_convos with no warning or error message, making it very difficult to diagnose.
Workaround: Manually split conversations.json into batches of ~10 conversations before mining:
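import json

# Load the full export (a JSON array of conversations).
with open('conversations.json') as f:
    data = json.load(f)

# Write batches of 10 conversations to convos_000.json, convos_010.json, ...
batch_size = 10
for i in range(0, len(data), batch_size):
    batch = data[i:i+batch_size]
    with open(f'convos_{i:03d}.json', 'w') as out:
        json.dump(batch, out)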
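A fixed batch of 10 conversations can still exceed the limit if individual conversations are long. As a rough sketch, the split can instead be driven by serialized size. The function name split_by_size and the 9 MB budget in the usage note are assumptions chosen for illustration.

```python
import json


def split_by_size(conversations: list, max_bytes: int) -> list:
    """Group conversations into batches whose JSON size stays within max_bytes.

    A single conversation larger than max_bytes still ends up alone in its own
    (oversized) batch; this sketch does not split inside a conversation.
    """
    batches, current = [], []
    current_size = 2  # the enclosing "[" and "]"
    for convo in conversations:
        # Serialized size of this conversation plus the ", " list separator.
        size = len(json.dumps(convo).encode("utf-8")) + 2
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 2
        current.append(convo)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Calling `split_by_size(data, 9 * 1024 * 1024)` and writing each returned batch with `json.dump` keeps every output file under the 10 MB limit, provided no single conversation is larger than the budget.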
Suggested fixes:
- Print a warning when a file is skipped due to size, e.g.:
⚠ Skipping large file (38MB > 10MB limit): conversations.json
- Raise the default limit — claude.ai exports are commonly 20–50MB
- Document the limit clearly in the README
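The first suggestion could look roughly like the sketch below. The helper name size_warning is hypothetical; only MAX_FILE_SIZE comes from convo_miner.py, and how scan_convos would call such a helper is an assumption.

```python
import os
from typing import Optional

MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB, the value defined in convo_miner.py


def size_warning(path: str, max_size: int = MAX_FILE_SIZE) -> Optional[str]:
    """Return a warning message if the file exceeds max_size, else None.

    Hypothetical helper for illustration; not an existing MemPalace function.
    """
    size = os.path.getsize(path)
    if size <= max_size:
        return None
    mb = 1024 * 1024
    return (
        f"⚠ Skipping large file ({size / mb:.0f}MB > {max_size / mb:.0f}MB limit): "
        f"{os.path.basename(path)}"
    )
```

A scan loop could then print the message and skip the file whenever the result is not None, so the skip is visible instead of silent.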
Steps to Reproduce
- Export data from claude.ai (Settings → Account → Export Data)
- Place export files in a directory
- Run
mempalace mine /path/to/export/ --mode convos --wing claude-ai
- Observe: 0 conversation drawers filed despite valid export files present
Expected Behaviour
All conversations in conversations.json are normalised and filed into the palace.
Actual Behaviour
- Bug 1: conversations.json is parsed but produces 0 messages due to the sender/role mismatch, so 0 drawers are filed
- Bug 2: conversations.json (38MB) is silently skipped before parsing due to the 10MB size limit, with no warning shown