Skip to content

mine --mode convos silently skips claude.ai exports due to sender/role field mismatch and 10MB file size limit #602

@carlito1979

Description

@carlito1979

Bug: mine --mode convos silently skips claude.ai exports due to sender/role field mismatch and 10MB file size limit

Environment

  • MemPalace version: 3.1
  • Python: 3.13
  • Platform: Home Assistant add-on (Linux/Docker)
  • Export source: claude.ai privacy export (Settings → Export Data)

Bug 1: sender vs role field mismatch in normalize.py

_try_claude_ai_json checks item.get("role") to identify message authors, but the claude.ai privacy export format uses "sender" instead of "role".

Actual claude.ai export structure:

{
  "uuid": "...",
  "sender": "human",
  "content": [...],
  "text": "..."
}

_try_claude_ai_json looks for:

role = item.get("role", "")
if role in ("user", "human") and text:

Since role is always None, zero messages are extracted and zero drawers are filed — silently.

Suggested fix (normalize.py):

role = item.get("role", item.get("sender", ""))

Bug 2: 10MB file size limit silently skips large exports

convo_miner.py defines:

MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB

A standard claude.ai export with ~130 conversations produces a conversations.json of ~38MB — nearly 4x the limit. The file is silently skipped in scan_convos with no warning or error message, making it very difficult to diagnose.

Workaround: Manually split conversations.json into batches of ~10 conversations before mining:

import json, os
with open('conversations.json') as f:
    data = json.load(f)
batch_size = 10
for i in range(0, len(data), batch_size):
    batch = data[i:i+batch_size]
    with open(f'convos_{i:03d}.json', 'w') as f:
        json.dump(batch, f)

Suggested fixes:

  • Print a warning when a file is skipped due to size, e.g.:
    ⚠ Skipping large file (38MB > 10MB limit): conversations.json
  • Raise the default limit — claude.ai exports are commonly 20–50MB
  • Document the limit clearly in the README

Steps to Reproduce

  1. Export data from claude.ai (Settings → Account → Export Data)
  2. Place export files in a directory
  3. Run mempalace mine /path/to/export/ --mode convos --wing claude-ai
  4. Observe: 0 conversation drawers filed despite valid export files present

Expected Behaviour

All conversations in conversations.json are normalised and filed into the palace.

Actual Behaviour

  • Bug 1: conversations.json is parsed but produces 0 messages due to sender/role mismatch — 0 drawers filed
  • Bug 2: conversations.json (38MB) silently skipped before parsing due to 10MB size limit — no warning shown

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions