
bug: mempalace mine --dry-run crashes with TypeError on files assigned room=None #586

@mssteuer

Summary

mempalace mine --dry-run crashes with a TypeError in the summary printer when any file in the scan is assigned room=None instead of falling through to "general". The crash occurs only in --dry-run mode, because the real mine path counts such files as skipped before they reach the per-room counter. It is harmless, since it only affects a dry run, but confusing when you're trying to figure out what will get filed before committing.

Environment

  • Ubuntu 24.04, Python 3.12
  • mempalace from PyPI, ~/.local/lib/python3.12/site-packages/mempalace/
  • mempalace init run against ~/.hermes (a large, multi-topic directory)

Reproduce

cd ~/.hermes
mempalace init . --yes
mempalace mine . --limit 500 --dry-run

Output

[DRY RUN] ... → room:profiles (25 drawers)
...
=======================================================
  Done.
  Files processed: 500
  Files skipped (already filed): 0
  Drawers filed: 27501

  By room:
    profiles             479 files
    general              4 files
    scripts              4 files
Traceback (most recent call last):
  File ".../bin/mempalace", line 8, in <module>
    sys.exit(main())
  File ".../site-packages/mempalace/cli.py", line 540, in main
    dispatch[args.command](args)
  File ".../site-packages/mempalace/cli.py", line 87, in cmd_mine
    mine(...)
  File ".../site-packages/mempalace/miner.py", line 605, in mine
    print(f"    {room:20} {count} files")
                ^^^^^^^^^
TypeError: unsupported format string passed to NoneType.__format__
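
The last frame can be reproduced without mempalace at all. The snippet below is a minimal sketch that relies only on standard Python behavior: None inherits object.__format__, which rejects any non-empty format spec, such as the width of 20 used in the summary line. Nothing in it is taken from the mempalace source.

```python
# Stand-alone illustration; nothing here comes from mempalace itself.
room = None

# No format spec: formatting falls back to str(None), so this works.
plain = f"{room}"

# A width spec is passed to NoneType.__format__, which rejects it.
try:
    f"{room:20}"
    error = None
except TypeError as exc:
    error = str(exc)

print(plain)
print(error)
```

Wrapping the value in str() or substituting a default string before formatting avoids the exception, because str supports width specs.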

Root cause

In mempalace/miner.py, process_file() can return (0, None) when a file is unreadable or when its stripped content is shorter than MIN_CHUNK_SIZE:

# miner.py ~line 425-431
except OSError:
    return 0, None
content = content.strip()
if len(content) < MIN_CHUNK_SIZE:
    return 0, None

The caller then does:

drawers, room = process_file(...)  # room can be None here
if drawers == 0 and not dry_run:
    files_skipped += 1
else:
    total_drawers += drawers
    room_counts[room] += 1          # ← room=None lands in the counter

In real mine mode, a file that yields zero drawers satisfies drawers == 0 and not dry_run, so it is counted as skipped and its None room never reaches the counter. In dry-run mode that condition is always false, so every file takes the else branch and None can make it into room_counts. The summary print then fails:

# miner.py line 605
for room, count in sorted(room_counts.items(), key=lambda x: x[1], reverse=True):
    print(f"    {room:20} {count} files")   # crashes on room=None
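
The difference between the two modes can be checked in isolation. The sketch below mimics the caller's bookkeeping from the excerpts above; tally() and the results list are hypothetical stand-ins for the real loop and for process_file() return values, and room_counts is assumed to be a defaultdict(int), since the excerpt increments a key that may not exist yet.

```python
from collections import defaultdict


def tally(results, dry_run):
    """Mimic the caller's bookkeeping (hypothetical helper, not mempalace code).

    `results` stands in for successive process_file() return values,
    each a (drawers, room) tuple.
    """
    room_counts = defaultdict(int)  # assumed counter type
    files_skipped = 0
    total_drawers = 0
    for drawers, room in results:
        if drawers == 0 and not dry_run:
            files_skipped += 1
        else:
            total_drawers += drawers
            room_counts[room] += 1
    return dict(room_counts), files_skipped


# One normal file and one unreadable or too-short file.
results = [(25, "profiles"), (0, None)]

real_counts, real_skipped = tally(results, dry_run=False)
dry_counts, dry_skipped = tally(results, dry_run=True)

print(real_counts, real_skipped)
print(dry_counts, dry_skipped)
```

Only the dry-run tally ends up with None as a key, which is the value the summary loop later fails to format.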

Suggested fix

Any one of:

  1. Defensive print (smallest diff):

    print(f"    {(room or 'general'):20} {count} files")

  2. Normalize at the source: guarantee process_file() never returns None for the room; return "general" or a sentinel like "_unreadable":

    except OSError:
        return 0, "general"
    if len(content) < MIN_CHUNK_SIZE:
        return 0, "general"

  3. Filter the counter, skipping None rooms before printing:

    for room, count in sorted(
        ((r, c) for r, c in room_counts.items() if r is not None),
        key=lambda x: x[1], reverse=True,
    ):
        ...

Option 2 is probably the cleanest because it also fixes any other downstream consumer that assumes room is always a string.
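
Whichever option is chosen, the summary formatting can be covered by a small regression check. The following is a sketch under stated assumptions, not a patch against miner.py: summarize() and DEFAULT_ROOM are hypothetical names, the None count of 2 is made up for illustration, and the line format is copied from the traceback above.

```python
from collections import Counter

DEFAULT_ROOM = "general"  # fallback name taken from the issue text


def summarize(room_counts):
    """Return summary lines, folding room=None into DEFAULT_ROOM.

    Hypothetical helper; not part of mempalace.
    """
    merged = Counter()
    for room, count in room_counts.items():
        merged[room or DEFAULT_ROOM] += count
    # Same ordering as the original loop: highest count first.
    ordered = sorted(merged.items(), key=lambda x: x[1], reverse=True)
    return [f"    {room:20} {count} files" for room, count in ordered]


# Illustrative counts; the None entry is invented for this example.
lines = summarize({"profiles": 479, "general": 4, "scripts": 4, None: 2})
print("\n".join(lines))
```

Merging the counts before printing also avoids a side effect of the defensive print in option 1, which would emit two separate "general" rows whenever both None and "general" are present as keys.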

Severity

Cosmetic / workflow bug — real mines work fine. But --dry-run is exactly the mode you'd use to preview a large mine before committing, so crashing at the end of the preview defeats the point.

Reporter

Filed by @mssteuer on behalf of Jean Clawd, a Hermes agent.

Labels

bug
