fix: return room from process_file to fix stats double-detection by ichoosetoaccept · Pull Request #45 · detailobsessed/mempalace

ichoosetoaccept · 2026-04-12T23:56:34Z

Closes #

linear · 2026-04-12T23:56:37Z

DOT-481 Port upstream fix: miner room stats double-detection (upstream #165)

ichoosetoaccept · 2026-04-12T23:57:30Z

This change is part of the following stack:

fix: return room from process_file to fix stats double-detection #45 ◀

_{Change managed by git-spice.}

greptile-apps · 2026-04-13T00:00:12Z

Greptile Summary

This PR changes process_file to return (drawer_count, room) instead of just drawer_count, so mine() can reuse the room determined inside process_file for stats rather than calling detect_room a second time. All call sites and tests are updated to unpack the tuple.

When dry_run=True, files skipped early (too small, OSError) return (0, ""). The guard if drawers == 0 and not dry_run: is False in dry-run mode, so execution falls to the else branch and room_counts[""] += 1. The dry-run "By room:" summary would then display an entry with a blank room name for each such file.

Confidence Score: 5/5

Safe to merge; the core fix is correct and all tests pass. The one finding is a cosmetic dry-run edge case.

The fix correctly eliminates redundant room detection and all callers are updated. The only remaining finding is a P2: dry-run summaries could show a blank room name for files too small to process, which does not affect real mine runs or data integrity.

src/mempalace/miner.py — the else-branch in mine() around line 627

Important Files Changed

Filename	Overview
src/mempalace/miner.py	process_file now returns (drawer_count, room) tuple; mine() unpacks it correctly for stats, but dry-run mode doesn't guard against empty-room entries when a file is skipped early
tests/test_miner.py	Unit tests correctly updated to unpack the (count, room) tuple from process_file; all new assertions are valid
tests/test_mining_integration.py	Integration tests updated to handle the tuple return; new test_process_file_dry_run validates room is correctly returned in dry-run mode

Sequence Diagram

sequenceDiagram
    participant mine
    participant process_file
    participant detect_room
    participant ChromaDB

    Note over mine,ChromaDB: After PR — single room detection
    mine->>process_file: filepath, rooms, dry_run, ...
    process_file->>detect_room: filepath, content, rooms
    detect_room-->>process_file: room (str)
    process_file->>ChromaDB: delete(source_file)
    process_file->>ChromaDB: add drawers
    ChromaDB-->>process_file: ok
    process_file-->>mine: (drawer_count, room)
    mine->>mine: room_counts[room] += 1

    Note over mine,ChromaDB: Before PR — room detected twice
    mine->>process_file: filepath, rooms, dry_run, ...
    process_file->>detect_room: filepath, content, rooms
    detect_room-->>process_file: room (internal, unused by caller)
    process_file-->>mine: drawer_count (int only)
    mine->>detect_room: filepath, content, rooms
    detect_room-->>mine: room (redundant call)
    mine->>mine: room_counts[room] += 1

Comments Outside Diff (1)

src/mempalace/miner.py, line 627-633 (link)

Empty-string room recorded in dry-run stats for skipped files

When dry_run=True, early-exit paths inside process_file (file too small, OSError) return (0, ""). The guard drawers == 0 and not dry_run evaluates to False in dry-run mode, so execution falls through to the else branch and room_counts[""] += 1. The final "By room:" summary would display a blank-named room entry for each such file.

A minimal fix is to also gate on whether a room was actually detected:

_{Reviews (1): Last reviewed commit: "fix: return room from process_file to fi..." | Re-trigger Greptile}

Remove the ai_lines[:8] truncation in _chunk_by_exchange() so full AI responses are stored. Add _register_empty_file() sentinel drawer for conversation files that produce 0 chunks, preventing infinite re-processing on subsequent runs. Also fix pre-existing test_miner assertion that expected int from process_file (now returns tuple since PR #45). Ports upstream MemPalace#654, MemPalace#692, MemPalace#695.

fix: return room from process_file to fix stats double-detection

520a968

ichoosetoaccept force-pushed the fix/dot-481-room-stats branch from ba30125 to 520a968 Compare April 13, 2026 06:39

ichoosetoaccept merged commit 8ffa9c1 into bleeding Apr 13, 2026
6 of 12 checks passed

ichoosetoaccept deleted the fix/dot-481-room-stats branch April 13, 2026 06:46

ichoosetoaccept mentioned this pull request Apr 13, 2026

fix: preserve full AI response and add sentinel for 0-chunk files #48

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: return room from process_file to fix stats double-detection#45

fix: return room from process_file to fix stats double-detection#45
ichoosetoaccept merged 1 commit intobleedingfrom
fix/dot-481-room-stats

ichoosetoaccept commented Apr 12, 2026

Uh oh!

linear Bot commented Apr 12, 2026

Uh oh!

ichoosetoaccept commented Apr 12, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Apr 13, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ichoosetoaccept commented Apr 12, 2026

Uh oh!

linear Bot commented Apr 12, 2026

Uh oh!

ichoosetoaccept commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ichoosetoaccept commented Apr 12, 2026 •

edited

Loading

greptile-apps Bot commented Apr 13, 2026 •

edited

Loading