Summary
file_already_mined() in palace.py:68 uses strict float equality to compare stored vs current mtime:
return float(stored_mtime) == current_mtime
os.path.getmtime() returns a float. ChromaDB stores metadata via JSON serialization, which introduces floating-point precision loss (e.g., 1712345678.123456 may round-trip as 1712345678.1234560012817383). The strict == comparison frequently fails even for unchanged files, causing every file to be re-mined on every run.
This defeats the entire dedup/skip mechanism and silently bloats the palace with duplicate drawers.
Reproduction
- mempalace init <dir> && mempalace mine <dir> (initial mine)
- mempalace mine <dir> (re-mine without changing any files)
- Observe that all files are processed again (not skipped)
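To confirm that the mismatch comes from the mtime comparison rather than from actual file changes, it may help to print the two values side by side. The sketch below is only a diagnostic aid: describe_mtime_pair is a hypothetical helper, not part of mempalace, and it assumes the stored value can be passed to float() in the same way file_already_mined() does.

```python
# Hypothetical diagnostic helper (not part of mempalace).
def describe_mtime_pair(stored_mtime, current_mtime):
    """Report how a stored mtime compares with the current one."""
    stored = float(stored_mtime)    # metadata may come back as str or float
    current = float(current_mtime)
    return {
        "stored": repr(stored),             # repr shows the shortest exact form
        "current": repr(current),
        "equal": stored == current,         # what the strict check evaluates
        "abs_diff": abs(stored - current),  # drift in seconds
    }
```

If "equal" is False while "abs_diff" is tiny, the strict comparison is the likely culprit; a large "abs_diff" would point to a different cause.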
Suggested Fix
Use epsilon comparison:
return abs(float(stored_mtime) - current_mtime) < 0.01
Or truncate to integer seconds:
return int(float(stored_mtime)) == int(current_mtime)
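A minimal sketch of the epsilon variant as a standalone function follows. The name mtimes_match, the tolerance parameter, and the handling of missing or malformed stored values are assumptions made for illustration. The actual body of file_already_mined() may need different handling.

```python
# Hypothetical helper; the name and the 0.01 s default are illustrative only.
def mtimes_match(stored_mtime, current_mtime, tolerance=0.01):
    """Return True when two mtimes differ by less than `tolerance` seconds."""
    if stored_mtime is None:
        # No stored value: treat the file as not yet mined.
        return False
    try:
        stored = float(stored_mtime)
    except (TypeError, ValueError):
        # Malformed metadata: safer to re-mine than to skip.
        return False
    return abs(stored - float(current_mtime)) < tolerance
```

A caller could then use mtimes_match(stored_mtime, os.path.getmtime(path)) in place of the strict comparison. As for the trade-off between the two options: the epsilon check still detects changes that are more than 10 ms apart, while truncating to integer seconds would miss a modification that lands in the same second as the stored mtime.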
Environment
- mempalace 3.1.0
- ChromaDB 0.6.3
- Linux x86_64, Python 3.12