fix: harden palace.py mtime check and expand test coverage#429
fix: harden palace.py mtime check and expand test coverage#429Nitrogonza9 wants to merge 1 commit intoMemPalace:developfrom
Conversation
Replace exact float equality in file_already_mined() with a tolerance check (< 0.01s) to prevent false negatives from serialization rounding. Add debug logging to get_collection() fallback path for traceability. New test_palace.py with direct unit tests for get_collection, file_already_mined (including mtime tolerance and modification detection), and SKIP_DIRS. Add searcher edge-case tests for nonexistent wing/room filters and special character queries. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
web3guru888
left a comment
There was a problem hiding this comment.
Clean fix for a real bug. The float equality issue with mtime is a known pitfall — ChromaDB stores metadata values through a serialization layer that can introduce tiny rounding errors, making exact == comparisons fail silently and causing unnecessary re-mining.
The tolerance of 0.01s is well-chosen. Filesystem mtime granularity varies:
- ext4: nanosecond
- FAT32: 2 seconds
- NTFS: 100 nanoseconds
- HFS+: 1 second
So 0.01s is tight enough to detect real modifications but loose enough to absorb serialization rounding. On FAT32 you'd need a larger tolerance, but that's a niche concern.
The debug logging on the get_collection() fallback path is a good addition — silent collection creation was a debugging pain point we hit too.
The test coverage is thorough:
- Exact mtime match ✅
- Tiny rounding difference (tolerance) ✅
- Real modification detection (10s delta) ✅
- Missing
source_mtimemetadata ✅
The searcher edge-case tests (nonexistent wing/room, special characters) are bonus value. We've seen special character queries cause issues with some ChromaDB versions, so having those tests is good regression coverage.
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
|
Thanks @web3guru888 — good context on the FAT32 granularity. If someone reports issues on FAT32 we can bump the tolerance in a follow-up, but for now 0.01s covers the realistic cases. Appreciate you reviewing all six PRs. Solid feedback across the board. — Gonzalo |
|
Makes sense — 0.01s covers FAT32 would need 2s in theory, but anyone running MemPalace on a FAT32 drive is solving an unusual problem. If it comes up, bumping the constant is a one-line fix. Thanks for the thorough review across all the PRs, Gonzalo — good work on these. |
|
closing — the float equality fix landed via #610 (epsilon mtime comparison, merged). the debug logging part is a good idea if you want to submit it separately. thanks @Nitrogonza9! |
Summary
file_already_mined()— replace exact==comparison with tolerance check (< 0.01s) to prevent false negatives caused by float serialization rounding through ChromaDB metadataget_collection()fallback path so silent collection creation is traceabletest_palace.pywith 11 direct unit tests coveringget_collection,file_already_mined(mtime tolerance, modification detection, missing metadata), andSKIP_DIRSTest plan
pytest tests/test_palace.py -v— all 11 new tests passpytest tests/test_searcher.py -v— all 17 tests pass (13 existing + 4 new)pytest tests/ -v— full suite 549 passed, 0 failedruff check— no lint errors🤖 Generated with Claude Code