feat: GPU-accelerated embeddings, batch processing, and incremental update #351

Closed

phobicdotno wants to merge 11 commits into MemPalace:main from
- Run `ruff format` across `mempalace/` and `tests/`
- Fix multi-imports in `test_config.py` (split to separate lines)
- Fix unused variable in `test_embeddings.py` (add tautological assert)
- Add docstrings to all public functions in `embeddings.py`
- Use `flush_batch()` return value for total_drawers count in `mine()`
- Extract room from drawer metadata instead of double `detect_room()` call
- Skip collection creation during dry-run in `update()`
- Remove dead `add_drawer()` function from `miner.py`
- Cache resolved device instead of preference string in `embeddings.py`
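The last fix above (caching the resolved device rather than the user's preference string) can be sketched roughly as follows. This is a hedged illustration, not the PR's actual code: `resolve_device` and `_cuda_available` are hypothetical names, and `_cuda_available` stands in for a real check such as `torch.cuda.is_available()`.

```python
from functools import lru_cache


def _cuda_available() -> bool:
    # Stand-in for a real CUDA probe (e.g. torch.cuda.is_available());
    # hard-coded False so this sketch runs without torch installed.
    return False


@lru_cache(maxsize=None)
def resolve_device(preference: str) -> str:
    """Resolve a device preference (auto/cuda/cpu) to a concrete device.

    Caching the *resolved* value means "auto" is only probed once,
    instead of re-running detection on every embedding call.
    """
    if preference == "auto":
        return "cuda" if _cuda_available() else "cpu"
    if preference in ("cuda", "cpu"):
        return preference
    raise ValueError(f"unknown device preference: {preference!r}")


print(resolve_device("auto"))  # "cpu" in this sketch, since _cuda_available() is False
```

Subsequent calls with the same preference hit the cache and skip detection entirely.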
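The batching change referenced above (`flush_batch()` returning a count that `mine()` uses for its total) might look like the sketch below. The `FakeCollection` class and the exact `flush_batch` signature are assumptions made so the example is self-contained; the real code writes to a ChromaDB collection.

```python
BATCH_SIZE = 100  # docs per collection.add() call, per the PR description


class FakeCollection:
    """Minimal stand-in for a ChromaDB collection, for illustration only."""

    def __init__(self):
        self.calls = 0
        self.docs = []

    def add(self, ids, documents):
        self.calls += 1
        self.docs.extend(documents)


def flush_batch(collection, ids, documents):
    """Write documents in chunks of BATCH_SIZE; return the total written.

    Returning the count lets the caller (e.g. mine()) report how many
    drawers were stored without keeping a separate counter.
    """
    total = 0
    for i in range(0, len(documents), BATCH_SIZE):
        chunk_ids = ids[i:i + BATCH_SIZE]
        chunk_docs = documents[i:i + BATCH_SIZE]
        collection.add(chunk_ids, chunk_docs)
        total += len(chunk_docs)
    return total


col = FakeCollection()
n = flush_batch(col, [str(i) for i in range(250)], [f"doc {i}" for i in range(250)])
print(n, col.calls)  # 250 docs written in 3 add() calls instead of 250
```

Chunking amortizes per-call overhead: 250 documents cost 3 `add()` round trips rather than one per document.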
Summary
- New `embeddings.py` module provides CUDA-aware embedding via `sentence-transformers` when available, with graceful fallback to ChromaDB's default ONNX model (CPU)
- `collection.add()` calls are batched (100 docs per call instead of 1), dramatically reducing overhead for large directories
- New `mempalace update` command detects new/changed/deleted files via content hashing and syncs the palace without a full re-mine
- Device selection via `--device auto|cuda|cpu` CLI flag, `MEMPALACE_DEVICE` env var, and `config.json` `device` property
- GPU support is an optional extra (`pip install mempalace[gpu]`); base install unchanged

Changes
New files
- `mempalace/embeddings.py` — Shared embedding function factory with device detection, collection wrapper, and batch flush
- `tests/test_embeddings.py` — 6 tests for embeddings module

Modified files
- `mempalace/miner.py` — Batch processing in `mine()`, content hashing, new `update()` function
- `mempalace/convo_miner.py` — Batch processing in `mine_convos()`
- `mempalace/config.py` — `device` property (auto/cuda/cpu)
- `mempalace/cli.py` — `--device` flag, `update` subcommand
- `mempalace/searcher.py` — Shared embedding function for vector compatibility
- `mempalace/mcp_server.py` — Shared embedding function
- `mempalace/layers.py` — Shared embedding function (5 sites)
- `mempalace/palace_graph.py` — Shared embedding function
- `pyproject.toml` — `gpu` optional dependency group
- `tests/test_config.py` — Device config tests

Architecture notes
- All modules obtain collections through `embeddings.get_collection()` to ensure embedding vector compatibility across mine/search/MCP
- `sentence-transformers` all-MiniLM-L6-v2 produces identical vectors to ChromaDB's default ONNX model — existing palaces remain compatible

Test plan
- All tests pass (`pytest tests/ -v`)
- `mempalace mine <dir> --device cuda` uses GPU
- `mempalace mine <dir> --device cpu` falls back to batched CPU
- `mempalace update <dir>` detects new/changed/deleted files
- `mempalace search` works with GPU-embedded vectors
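The content-hashing approach behind `mempalace update` can be sketched as below. This is a minimal illustration under stated assumptions: `content_hash`, `diff_manifest`, and the manifest shape are hypothetical names, not the PR's actual API, and the real command would read files from disk rather than take bytes in a dict.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents so changes are detected regardless of mtime."""
    return hashlib.sha256(data).hexdigest()


def diff_manifest(old: dict[str, str], current: dict[str, bytes]):
    """Classify files as new / changed / deleted against a stored manifest.

    `old` maps path -> previously stored content hash;
    `current` maps path -> current file contents.
    """
    new_hashes = {path: content_hash(data) for path, data in current.items()}
    new = [p for p in new_hashes if p not in old]
    changed = [p for p in new_hashes if p in old and old[p] != new_hashes[p]]
    deleted = [p for p in old if p not in new_hashes]
    return new, changed, deleted


old = {"a.md": content_hash(b"alpha"), "b.md": content_hash(b"beta")}
now = {"a.md": b"alpha", "b.md": b"beta v2", "c.md": b"gamma"}
print(diff_manifest(old, now))  # (['c.md'], ['b.md'], [])
```

Only the `new` and `changed` paths need re-embedding, and `deleted` paths are removed from the collection, which is what lets an update sync the palace without a full re-mine.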