
feat: GPU-accelerated embeddings, batch processing, and incremental update #351

Closed
phobicdotno wants to merge 11 commits into MemPalace:main from phobicdotno:main

Conversation

@phobicdotno

Summary

  • GPU acceleration: New embeddings.py module provides CUDA-aware embedding via sentence-transformers when available, with graceful fallback to ChromaDB's default ONNX model (CPU); see the sketch after this list
  • Batch processing: collection.add() calls batched (100 docs per call instead of 1), dramatically reducing overhead for large directories
  • Incremental update: New mempalace update command detects new/changed/deleted files via content hashing and syncs the palace without full re-mine
  • Device selection: --device auto|cuda|cpu CLI flag, MEMPALACE_DEVICE env var, and config.json device property
  • Zero new required dependencies: GPU support is an optional extra (pip install mempalace[gpu]), base install unchanged
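
To make the device handling concrete, here is a minimal sketch of the selection and fallback flow, assuming only what the bullets above describe. The helper names (`_resolve_device`, `get_embedding_function`) and the exact structure of embeddings.py are illustrative, not the PR's actual code:

```python
import os


def _resolve_device(preference: str = "auto") -> str:
    """Resolve 'auto' to 'cuda' when a usable GPU is present, else 'cpu'."""
    if preference in ("cuda", "cpu"):
        return preference
    try:
        import torch  # brought in transitively by the optional [gpu] extra
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def get_embedding_function(device: str | None = None):
    """Prefer sentence-transformers on the resolved device; return None so
    ChromaDB falls back to its default ONNX model (CPU)."""
    resolved = _resolve_device(device or os.environ.get("MEMPALACE_DEVICE", "auto"))
    try:
        import sentence_transformers  # noqa: F401  (optional [gpu] extra)
    except ImportError:
        return None  # ChromaDB substitutes its default ONNX embedder
    from chromadb.utils.embedding_functions import (
        SentenceTransformerEmbeddingFunction,
    )
    return SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2", device=resolved
    )
```

Returning None and letting ChromaDB supply its default embedder is what keeps the base install free of new required dependencies.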

Changes

New files

  • mempalace/embeddings.py — Shared embedding function factory with device detection, collection wrapper, and batch flush (batching sketched after this list)
  • tests/test_embeddings.py — 6 tests for embeddings module
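
For context on the collection wrapper and batch flush, here is what a 100-document buffer in front of collection.add() could look like. The `BatchedCollection` name and its details are assumptions, not the module's real API:

```python
BATCH_SIZE = 100


class BatchedCollection:
    """Buffer documents and flush them to ChromaDB in batches."""

    def __init__(self, collection):
        self._collection = collection
        self._ids, self._docs, self._metas = [], [], []

    def add(self, id_: str, document: str, metadata: dict) -> None:
        """Queue one document; flush automatically at BATCH_SIZE."""
        self._ids.append(id_)
        self._docs.append(document)
        self._metas.append(metadata)
        if len(self._ids) >= BATCH_SIZE:
            self.flush_batch()

    def flush_batch(self) -> int:
        """Write all buffered documents in one collection.add() call
        and return how many were flushed."""
        count = len(self._ids)
        if count:
            self._collection.add(
                ids=self._ids, documents=self._docs, metadatas=self._metas
            )
            self._ids, self._docs, self._metas = [], [], []
        return count
```

One add() per hundred documents amortizes ChromaDB's per-call overhead, which is where the speedup for large directories comes from.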

Modified files

  • mempalace/miner.py — Batch processing in mine(), content hashing, new update() function (change detection sketched after this list)
  • mempalace/convo_miner.py — Batch processing in mine_convos()
  • mempalace/config.py — device property (auto/cuda/cpu)
  • mempalace/cli.py — --device flag, update subcommand
  • mempalace/searcher.py — Shared embedding function for vector compatibility
  • mempalace/mcp_server.py — Shared embedding function
  • mempalace/layers.py — Shared embedding function (5 sites)
  • mempalace/palace_graph.py — Shared embedding function
  • pyproject.toml — gpu optional dependency group
  • tests/test_config.py — Device config tests
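
In outline, the change detection behind mempalace update is a content-hash diff. This sketch assumes a stored {path: hash} manifest from the previous mine/update; the helper names are hypothetical:

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """Hash a file's bytes so any content change is detectable."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_files(root: Path, previous: dict[str, str]):
    """Return (new, changed, deleted) paths relative to the stored
    {path: hash} manifest. Hashes every file under root for the sketch;
    the real code may filter by extension or ignore rules."""
    current = {
        str(p): content_hash(p) for p in root.rglob("*") if p.is_file()
    }
    new = current.keys() - previous.keys()
    deleted = previous.keys() - current.keys()
    changed = {
        p for p in current.keys() & previous.keys()
        if current[p] != previous[p]
    }
    return new, changed, deleted
```

update() can then add drawers for new files, re-embed changed ones, and remove drawers for deleted ones, with no full re-mine.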

Architecture notes

  • All ChromaDB collection access goes through embeddings.get_collection() to ensure embedding vector compatibility across mine/search/MCP
  • sentence-transformers all-MiniLM-L6-v2 produces identical vectors to ChromaDB's default ONNX model — existing palaces remain compatible (a quick sanity check follows this list)
  • Follows project principles: local-first, zero API by default, verbatim storage
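
The vector-compatibility claim is easy to check empirically: embed the same text with both backends and compare. This snippet is not part of the PR and assumes both chromadb and sentence-transformers are installed:

```python
import numpy as np
from chromadb.utils.embedding_functions import (
    DefaultEmbeddingFunction,
    SentenceTransformerEmbeddingFunction,
)

text = ["the quick brown fox"]
# ChromaDB's default ONNX MiniLM embedder vs. the sentence-transformers model
onnx_vec = np.asarray(DefaultEmbeddingFunction()(text)[0])
st_vec = np.asarray(
    SentenceTransformerEmbeddingFunction(model_name="all-MiniLM-L6-v2")(text)[0]
)
cos = onnx_vec @ st_vec / (np.linalg.norm(onnx_vec) * np.linalg.norm(st_vec))
print(f"cosine similarity: {cos:.6f}")  # ~1.0 if the models really match
```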

Test plan

  • All 19 tests pass (pytest tests/ -v)
  • Ruff format and lint clean
  • mempalace mine <dir> --device cuda uses GPU
  • mempalace mine <dir> --device cpu falls back to batched CPU
  • mempalace update <dir> detects new/changed/deleted files
  • mempalace search works with GPU-embedded vectors
  • Uninstall sentence-transformers → graceful fallback to ONNX default
  • No new required dependencies — base install unchanged

Follow-up fixes

  • Run ruff format across mempalace/ and tests/
  • Fix multi-imports in test_config.py (split to separate lines)
  • Fix unused variable in test_embeddings.py (add tautological assert)
  • Add docstrings to all public functions in embeddings.py
  • Use flush_batch() return value for total_drawers count in mine()
  • Extract room from drawer metadata instead of calling detect_room() twice
  • Skip collection creation during dry-run in update()
  • Remove dead add_drawer() function from miner.py
  • Cache the resolved device instead of the preference string in embeddings.py
