Add Cursor and factory.ai session mining and improve local search resilience #702
Closed
mfhens wants to merge 17 commits into MemPalace:develop from
Parse Factory.ai/Droid session files (JSONL, session_start fingerprint). Strip <system-reminder> injection blocks from user content. Skip tool_use and thinking blocks in assistant content. Register after Copilot CLI parser to avoid false matches. Add 7 tests (16/16 passing). Also fix UnicodeEncodeError on Windows cp1252 by replacing the -> arrow in convo_miner.py dry-run print statements. Co-authored-by: Copilot <[email protected]>
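The content filtering described in this commit can be sketched roughly as below. This is a minimal illustration, not the PR's normalizer: the function names, the regular expression, and the assumption that each assistant content block is a dict with a "type" key and an optional "text" key are all hypothetical.

```python
import re

# Matches one injected <system-reminder>...</system-reminder> block,
# including blocks that span several lines.
_SYSTEM_REMINDER = re.compile(r"<system-reminder>.*?</system-reminder>", re.DOTALL)

# Block types the commit says are skipped in assistant content.
_SKIPPED_BLOCK_TYPES = {"tool_use", "thinking"}


def strip_system_reminders(text: str) -> str:
    """Remove injected reminder blocks from user content (hypothetical helper)."""
    return _SYSTEM_REMINDER.sub("", text).strip()


def assistant_text(blocks: list[dict]) -> str:
    """Join the text of assistant blocks, skipping tool_use and thinking blocks.

    Assumes each block looks like {"type": ..., "text": ...}; the real
    Factory.ai session schema may differ.
    """
    parts = [
        block.get("text", "")
        for block in blocks
        if block.get("type") not in _SKIPPED_BLOCK_TYPES
    ]
    return "\n".join(part for part in parts if part).strip()
```

Keeping the two filters as separate functions mirrors the split in the commit message: one rule applies to user content, the other to assistant content.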
Add 'mempalace kg' with four sub-actions:
- kg add <subject> <predicate> <obj> [--source] [--from] [--confidence]
- kg query <entity> [--as-of] [--direction]
- kg timeline [<entity>]
- kg stats
Routes via two-level dispatch (same pattern as 'hook' and 'instructions'). Also restore accidentally dropped 'def cmd_repair(args):' function header.
Co-authored-by: Copilot <[email protected]>
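A two-level dispatch of the kind this commit describes could look like the argparse sketch below. The sub-actions and flags come from the commit message; everything else is an assumption made for illustration, including the handler bodies, the dest name valid_from, and treating --confidence as a float.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mempalace")
    commands = parser.add_subparsers(dest="command", required=True)

    # First level: the 'kg' command. Second level: its sub-actions.
    kg = commands.add_parser("kg", help="knowledge graph operations")
    actions = kg.add_subparsers(dest="kg_action", required=True)

    add = actions.add_parser("add")
    add.add_argument("subject")
    add.add_argument("predicate")
    add.add_argument("obj")
    add.add_argument("--source")
    # 'from' is a Python keyword, so store it under a different attribute name.
    add.add_argument("--from", dest="valid_from")
    add.add_argument("--confidence", type=float)  # float type is an assumption

    query = actions.add_parser("query")
    query.add_argument("entity")
    query.add_argument("--as-of")
    query.add_argument("--direction")

    timeline = actions.add_parser("timeline")
    timeline.add_argument("entity", nargs="?")  # entity is optional

    actions.add_parser("stats")
    return parser


def dispatch(args: argparse.Namespace) -> str:
    """Route to a per-action handler; these handlers only describe the call."""
    handlers = {
        "add": lambda a: f"add {a.subject} {a.predicate} {a.obj}",
        "query": lambda a: f"query {a.entity}",
        "timeline": lambda a: f"timeline {a.entity or 'all'}",
        "stats": lambda a: "stats",
    }
    return handlers[args.kg_action](args)
```

For example, `dispatch(build_parser().parse_args(["kg", "stats"]))` selects the stats handler. A real implementation would call into the knowledge graph rather than return a string.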
Allows pointing kg subcommands at a specific sqlite3 file, enabling the KG to live in a project repo and be tracked in git. Usage: mempalace kg --kg ./mempalace/knowledge_graph.sqlite3 add ... Co-authored-by: Copilot <[email protected]>
The checkmark character (U+2713) in the progress line caused UnicodeEncodeError on Windows terminals using cp1252 encoding. Co-authored-by: Copilot <[email protected]>
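The failure mode is easy to reproduce: cp1252 has no code point for U+2713 (check mark) or U+2192 (right arrow), so encoding either one raises UnicodeEncodeError. The helper below is a hypothetical sketch of one way to keep progress output printable; the ASCII stand-ins are assumptions, not the replacements the commit actually used.

```python
def console_safe(text: str, encoding: str = "cp1252") -> str:
    """Return text that can be encoded for a console using `encoding`.

    Known symbols get readable ASCII stand-ins (hypothetical choices);
    anything else the codec cannot represent is replaced with '?'.
    """
    replacements = {"\u2192": "->", "\u2713": "[ok]"}
    for char, fallback in replacements.items():
        text = text.replace(char, fallback)
    return text.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    try:
        "\u2713".encode("cp1252")
    except UnicodeEncodeError as exc:
        print("cp1252 cannot encode the check mark:", exc.reason)
    print(console_safe("file.py \u2192 3 drawers \u2713"))
```

Replacing the characters at the source, as the commits do, is the simpler fix; a helper like this is only useful when the text comes from elsewhere.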
Routes .py files through chunk_python_ast() which emits one chunk per top-level function, class, and method. Each chunk carries symbol_type, symbol_name, and parent_symbol metadata for filtered retrieval. Falls back to chunk_text() on SyntaxError or empty AST.
feat(search): min_similarity threshold parameter
search_memories() and the MCP tool_search() accept min_similarity (default 0.0, backward-compatible). Results below threshold are filtered post-query.
Co-authored-by: Copilot <[email protected]>
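Both changes can be illustrated with the standard library alone. The sketch below is not the PR's code: it reuses the name chunk_python_ast and the three metadata keys from the commit message, while the chunk dict layout, the empty-string parent for top-level symbols, the "similarity" result key, and the filter_by_similarity helper are assumptions.

```python
import ast


def chunk_python_ast(source: str) -> list[dict]:
    """Emit one chunk per top-level function, class, and method.

    Returns an empty list on SyntaxError or when no symbols are found,
    so a caller can fall back to plain-text chunking.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []

    chunks = []
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)

    def emit(node, symbol_type, parent):
        chunks.append({
            "text": ast.get_source_segment(source, node) or "",
            "symbol_type": symbol_type,
            "symbol_name": node.name,
            "parent_symbol": parent,  # "" for top-level symbols (assumption)
        })

    for node in tree.body:
        if isinstance(node, func_types):
            emit(node, "function", "")
        elif isinstance(node, ast.ClassDef):
            emit(node, "class", "")
            for child in node.body:
                if isinstance(child, func_types):
                    emit(child, "method", node.name)
    return chunks


def filter_by_similarity(results: list[dict], min_similarity: float = 0.0) -> list[dict]:
    """Drop results below the threshold after the query has run.

    The 0.0 default matches the commit's backward-compatible default when
    similarities are non-negative.
    """
    return [r for r in results if r["similarity"] >= min_similarity]
```

Note that a class chunk in this sketch contains the full class source, so method text appears twice (once in the class chunk, once in its own chunk); whether the PR deduplicates this is not stated.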
Adds cursor_miner.py to extract user/assistant exchange pairs from Cursor's store.db SQLite files (~/.cursor/chats/**/<session>/store.db).
Key design:
- Extracts <user_query> tags from user messages; skips system-context blobs
- Caps assistant text at 1500 chars per exchange for focused retrieval
- Uses rowid order for message sequence (SQLite insertion order)
- Copies DB to temp file before reading to avoid Cursor file locks
- Stores session_name and workspace_hash as metadata for filtering
- Dedup by source_key = 'cursor:<workspace_hash>/<session_hash>'
CLI: mempalace mine ~/.cursor/chats --mode cursor [--wing my_wing]
Co-authored-by: Copilot <[email protected]>
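The read path in the design notes (copy first, read in rowid order, keep only <user_query> text, cap the reply) might be sketched as follows. The commit does not describe the store.db schema, so the table messages(role, content) used here is purely hypothetical, as is the function name read_exchanges.

```python
import re
import shutil
import sqlite3
import tempfile
from pathlib import Path

USER_QUERY = re.compile(r"<user_query>(.*?)</user_query>", re.DOTALL)
MAX_ASSISTANT_CHARS = 1500  # cap stated in the commit message


def read_exchanges(store_db: Path) -> list[tuple[str, str]]:
    """Return (user_query, assistant_text) pairs from a copy of store_db.

    Hypothetical schema: a table messages(role TEXT, content TEXT).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Work on a snapshot so the reader never holds a lock on the live file.
        snapshot = Path(tmp) / "store.db"
        shutil.copy2(store_db, snapshot)
        conn = sqlite3.connect(snapshot)
        try:
            # rowid order approximates insertion order, i.e. message sequence.
            rows = conn.execute(
                "SELECT role, content FROM messages ORDER BY rowid"
            ).fetchall()
        finally:
            conn.close()

    exchanges = []
    pending_query = None
    for role, content in rows:
        if role == "user":
            match = USER_QUERY.search(content)
            if match:  # user rows without the tag are system-context blobs
                pending_query = match.group(1).strip()
        elif role == "assistant" and pending_query:
            exchanges.append((pending_query, content[:MAX_ASSISTANT_CHARS]))
            pending_query = None
    return exchanges
```

Copying before reading trades a little I/O for isolation from the editor's own writes, which matches the rationale given in the design notes.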
Adds two equivalent scripts (PowerShell + bash) that mine all three AI chat sources in a single command.
cursor → ~/.cursor/chats --mode cursor --wing cursor_chats
copilot → ~/.copilot/session-state --mode convos --wing copilot_sessions
factory → ~/.factory/sessions --mode convos --wing factory_sessions
Both scripts support --dry-run and selective source args.
Co-authored-by: Copilot <[email protected]>
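The source-to-arguments mapping above can also be expressed as data. The Python sketch below is an illustration only (the PR ships PowerShell and bash scripts, not this); build_commands is a hypothetical name, and the sketch only builds the mempalace mine command lines without running them.

```python
import os

# Source name -> (session directory, --mode value, --wing value),
# copied from the mapping in the commit message.
SOURCES = {
    "cursor": ("~/.cursor/chats", "cursor", "cursor_chats"),
    "copilot": ("~/.copilot/session-state", "convos", "copilot_sessions"),
    "factory": ("~/.factory/sessions", "convos", "factory_sessions"),
}


def build_commands(selected=None):
    """Build one 'mempalace mine' argv list per selected source.

    With no selection, all three sources are included in mapping order.
    """
    names = list(selected) if selected else list(SOURCES)
    commands = []
    for name in names:
        path, mode, wing = SOURCES[name]
        commands.append(
            ["mempalace", "mine", os.path.expanduser(path),
             "--mode", mode, "--wing", wing]
        )
    return commands


if __name__ == "__main__":
    # Printing instead of executing gives a rough preview of what would run.
    for argv in build_commands():
        print(" ".join(argv))
```

Each list could be passed to subprocess.run to execute it; how the actual scripts implement --dry-run is not shown in the commit message.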
- knowledge_graph: replace shared SQLite connection with thread-local storage (threading.local) to fix data-corruption risk in concurrent MCP server contexts; remove check_same_thread=False
- mcp_server: hash full content (not content[:100]) for drawer_id to eliminate silent ID collisions between entries sharing a common prefix
- hooks/mempal_save_hook.sh: guard TRANSCRIPT_PATH with -n before -f to prevent confusing errors when sanitization yields an empty path
Co-authored-by: Copilot <[email protected]>
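The first two fixes can be sketched in a few lines. This is an illustration under assumptions, not the PR's code: the class name ThreadLocalConnections, the choice of SHA-256, the "drawer_" prefix, and the 16-character truncation are all hypothetical.

```python
import hashlib
import sqlite3
import threading


class ThreadLocalConnections:
    """Hand each thread its own SQLite connection instead of sharing one."""

    def __init__(self, db_path: str):
        self._db_path = db_path
        self._local = threading.local()

    def conn(self) -> sqlite3.Connection:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # Default check_same_thread=True is fine here: the connection
            # is only ever used by the thread that created it.
            conn = sqlite3.connect(self._db_path)
            self._local.conn = conn
        return conn


def drawer_id(content: str) -> str:
    """Derive an ID from the full content, not just a prefix of it."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return "drawer_" + digest[:16]


if __name__ == "__main__":
    prefix = "x" * 100
    a, b = prefix + "first entry", prefix + "second entry"
    # Hashing only content[:100] would give both entries the same ID.
    print(drawer_id(a[:100]) == drawer_id(b[:100]))
    print(drawer_id(a) == drawer_id(b))
```

With thread-local storage each thread pays the cost of opening its own connection once, which is usually a reasonable price for removing cross-thread sharing.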
Includes new palace.py module, expanded mcp_server tools, knowledge graph refactor, extensive new test suite, CI Windows job, and integrations/openclaw SKILL.md.
Fixes applied before merge:
- Thread-safe SQLite connections in KnowledgeGraph (threading.local)
- Full-content hashing for drawer IDs in mcp_server
- TRANSCRIPT_PATH empty-string guard in mempal_save_hook.sh
Conflict resolution:
- tests/test_miner.py: merged chunk_python_ast tests (main) with file_already_mined tests (PR)
- tests/test_normalize.py: merged copilot/factory normalizer tests (main) with PR's extended normalize test suite; combined imports
- tests/test_searcher.py: merged min_similarity tests (main) with PR's query-error and filter tests
Co-authored-by: Copilot <[email protected]>
Expand transcript ingestion with Cursor chat mining and broader normalizers so more local AI sessions can be filed into the palace. Harden Chroma collection access and search ranking so builds and tests pass reliably without network model downloads. Made-with: Cursor
Mem palace/main
bensig added a commit that referenced this pull request Apr 18, 2026
Draft plugin specification for source adapters, mirroring RFC 001's role for storage backends. Formalizes the contract six community ingester PRs (#274, #23, #169, #232, #567, #98, #702) plus #981's metadata-only mode have been reinventing ad-hoc, so adapter authors can build to a stable surface.
Key decisions:
- Single ingest() method; lazy adapters yield SourceItemMetadata ahead of drawers, eager adapters interleave
- Declared-transformation model (§1.4) replaces informal verbatim promise with a verifiable one; byte_preserving adapters declare the empty set, declared_lossy adapters enumerate. Existing miner.py and the convo_miner+normalize pipeline map cleanly
- Palace is the incremental cursor via is_current(item, metadata); no sidecar persistence
- Routing is adapter-owned; detect_room/detect_hall move into the filesystem adapter
- Flat metadata per ChromaDB (RFC 001 §1.4): entity hints as json_string field, KG triples route to SQLite knowledge graph
- Closets stay core-built as a post-step; adapters may emit flat closet_hints. Closes existing gap where convo drawers get no closets
- No per-drawer field renames: source_file, filed_at, source_mtime, added_by, normalize_version, entities, ingest_mode all preserved. Spec adds adapter_name, adapter_version, privacy_class
§9 enumerates the cleanup PR prerequisites (mempalace/sources/ module, PalaceContext facade, KnowledgeGraph.add_triple gaining backwards-compatible source_drawer_id + adapter_name params).
Tracking issue: #989
4 tasks
- palace.py: remove duplicate hashlib/re imports; restore get_client, get_embedding_function, distance_to_similarity, _SafeEmbeddingFunction, and _HashEmbeddingFunction removed by backend refactor
- miner.py: fix indentation: add_drawer() call and if added: block were dedented out of the for chunk in chunks: loop
- mcp_server.py: add missing import of get_embedding_function and distance_to_similarity from palace
- searcher.py: remove duplicate mid-file import of removed palace symbols
- knowledge_graph.py: add missing self._local = threading.local() init
- tests/conftest.py: remove unused get_client/get_collection imports; add import chromadb for fixture at line 103
- tests/test_miner.py: fix syntax error in multi-name import; replace get_client with import chromadb (used directly in tests)
- tests/test_convo_miner.py: remove duplicate shutil import; add get_collection as get_palace_collection import
- tests/test_mcp_server.py: remove unused get_collection import
Co-authored-by: Copilot <[email protected]>
Mem palace main
# Conflicts:
# hooks/mempal_save_hook.sh
# mempalace/cli.py
# mempalace/layers.py
# mempalace/miner.py
# tests/test_hooks_cli.py
# uv.lock
# website/.vitepress/config.mts
# website/.vitepress/theme/landing/HeroSection.vue
# website/.vitepress/theme/landing/landing.css
jphein pushed a commit to jphein/mempalace that referenced this pull request Apr 30, 2026
…Code, MemPalace#274/MemPalace#232 Cursor, MemPalace#169 Pi, MemPalace#702 Cursor+factory.ai)
Updates the multi-agent-support bullet to cite the actual upstream work instead of just gesturing at it. RFC 002 itself is PR MemPalace#990 (tracking issue MemPalace#989). Existing third-party prototypes already proposed against the spec:
* OpenCode SQLite: PR MemPalace#23
* Cursor SQLite: issue MemPalace#274
* Cursor JSONL (earlier variant): PR MemPalace#232
* Pi agent JSONL: PR MemPalace#169
* Combined Cursor + factory.ai: PR MemPalace#702
Each becomes a mempalace-source-<agent> package once RFC 002 lands. Names the path explicitly: fork unblocks the pattern by helping land RFC 002; per-agent adapter PRs land from their respective authors. Aider, Gemini CLI, Codex CLI, and Warp are roadmap targets without existing adapter PRs and are listed as such (no fabricated PR refs).
https://claude.ai/code/session_01GvwducFnFtN8KYmfbWKMR6
What does this PR do?
This PR expands MemPalace's ingestion and retrieval workflow for local AI usage. It adds first-class Cursor and factory.ai session mining, broadens transcript normalization for additional AI
tools, and makes search more reliable in offline or restricted environments where the default embedding setup may not be available.
- mempalace mine --mode cursor and a dedicated Cursor miner for ~/.cursor/chats store.db sessions
- min_similarity filtering
- mempalace kg subcommands with optional --kg path override, plus Bash and PowerShell mine-all-sessions helpers
How to test
Run mempalace mine ~/.cursor/chats --mode cursor, then query the palace to confirm that the mined Cursor sessions are returned.
Checklist
- python -m pytest tests/ -v --ignore=tests/benchmarks (628 passed)
- ruff check .