feat: add native Cursor SQLite convo ingestion #287
saschabuehrle wants to merge 1 commit into MemPalace:main
PR Review: feat: add native Cursor SQLite convo ingestion
Executive Summary
Affected Areas:
Business Impact: Users can now point
Flow Changes:
Ratings
PR Health
High Priority Issues
🐛 #1: Recursive
web3guru888 left a comment
This is a really useful addition. Cursor's state.vscdb stores a ton of valuable conversation context that would otherwise require manual export. The implementation is thoughtful:
- SQLite signature detection (_looks_like_sqlite) avoids blindly opening every .db file as a conversation source
- Heuristic filtering in _is_cursor_sqlite_candidate (checking for "cursor" and "workspacestorage" in path parts) prevents accidental ingestion of unrelated SQLite databases
- Adjacent dedup in the extracted messages handles overlapping nested structures cleanly
- Read-only mode (?mode=ro) is the right call for not corrupting the workspace DB
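
For readers who have not opened the diff, the first two checks in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `has_sqlite_header` and `is_cursor_candidate` are hypothetical stand-ins for `_looks_like_sqlite` and `_is_cursor_sqlite_candidate`, and the exact rule for combining the file name and path checks is an assumption.

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def has_sqlite_header(path: Path) -> bool:
    """Return True if the file starts with the SQLite 3 header (hypothetical helper)."""
    try:
        with open(path, "rb") as fh:
            return fh.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    except OSError:
        # Unreadable or missing files are simply not candidates.
        return False


def is_cursor_candidate(path: Path) -> bool:
    """Guess whether a path points at Cursor workspace storage (assumed rule)."""
    # Assumption: a file named state.vscdb is always worth a look.
    if path.name.lower() == "state.vscdb":
        return True
    # Otherwise require both markers among the lower-cased path components.
    parts = {part.lower() for part in path.parts}
    return "cursor" in parts and "workspacestorage" in parts
```

Checking the header before the path heuristic (or the other way round) is a design choice; the header read costs one small file access, while the path check is free, so a scanner would probably run the path check first.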
The recursive walk() approach for extracting messages from the nested composer payload is robust against schema changes in Cursor's internal format.
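
A recursive walk with adjacent dedup might look something like the sketch below. The payload shape is an assumption: messages are taken to be dicts with hypothetical `role` and `text` keys, which may not match Cursor's real field names or the PR's `walk()`.

```python
from typing import Any, Iterator

# Assumed role labels; Cursor's actual payload may use different values.
ROLES = {"user", "assistant"}


def walk(node: Any) -> Iterator[tuple[str, str]]:
    """Depth-first traversal yielding (role, text) for every message-like dict."""
    if isinstance(node, dict):
        role, text = node.get("role"), node.get("text")
        if isinstance(role, str) and role in ROLES and isinstance(text, str) and text.strip():
            yield role, text.strip()
        # Keep descending: messages may be nested inside other messages.
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def extract_messages(payload: Any) -> list[tuple[str, str]]:
    """Collect messages, dropping a message identical to the one just before it."""
    messages: list[tuple[str, str]] = []
    for message in walk(payload):
        if not messages or messages[-1] != message:
            messages.append(message)
    return messages
```

Because the traversal never names a container key, it keeps working if Cursor renames or re-nests the wrappers around the messages; the cost is that the same message can surface twice through overlapping structures, which is what the adjacent dedup absorbs.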
Nice work expanding the ingestion surface. Pairs well with #276 (your .rst PR).
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
Conflicts with main. Cursor SQLite ingestion is a good feature. Would you be able to rebase against current main?
Uh, @bensig ... you closed #232 saying this takes care of it, and then immediately closed this one too because it needs to be rebased? Things that just need to be rebased shouldn't be closed. They should be left open until they are actually addressed. By closing issues like this you are telling the people who did the work "Whatever, I don't want help," even if that is not your intent.

Also, insisting that people rebase everything so all you have to do is press merge is not conducive to people being willing to continue making PRs in a fast-moving project. Instead of closing the issues, why don't you or Milla or those with something to gain from the project succeeding tell AI to look at the things you do want to add and fix those PRs so they can merge cleanly? You are asking people "would you rebase this?" That's common, but they could ask the same of you back. They made a PR that does something useful. Why not utilize it?
Summary
This PR adds native Cursor SQLite conversation ingestion so users can point mempalace mine --mode convos directly at Cursor workspace storage and ingest composer chats without manual export.

What changed
normalize.py
- SQLite signature detection (SQLite format 3)
- Read ItemTable rows where key = 'composer.composerData'
- Parse allComposers payloads and normalize user/assistant turns into transcript format
convo_miner.py
- SQLite file extensions (.vscdb, .sqlite, .db)
- Cursor candidate heuristics (state.vscdb, Cursor workspaceStorage paths)

Why
Implements issue #274 by enabling local-first, verbatim extraction of Cursor composer history from state.vscdb.

Tests
Added:
- tests/test_normalize_cursor.py: composer.composerData
- tests/test_convo_scan.py: state.vscdb, .db files

Run locally:
- ruff check mempalace/normalize.py mempalace/convo_miner.py mempalace/cli.py tests/test_normalize_cursor.py tests/test_convo_scan.py
- pytest tests/test_normalize.py tests/test_normalize_cursor.py tests/test_convo_scan.py tests/test_convo_miner.py -q ✅

Note: full pytest tests/ -v currently fails on this machine in existing search/mcp tests due to an upstream ONNX/CoreML runtime provider error, unrelated to these changes.

Fixes #274
Greetings, saschabuehrle
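
For context, the ItemTable lookup described in this PR could look roughly like the sketch below. It is not the code in normalize.py: `read_composer_payload` is a hypothetical helper, and the `value` column name is an assumption about the state.vscdb schema. The database is opened through a `file:` URI with `mode=ro`, so the workspace DB is never written to.

```python
import json
import sqlite3
from pathlib import Path
from typing import Any, Optional


def read_composer_payload(db_path: Path) -> Optional[Any]:
    """Return the decoded composer.composerData JSON, or None if unavailable."""
    # as_uri() needs an absolute path and percent-encodes special characters.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.Error:
        return None  # missing or unopenable file
    try:
        # Assumption: ItemTable stores its payload in a column named "value".
        row = conn.execute(
            "SELECT value FROM ItemTable WHERE key = ?",
            ("composer.composerData",),
        ).fetchone()
    except sqlite3.Error:
        return None  # e.g. not a Cursor DB, so ItemTable does not exist
    finally:
        conn.close()
    if row is None:
        return None
    raw = row[0]
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        return None
```

Returning None for every failure mode keeps a directory scan going when it meets an unrelated or corrupt database; a real implementation might prefer to log which case occurred.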