feat: add OpenAI Codex CLI JSONL normalizer #5
Closed
dpschen wants to merge 1 commit into MemPalace:main from
Conversation
Parse Codex session JSONL files with response_item/payload structure. Extracts user and assistant messages from Codex's nested payload format where role is 'user' or 'assistant' inside payload.role, matching the same structure as the existing Claude Code parser but adapted for Codex.
Author
sry this wasn’t ready
igorls
added a commit
to igorls/mempalace
that referenced
this pull request
Apr 7, 2026
- Remove palace_path from _no_palace() error response (prevents leaking filesystem paths to the LLM)
- Replace str(e) with generic 'Internal tool error' in MCP dispatch catch block (full error is still logged server-side via stderr)
- Replace sys.exit(1) with return in searcher.search() CLI function (prevents process termination if called from library context)
- Remove unused sys import from searcher.py

Findings: MemPalace#12 (HIGH), MemPalace#5 (MEDIUM), MemPalace#15 (LOW)

Includes test infrastructure from PR MemPalace#131. 92 tests pass.
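The second fix above — returning a generic message while still logging the full error server-side — can be sketched as follows. This is an illustrative reconstruction, not the project's actual dispatch code; `dispatch_tool` and its signature are assumptions.

```python
import sys

def dispatch_tool(handler, arguments):
    """Call an MCP tool handler, hiding internal error details from the client."""
    try:
        return handler(**arguments)
    except Exception as e:
        # The full error (including any paths in str(e)) goes to stderr
        # for server-side debugging...
        print(f"tool error: {e!r}", file=sys.stderr)
        # ...but the LLM-facing response only carries a generic message,
        # so no filesystem details leak through str(e).
        return {"error": "Internal tool error"}
```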
brandonhon
added a commit
to brandonhon/mempalace
that referenced
this pull request
Apr 10, 2026
…tion

Addresses PR MemPalace#548 review feedback about scan amplification on large palaces.

The previous implementation made up to six ChromaDB scans per clean operation:

1. count_drawers(drawers_col) — scan #1
2. count_drawers(compressed_col) — scan #2
3. delete_drawers(drawers_col) — internal get(where=...) scan #3
4. delete_drawers(drawers_col) — delete(where=...) scan #4
5. delete_drawers(compressed_col) — scan #5
6. delete_drawers(compressed_col) — scan #6

Each call was doing its own metadata filter over the collection, meaning a 100K-drawer palace paid the filter cost six times for a single cleanup. This refactor drops it to exactly two scans — one per collection — regardless of palace size.

Changes:

* palace.py: introduce `find_drawer_ids(col, wing, room)`, which returns the matching ID list in a single `get(where=..., include=[])` call. ChromaDB fetches only IDs — no documents, embeddings, or metadatas — so the scan is as cheap as ChromaDB can make it.
* palace.py: `count_drawers` is now a thin wrapper around `find_drawer_ids`. The old standalone `delete_drawers` helper is removed because its count-then-delete pattern is exactly what we are trying to avoid.
* cli.py::cmd_clean: call `find_drawer_ids` once per collection and reuse the returned ID lists for both the preview counts and the subsequent delete. Deletes go through `col.delete(ids=[...])`, which is an O(n) primary-key delete, not another metadata scan. Empty lists are guarded to stay compatible with ChromaDB versions that reject `delete()` with no ids or where filter.

Why two scans and not one: ChromaDB collections are independent — there is no cross-collection query. `mempalace_drawers` and `mempalace_compressed` must each be filtered separately. A single-scan variant would have to assume that compressed IDs are a strict subset of drawer IDs and delete compressed by the drawer IDs we already found, but `tool_delete_drawer` in mcp_server.py does not cascade to compressed, so real palaces can contain orphaned compressed rows whose drawer is already gone. Going to one scan would silently leak those orphans. Two scans is the minimum that preserves correctness.

Tests:

* New `test_find_drawer_ids_*` unit tests cover wing-only, wing+room, and no-match cases.
* New `test_find_drawer_ids_single_scan` monkey-patches `collection.get` to assert exactly one call.
* New `test_clean_scans_each_collection_exactly_once` is a regression test that wraps `chromadb.PersistentClient` and counts `where`-filtered `get()` calls per collection during a full `cmd_clean` invocation, failing if either collection is scanned more than once.
* The existing 15 CLI black-box tests stay identical — the behavior is unchanged; only the scan count dropped.

Full suite: 551 passed (was 549, +2 new perf regression tests).

Also in this commit: `-V` / `--version` flags, because every good app needs a version flag and we somehow shipped three minor releases without one. The installed version is now embedded in the `-h` description line too, so `mempalace -h` answers "what am I running?" without a separate invocation.
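The single-scan helper and its thin count wrapper described in the commit above can be sketched roughly as follows. The function names mirror the commit message, but the bodies are reconstructions assuming a ChromaDB-style collection object, not the project's actual code.

```python
def find_drawer_ids(col, wing=None, room=None):
    """One metadata-filtered scan returning only the matching IDs."""
    clauses = []
    if wing is not None:
        clauses.append({"wing": wing})
    if room is not None:
        clauses.append({"room": room})
    if not clauses:
        where = None
    elif len(clauses) == 1:
        where = clauses[0]
    else:
        # ChromaDB requires multiple metadata conditions under $and.
        where = {"$and": clauses}
    # include=[] asks ChromaDB for IDs only — no documents, embeddings,
    # or metadatas — keeping the scan as cheap as possible.
    return col.get(where=where, include=[])["ids"]

def count_drawers(col, wing=None, room=None):
    # Thin wrapper: reuses the same single scan rather than issuing its own.
    return len(find_drawer_ids(col, wing=wing, room=room))
```

A caller like `cmd_clean` would invoke `find_drawer_ids` once per collection, print `len(ids)` as the preview count, and then pass the same list to `col.delete(ids=ids)` (guarding the empty-list case), so no second metadata scan is needed.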
jphein
referenced
this pull request
in jphein/mempalace
Apr 12, 2026
Document Claude Code's two memory layers (auto-memory flat files vs MemPalace archive) and correct Auto Dream status — it's unreleased code behind a disabled feature flag, not a shipped feature. TODO #4 (decay) and #5 (feedback) remain full priority. Co-Authored-By: Claude Opus 4.6 <[email protected]>
OmkarKirpan
added a commit
to OmkarKirpan/mempalace
that referenced
this pull request
Apr 15, 2026
- ChromaBackend.create_collection() now accepts embedding_function and embedding_model_name params
- cli.py repair, repair.py rebuild_index: read the embedding model from the existing collection before delete/recreate, preserving it
- migrate.py: stamp new_palace_model() on migrated palaces
- palace.get_collection(): accept an optional config param so CLI mining respects the config.json embedding_model setting
- Update test_rebuild_index_success to verify the new embedding args

Addresses code review findings MemPalace#4, MemPalace#5, MemPalace#7 for MemPalace#903
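The read-model-before-recreate flow from the second bullet can be sketched like this. The backend method names and the metadata key are assumptions based on the commit text; the fallback model name is a placeholder, not necessarily the project's default.

```python
def rebuild_index(backend, name):
    """Recreate a collection while preserving its recorded embedding model."""
    old = backend.get_collection(name)
    # Read the model stamped on the existing collection *before* deleting it,
    # falling back to a placeholder default for palaces that predate stamping.
    model = (old.metadata or {}).get("embedding_model", "default-model")
    backend.delete_collection(name)
    # Recreate with the same model so rebuilt vectors stay comparable.
    return backend.create_collection(name, embedding_model_name=model)
```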
rusel95
added a commit
to rusel95/mempalace
that referenced
this pull request
Apr 15, 2026
Addresses review items MemPalace#5 and MemPalace#6 from @igorls:

1. Extract core sync logic from cmd_sync (~200 lines) into mempalace/sync.py as sync_palace(...), returning a SyncReport dataclass. cmd_sync is now a thin CLI wrapper. This makes sync callable from MCP tools, tests, and future change-detection features (PII Guard, KG sync).
2. Replace direct chromadb.PersistentClient calls in _force_clean and cmd_sync with ChromaBackend.get_collection. All storage access now goes through the backend abstraction. _force_clean is also now a thin wrapper around sync.force_clean.
3. Document mempalace_sync_status in website/reference/mcp-tools.md so it passes test_no_undocumented_tools.

Also ran ruff format with the CI-pinned 0.4.x. All 956 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
felipetruman
added a commit
to felipetruman/mempalace
that referenced
this pull request
Apr 17, 2026
…ut, tests

Addresses the six Copilot review comments on the initial commit.

1) #6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend

The MCP server creates its palace collection directly via `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`, not through `ChromaBackend.get_collection`. That path was missing the `hnsw:num_threads=1` metadata, so the primary crash surface for MemPalace#974 and MemPalace#965 was untouched by the original patch. Fixed by passing `hnsw:num_threads=1` at the mcp_server create site too. Documented in a code comment that the setting is only honored at creation time — existing palaces created before this fix still need a `mempalace nuke` + re-mine to gain the protection.

2) #3 — mine_global_lock over-serialized mines across unrelated palaces

Replaced the single global lock file `mine_global.lock` with a per-palace lock keyed by `sha256(os.path.abspath(palace_path))` (`mine_palace_<hash>.lock`). Mines against the same palace still collapse to a single runner (the correctness boundary), but mines against *different* palaces are now free to run in parallel. `mine_global_lock` is kept as a backward-compatible alias for `mine_palace_lock` so any external callers that imported the previous name keep working.

3) #1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired

`subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow palaces. The previous `except OSError` clause didn't catch it, so the hook could raise and fail to emit any JSON decision — leaving the harness without a block/passthrough signal. Fixed by catching `(OSError, subprocess.TimeoutExpired)` together and always falling through to the block decision, so the hook reliably emits a response.

4) #2 + #4 — tests

- tests/test_hooks_cli.py: added `test_precompact_first_two_attempts_block`, `test_precompact_passes_through_after_cap`, and `test_precompact_counter_is_per_session` to lock in the MemPalace#955 deadlock fix.
- tests/test_palace_locks.py (new): covers `mine_palace_lock` single-acquire, reuse-after-release, cross-process serialization on the same palace, non-interference across different palaces, path normalization, and the `mine_global_lock` back-compat alias.

5) #5 — known limitation, documented but not auto-fixed

Copilot suggested detecting collections missing `hnsw:num_threads=1` and calling `collection.modify(metadata=...)` to retrofit existing palaces. Verified against chromadb 1.5.7: `modify(metadata=...)` replaces metadata rather than merging, and re-passing `hnsw:space="cosine"` then raises `ValueError: Changing the distance function of a collection once it is created is not supported currently.` The HNSW runtime configuration (`configuration_json`) also does not expose `num_threads` in chromadb 1.5.x, so the flag appears to be read only at creation time. Rather than paper over the limitation with a best-effort `modify` that silently drops `hnsw:space`, documented in the mcp_server comment that pre-existing palaces need a `mempalace nuke` + re-mine to gain the protection. Fresh palaces are always protected.

Testing

- pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → **98 passed, 0 failed**.
- Runtime validation with two concurrent `mempalace mine` calls:
  - Different palaces → both complete in parallel ✓
  - Same palace → one completes, the other exits with "another `mine` is already running against <palace> — exiting cleanly." ✓
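The per-palace lock naming from item 2 and the widened exception handling from item 3 can be sketched together. The lock-file naming follows the commit's `sha256(os.path.abspath(palace_path))` description; everything else (function names, the hook's return shape) is an illustrative assumption.

```python
import hashlib
import os
import subprocess

def mine_lock_name(palace_path):
    """Per-palace lock file: same palace (after path normalization) maps to
    the same lock; different palaces hash to different locks, so unrelated
    mines can run in parallel."""
    digest = hashlib.sha256(os.path.abspath(palace_path).encode()).hexdigest()
    return f"mine_palace_{digest}.lock"

# Backward-compatible alias, matching the commit's note about external callers.
mine_global_lock = mine_lock_name

def run_mine_hook(cmd):
    """Run the mine subprocess, always producing a decision dict."""
    try:
        subprocess.run(cmd, timeout=60)
        return {"decision": "passthrough"}
    except (OSError, subprocess.TimeoutExpired):
        # Catching both together means a missing binary AND a slow palace
        # fall through to the block decision — the harness always gets
        # a JSON-serializable response instead of an unhandled exception.
        return {"decision": "block"}
```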
igorls
pushed a commit
to felipetruman/mempalace
that referenced
this pull request
Apr 25, 2026
lealvona
pushed a commit
to lealvona/mempalace
that referenced
this pull request
Apr 29, 2026
rergards
pushed a commit
to rergards/mempalace
that referenced
this pull request
Apr 30, 2026
Merge PR MemPalace#5 after local merge rehearsal, full lint/format, full pytest, GitHub checks, and LanceDB schema smoke passed.
What this does
Adds support for parsing OpenAI Codex CLI session files (JSONL format stored at ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl).

How it works

Codex uses a different structure than Claude Code:

{type: "response_item", payload: {type: "message", role: "user"|"assistant", content: [...]}}

The parser reads role from payload.role and content from payload.content. It matches the existing Claude Code parser structure — same line-by-line JSONL approach, same content extraction, same transcript output format.
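A minimal sketch of the line-by-line parse described above. The record and payload field names follow the structure shown in the description; `extract_text` and its handling of content blocks are illustrative assumptions, not the PR's actual code.

```python
import json

def extract_text(content):
    # Codex content is a list of blocks; concatenate any that carry text.
    return "".join(b.get("text", "") for b in content if isinstance(b, dict))

def parse_codex_session(path):
    """Yield (role, text) pairs from a Codex rollout-*.jsonl session file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than aborting the file
            if record.get("type") != "response_item":
                continue
            payload = record.get("payload") or {}
            if payload.get("type") != "message":
                continue
            role = payload.get("role")
            if role in ("user", "assistant"):
                yield role, extract_text(payload.get("content") or [])
```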
Why it matters
Users accumulate hundreds of Codex sessions locally (each containing real conversations, decisions, and debugging sessions). This lets mempalace mine all of them directly.
Testing
~/.codex/sessions/