Skip to content

chore: rescue merged stacked PRs #1150 and #1157 into develop#1175

Merged
igorls merged 11 commits intodevelopfrom
chore/rescue-stacked-prs-into-develop
Apr 24, 2026
Merged

chore: rescue merged stacked PRs #1150 and #1157 into develop#1175
igorls merged 11 commits intodevelopfrom
chore/rescue-stacked-prs-into-develop

Conversation

@igorls
Copy link
Copy Markdown
Member

@igorls igorls commented Apr 24, 2026

Problem

#1148, #1150, and #1157 were all approved and merged on GitHub. Their merge commits exist, but only #1148 reached `develop`:

PR Merge commit lives on Reachable from develop?
#1148 `develop`
#1150 `feat/project-scanner-entity-detection` (stale)
#1157 `feat/llm-entity-refine` (stale)

The stacked PRs were merged into their parent feature branches rather than into `develop`. When #1148 was merged into `develop` (direct merge, not a rebase that would have flattened the stack), the children's commits never made the journey.

Release PR #1159 (`develop → main` for v3.3.3) is therefore missing:

  • LLM-assisted entity refinement (convo scanner, provider abstraction, interactive refinement pass)
  • The `add_to_known_entities` wire-up that actually makes init's output reach the miner's drawer-tagging registry

This PR merges the `feat/llm-entity-refine` branch (which still contains the rolled-up #1157#1150 → everything-below history) into `develop` so the 3.3.3 tag ships what was reviewed.

Contents

Brings in 8 commits from the stacked chain:

  • `feat(convo)`: Claude Code conversation directory scanner
  • `feat(llm)`: pluggable provider abstraction (Ollama / openai-compat / Anthropic)
  • `feat(llm)`: interactive refinement with batching, progress, Ctrl-C cancellation
  • `fix(llm)`: word-boundary contexts, JSON extraction, authoritative-source detection
  • `feat(init)`: `--llm` flag + convo_scanner integration
  • `fix(mine)`: skip generated entities file during mining
  • `feat(init)`: wire confirmed entities into `~/.mempalace/known_entities.json`
  • `fix(init)`: registry review feedback

Plus the two GitHub merge commits for PR #1157#1150 (preserves audit trail).

Verification

  • `uv run pytest tests/ --ignore=tests/benchmarks` → 1230 passed
  • `ruff check mempalace/ tests/` → clean
  • `ruff format --check mempalace/ tests/` → clean
  • No file-level conflicts; git's three-way merge resolved cleanly

Why a merge commit, not a rebase

The stacked children have their own merge commits with review history attached. Rebasing would discard that. A `--no-ff` merge preserves: each original commit, the #1150 → parent merge, and the #1157#1150 merge.

Test plan

  • Full test suite passes on the merged tree
  • Ruff + format clean
  • Reviewer verifies the resulting `develop` HEAD contains `mempalace/llm_client.py`, `mempalace/convo_scanner.py`, `mempalace/llm_refine.py`, and `miner.add_to_known_entities` before release: v3.3.3 — sync develop → main for tag cut #1159 is merged to main

igorls and others added 10 commits April 24, 2026 00:46
Claude Code stores sessions under `~/.claude/projects/<slug>/<id>.jsonl`
where `<slug>` is the original CWD with `/` replaced by `-`. That
encoding is lossy — can't distinguish `foo-bar` (one segment) from
`foo/bar` (two) — so slug-decoding alone produces wrong names for any
hyphenated project.

Fortunately, every message record carries a `cwd` field with the true
path. This scanner reads one record per session to recover the
accurate project name deterministically, falling back to slug-decoding
only if the JSONL is malformed or empty.

Output shape matches project_scanner.ProjectInfo so the discover
orchestrator can union results across sources. Session count doubles
as a density signal for ranking.

22 unit tests cover: root detection, cwd extraction with malformed
input tolerance, fallback slug decoding, name resolution using the
newest session (so renames win), and dedup when two encoded dirs
resolve to the same project.
Three providers cover the useful space while keeping the zero-API
default:

- `ollama` (default): local models via http://localhost:11434. Works
  fully offline. Tag-matching check accepts both `model` and
  `model:latest` forms.
- `openai-compat`: any /v1/chat/completions endpoint. Covers
  OpenRouter, LM Studio, llama.cpp server, vLLM, Groq, Together,
  Fireworks, and most self-hosted frameworks. API key falls back to
  $OPENAI_API_KEY. Endpoint normalization is forgiving about trailing
  `/v1`.
- `anthropic`: Messages API v2023-06-01. API key falls back to
  $ANTHROPIC_API_KEY. Concatenates multi-block text responses.

JSON mode is normalized across providers — Ollama uses
`format: "json"`, OpenAI-compat uses `response_format`, Anthropic uses
prompt-level instruction. Callers request JSON once; this module
handles the provider-specific plumbing.

No external SDK dependency; stdlib `urllib` throughout. HTTP errors
are wrapped into a single `LLMError` class so callers don't need to
distinguish transport, auth, and parse failures at the call site.

26 tests, all with mocked HTTP — suite runs offline with no real
provider required.
Takes the candidate set produced by phase-1 detection (manifests, git
authors, regex on prose) and asks an LLM to reclassify each candidate
as PERSON / PROJECT / TOPIC / COMMON_WORD / AMBIGUOUS.

Scale approach: never feed the raw corpus to the LLM. For each
candidate, collect up to 3 context lines from sampled prose, cap each
at 240 chars, batch 25 candidates per call. Keeps total input around
50-100K tokens even on large corpora and completes in a few minutes
on a 4B local model.

Interactive UX:
- Stderr progress bar with the current candidate name, updates
  per-batch.
- Ctrl-C interrupts cleanly: returns a RefineResult with
  `cancelled=True` and whatever was classified before the interrupt.
  The partial result is safe to pass straight to confirm_entities.
- Per-batch errors (transport, parse) are recorded in `errors` and
  don't abort the whole run.

Refinement scope: only `uncertain` and low-confidence `projects`
entries are sent. Manifest-backed projects (conf >= 0.95) and git-
authored people are already authoritative and skip the LLM.

Response parser is defensive — accepts `label` or `type` keys,
lowercase/uppercase variants, top-level list or wrapped object, and
strips markdown code fences. Unknown labels become AMBIGUOUS so the
user reviews them rather than silently accepting a bad classification.

`collect_corpus_text` provides a simple stratified prose sampler
(recent first, capped per-file) so callers don't need to build their
own corpus window.

28 tests with a FakeProvider (no network). Covers context collection,
prompt building, response parsing variants, classification apply,
end-to-end refine, and Ctrl-C partial-result behavior.
Extends the init orchestrator to consume two new signal sources:

1. Claude Code conversation dirs: when the target is a
   `~/.claude/projects/` root, convo_scanner contributes ProjectInfo
   entries alongside the git/manifest projects. Dedup is by name,
   preferring the entry with more user-authored activity.
2. Optional LLM refinement: when --llm is passed, discover_entities
   constructs the provider, validates availability, and runs
   llm_refine.refine_entities on the merged candidates. Status
   summary (reclassified / dropped / cancelled / batch errors)
   prints to stderr.

New init flags (opt-in, default remains zero-API):
- --llm: enable refinement
- --llm-provider: ollama (default) | openai-compat | anthropic
- --llm-model: default gemma4:e4b for Ollama
- --llm-endpoint: URL (required for openai-compat)
- --llm-api-key: falls back to env ($ANTHROPIC_API_KEY or
  $OPENAI_API_KEY depending on provider)

Provider check_available runs before the scan, so the user sees an
immediate error ("Run: ollama pull <model>" or "ANTHROPIC_API_KEY not
set") rather than a mid-scan failure.
…oritative sources

Addresses issues found while reviewing the initial phase-2 implementation
against real data:

**Bug: uncertain bucket starved from the LLM.**
`discover_entities` was dropping the regex-uncertain bucket whenever real
git/manifest signal existed — which is exactly when `--llm` is most useful
for cleaning up prose noise. The uncertain candidates never reached the
refinement step. Fixed: only drop when `llm_provider is None`.

**Context collection: word boundaries, not substring.**
`_collect_contexts` used substring matching on lower-cased lines, so the
name "Go" matched "good", "going", "forgot". Switched to a
`(?<!\w)…(?!\w)` regex so short names only match at token boundaries.

**Authoritative-source detection replaces confidence threshold.**
Previously the refinement step skipped entries with `confidence >= 0.95`
to avoid second-guessing manifest-backed projects. That threshold was
fragile — the regex detector produces 0.99 confidence for things like
`code file reference (5x)` on framework names (OpenAPI, etc.), so those
skipped the LLM despite being regex-only noise. New helpers
`_is_authoritative_person` / `_is_authoritative_project` look at the
actual signal strings (commits, package.json, etc.) to decide.

**Now also refines regex-derived people.**
After #1148's high-pronoun-signal fix, the regex detector can promote
non-people to the `people` bucket (e.g. a capitalized common noun that
happened to appear near pronouns). The LLM now gets a chance to clean
those up, while git-authored people are still skipped.

**Robust JSON extraction.**
Small local models routinely wrap JSON output in prose ("Sure, here's
the classification: {…}"). The previous code-fence stripper failed on
that. `_extract_json_candidates` now does balanced-bracket extraction
with string-aware quote handling, so it recovers JSON from:
- raw responses
- markdown fenced blocks
- JSON embedded inside surrounding text
- multiple candidate objects/arrays

**Prompt guidance for frameworks vs user projects.**
Added an explicit instruction: frameworks, runtimes, APIs, cloud
services, and third-party vendors (Angular, OpenAPI, Terraform, Bun,
Google, etc.) are TOPIC unless the context clearly says it's the user's
own codebase. Directly addresses a false-positive pattern observed
during dev runs.

**Defensive mtime.**
`convo_scanner._safe_mtime` catches OSError during `stat()` — permission
changes, filesystem races, broken symlinks — and sorts the affected file
to the end of the newest-first order rather than crashing the scan.

**Cosmetic:** merged two adjacent f-strings on the same line in
`backends/chroma.py` and `llm_client.py` (no behaviour change).

15 new tests cover the OSError fallback, word-boundary matching, JSON
extraction variants, authoritative-source helpers, refining high-
confidence regex projects, and end-to-end LLM refinement preserving the
uncertain bucket.
…egistry

The init step's output was a dead file. miner.py has always read
`~/.mempalace/known_entities.json` to tag drawer metadata with
recognized names, but nothing ever wrote it — so init's careful
manifest + git + LLM detection work stopped at `<project>/entities.json`
and never reached the path that actually uses it.

Measured delta on a representative prose snippet (eight sentences
mentioning six real people and four real projects):
- Empty registry: 0 entities recognized (multi-word names fail the
  frequency threshold; lowercase/hyphenated project names don't match
  the CamelCase regex).
- Registry populated by init: 12 entities recognized (all correct, zero
  false positives).

Every recognized name becomes a semicolon-separated metadata tag on the
drawer, which ChromaDB uses for entity-filtered search.

Implementation:

- `miner.add_to_known_entities({category: [names]})` reads the existing
  registry, unions each category (case-insensitively, preserving first-
  seen casing), and writes back. The function is tolerant of the two
  on-disk shapes miner already supports: list of names, or dict mapping
  name → code (dialect-style). In the dict case new names are added as
  keys with `None` values so existing codes aren't overwritten.
- Invalidates the in-process mtime cache so same-process callers
  (`cmd_init` → `cmd_mine` in one run) see the write immediately.
- Writes with `ensure_ascii=False` so non-ASCII names (Gergő Móricz,
  Arturo Domínguez, etc.) stay readable on disk.
- Chmods 0o600 — the registry mirrors confirm-step PII from the user's
  git authors and local paths.

cmd_init now calls this at the end of the confirm-entities step, after
the per-project `entities.json` is written (which is kept as an audit
trail the user can inspect or hand-edit). The per-project file is still
excluded from mining via `SKIP_FILENAMES` from the earlier fix.

17 new tests cover: fresh-file creation, list-category union, case-
insensitive dedup, preservation of untouched categories, dict-format
registries, malformed/non-dict file recovery, cache invalidation,
unicode round-trip, and an end-to-end verification that the miner's
`_extract_entities_for_metadata` picks up every registered name.
feat(init): wire confirmed entities into the miner's known-entities registry
#1148, #1150, and #1157 were reviewed and merged on GitHub, but the two
stacked children landed on their parent feature branches (now stale)
rather than on develop. Only #1148's commits reached develop via the
direct merge. Release PR #1159 (develop → main for v3.3.3) is therefore
missing the LLM refinement, Claude-conversation scanner, and miner-
registry wire-up that were ostensibly part of the release.

This merge brings the stale `feat/llm-entity-refine` branch (which
contains the rolled-up merge commit for #1157#1150 → everything
below) into develop so the release tag includes it.

No code changes here — only history recovery.
Copilot AI review requested due to automatic review settings April 24, 2026 16:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restores previously approved-but-mismerged stacked PR changes into develop by merging the rolled-up feature branch, ensuring the upcoming release includes LLM-assisted entity refinement, Claude Code convo scanning, and init → miner registry wiring.

Changes:

  • Add optional LLM refinement flow (provider abstraction + refinement pass) and integrate it into mempalace init.
  • Add Claude Code conversation directory scanning as an additional project-signal source.
  • Wire init’s confirmed entities into the miner’s global known-entities registry and skip generated entities.json during mining.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
uv.lock Adds tomli for Python <3.11 to support TOML parsing fallback.
mempalace/project_scanner.py Integrates convo scanning and optional LLM refinement into entity discovery.
mempalace/miner.py Skips entities.json during mining; adds add_to_known_entities registry merge helper.
mempalace/llm_refine.py Implements refinement batching, context collection, parsing, and reclassification logic.
mempalace/llm_client.py Adds provider abstraction for Ollama / OpenAI-compat / Anthropic (stdlib urllib).
mempalace/convo_scanner.py Scans ~/.claude/projects/ sessions and extracts project names via cwd.
mempalace/cli.py Adds init --llm flags and wires discovery → confirm → registry update.
mempalace/backends/chroma.py Minor log message formatting change.
tests/test_project_scanner.py Extends coverage for LLM/refinement behavior in discovery.
tests/test_miner.py Adds regression coverage for skipping generated files.
tests/test_llm_refine.py Adds comprehensive offline tests for refinement logic.
tests/test_llm_client.py Adds offline tests for provider request/response behavior.
tests/test_known_entities_registry.py Adds coverage for registry merging, cache invalidation, and miner recall improvement.
tests/test_convo_scanner.py Adds coverage for convo root detection and project extraction/dedup behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/project_scanner.py Outdated
Comment on lines +631 to +636
# more user_commits as signal strength.
by_name: dict[str, ProjectInfo] = {p.name: p for p in projects}
for cp in convo_projects:
existing = by_name.get(cp.name)
if existing is None or cp.user_commits > existing.user_commits:
by_name[cp.name] = cp
Copy link

Copilot AI Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Project deduping for Claude Code convo projects is currently case-sensitive (by_name: {p.name: p} / by_name.get(cp.name)), whereas other parts of the pipeline (e.g. _merge_detected) dedup case-insensitively. This can surface duplicate projects that differ only by casing (e.g. RealProj from a manifest vs realproj from convo cwd). Consider keying by_name by name.lower() (while preserving first-seen casing) or reusing the existing case-insensitive dedup approach used elsewhere.

Suggested change
# more user_commits as signal strength.
by_name: dict[str, ProjectInfo] = {p.name: p for p in projects}
for cp in convo_projects:
existing = by_name.get(cp.name)
if existing is None or cp.user_commits > existing.user_commits:
by_name[cp.name] = cp
# more user_commits as signal strength. Use a case-insensitive key so
# manifest/git-derived names and convo cwd-derived names that differ
# only by casing are merged consistently.
by_name: dict[str, ProjectInfo] = {p.name.lower(): p for p in projects}
for cp in convo_projects:
key = cp.name.lower()
existing = by_name.get(key)
if existing is None or cp.user_commits > existing.user_commits:
by_name[key] = cp

Copilot uses AI. Check for mistakes.
…ources

`discover_entities` was deduping the convo_scanner results against the
manifest/git scan with a case-sensitive key, while every other dedup
path in the pipeline (`_merge_detected`, `miner.add_to_known_entities`)
uses case-insensitive matching. A project named `foo` in a manifest
plus `Foo` as a Claude Code `cwd` variant would surface as two review
entries instead of collapsing to one.

Fix keys `by_name` by `name.lower()` while preserving the first-seen
casing, matching the rest of the pipeline. Flagged by Copilot on #1175.

Regression test asserts a manifest project + a CamelCase-variant convo
cwd for the same real project collapse to one entry.
@igorls igorls merged commit 8a6ebbe into develop Apr 24, 2026
6 checks passed
shrhoads pushed a commit to shrhoads/mempalace that referenced this pull request Apr 24, 2026
Adds entries to the 3.3.3 section for the work that landed via MemPalace#1148,
MemPalace#1150, MemPalace#1157, and MemPalace#1175 (rescued from stacked feature branches into
develop via MemPalace#1175). Without these entries the 3.3.3 release notes on
main would advertise only the hook/diary/search fixes that made it to
develop through the first direct merge.

Covers:
- Manifest + git-author entity detection (MemPalace#1148)
- Regex detector accuracy improvements (MemPalace#1148)
- Optional --llm classification with Ollama / openai-compat / Anthropic
  provider abstraction and interactive UX (MemPalace#1150)
- Claude Code conversation scanner (MemPalace#1150)
- Init → miner registry wire-up so confirmed entities actually reach
  drawer metadata tagging (MemPalace#1157)
- Case-insensitive project dedup across all sources (MemPalace#1175)
- `mempalace mine` skips the generated entities.json artifact
jphein added a commit to jphein/mempalace that referenced this pull request Apr 24, 2026
… state

Three stale sections updated:

- Fork change queue: row 8 (.blob_seq_ids_migrated marker) struck
  through → FILED as MemPalace#1177. Two new rows added for segfault fixes
  discovered today (MemPalace#1171 concurrent-write lock, MemPalace#1173 quarantine in
  make_client) that weren't in the queue because the bugs surfaced
  today, not during the original 2026-04-21 triage.

- Open upstream PRs: was showing 3 of 10 PRs. Now shows all 10 with
  current CI/review state. All rebased onto current upstream/develop
  and MERGEABLE as of today.

- Merged since v3.3.1: added v3.3.3 release (2026-04-24) with its
  constituent merges — MemPalace#942, MemPalace#833, MemPalace#1097, MemPalace#1145, MemPalace#1147, MemPalace#1148/1150/1157
  entity-detection overhaul (via @igorls's MemPalace#1175 stacked-PR rescue),
  MemPalace#1166 palace-path security, MemPalace#340/MemPalace#1093 install regression, plus MemPalace#851
  from the 2026-04-22 batch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants