Skip to content

fix(graph): normalize wing slug at init so topic tunnels fire for hyphenated dirs (#1194)#1195

Merged
igorls merged 3 commits intodevelopfrom
fix/wing-name-normalization-tunnels
Apr 27, 2026
Merged

fix(graph): normalize wing slug at init so topic tunnels fire for hyphenated dirs (#1194)#1195
igorls merged 3 commits intodevelopfrom
fix/wing-name-normalization-tunnels

Conversation

@bensig
Copy link
Copy Markdown
Collaborator

@bensig bensig commented Apr 25, 2026

Fixes #1194.

Problem

mempalace init recorded the topics_by_wing registry key under the raw directory name (e.g. mempalace-public), while mempalace.yaml's wing field used the lower-cased + separator-collapsed slug (mempalace_public). At mine time the miner read the slug from the yaml and queried the registry under the wrong key, so _compute_topic_tunnels_for_wing returned 0 silently. Every project whose dirname contained a - or space lost every topic tunnel — the common shape in the wild.

Confirmed in a 5-project end-to-end repro (see #1194 for the table): expected 7 tunnels, got 2 (only the no-separator pair). Direct repro of the lookup miss:

```python

from mempalace.miner import get_topics_by_wing
from mempalace.palace_graph import topic_tunnels_for_wing
m = get_topics_by_wing()
len(topic_tunnels_for_wing('mempal_private', m, min_count=1)) # what mine actually calls
0
len(topic_tunnels_for_wing('mempal-private', m, min_count=1)) # what's in topics_by_wing
5
```

Fix

Extracted the normalization rule into config.normalize_wing_name() and routed both call sites through it:

  • mempalace/cli.py:146 — registry write at init time
  • mempalace/room_detector_local.py:307mempalace.yaml write

The two paths now produce the same key by construction.

Tests

  • tests/test_config.py — 4 unit tests for normalize_wing_name (hyphen, space, already-clean, mixed).
  • tests/test_cli.py::test_cmd_init_normalizes_wing_name_for_topics_registry — regression test asserting add_to_known_entities is called with the normalized slug when the dirname is hyphenated. This test fails on develop without the fix.
  • Full suite: 1310 passed (+5 new).
  • ruff check + ruff format --check clean for the changed files.

Why the original tests missed it

tests/test_palace_graph_tunnels.py and tests/test_miner.py topic-tunnel fixtures use wing_alpha, wing_beta, foo, bar — names that look identical before and after normalization. Hyphenated names were never exercised at the init→mine seam.

cc @igorls

@bensig bensig requested a review from igorls April 25, 2026 09:47
@bensig bensig requested a review from milla-jovovich as a code owner April 25, 2026 09:47
@Qodo-Free-For-OSS
Copy link
Copy Markdown

Hi, miner.load_config() falls back to the raw directory basename when mempalace.yaml is missing, but cmd_init now records topics_by_wing under a normalized wing slug, so _compute_topic_tunnels_for_wing() will return 0 due to a key miss in fallback-config scenarios.

Severity: action required | Category: correctness

How to fix: Normalize fallback wing in load_config

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

cmd_init now writes topics_by_wing using normalize_wing_name(dir_basename), but when mempalace.yaml is missing the miner’s load_config() fallback still uses the raw resolved_project_dir.name. This causes _compute_topic_tunnels_for_wing() to miss the registry key and silently skip topic tunnel creation.

Issue Context

  • cmd_init() uses normalized wing for registry writes.
  • room_detector_local writes normalized wing into mempalace.yaml.
  • But mining without a config file uses a raw wing fallback.

Fix Focus Areas

  • mempalace/miner.py[277-306]
  • (optional) add a regression test covering mining without mempalace.yaml when dirname contains hyphen/space

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review. FYI, Qodo is free for open-source.

@bensig
Copy link
Copy Markdown
Collaborator Author

bensig commented Apr 26, 2026

Self-verification (since GH won't let me approve my own PR):

`pytest tests/test_cli.py tests/test_config.py` — 85 passed on this branch.

What I checked

  • Single source of truth. `normalize_wing_name()` is now defined once in `config.py` and called from both `cli.py` (`cmd_init` writing `topics_by_wing`) and `room_detector_local.py` (writing `mempalace.yaml`'s `wing` field). Refactored the previously-inline normalization in `room_detector_local.py` to use the shared helper. The `miner` already reads the wing slug from `mempalace.yaml`, so all three writers now agree.
  • Regression test. `test_cmd_init_normalizes_wing_name_for_topics_registry` exercises a literal `my-cool-app` dir and asserts `my_cool_app` lands in `topics_by_wing` keys. Fails on the pre-fix code, passes on this branch.
  • CHANGELOG entry with the failure mode named explicitly so anyone investigating after the fact sees why hyphenated dirs were silently losing tunnels.

Coordination with #1197

wahajahmed010 has a complementary PR at the lookup layer in `palace_graph.py`. Both should land — this one prevents bad keys from being written, theirs recovers existing palaces with bad keys already on disk. Posted a coordination ask on #1197 suggesting they swap their local `_normalize_wing` for `from .config import normalize_wing_name` after this lands, so we end up with a single source of truth.

Whoever-isn't-me reviews this

Ready for second eyes. Igor or anyone with write access — small surface, covered by tests, low risk.

@igorls
Copy link
Copy Markdown
Member

igorls commented Apr 26, 2026

One small ask before merge: mempalace/convo_miner.py:397 still does the same .lower().replace(" ", "_").replace("-", "_") inline. Now that normalize_wing_name() exists, folding it in keeps all wing-slug producers on one rule and prevents future skew:

from .config import normalize_wing_name
...
wing = normalize_wing_name(convo_path.name)

No behavior change — same output for every input — but the next time the rule shifts, there's only one place to edit.

@igorls
Copy link
Copy Markdown
Member

igorls commented Apr 27, 2026

Pushed c3ee090 covering the two outstanding asks on this PR:

  1. miner.load_config no-yaml fallback — was returning the raw dirname as the wing while cmd_init writes topics_by_wing under the normalized slug. Same key-miss class as Topic tunnels silently skipped for wings with hyphenated dir names #1194, just down the no-yaml branch. Now routed through normalize_wing_name. Flagged by Qodo on the original review.
  2. convo_miner.py:397 — folded the inline .lower().replace(" ", "_").replace("-", "_") through normalize_wing_name. Pure consolidation, no behavior change for any input. All four wing-slug producers (cmd_init, room_detector_local, miner.load_config fallback, convo_miner) now share one rule.

Added test_load_config_no_yaml_normalizes_hyphenated_wing to lock the fallback path. Full suite green: 1311 passed; ruff check + format clean.

Background fork on a worktree off MemPalace/mempalace's branch — the PR is ready to merge once you give it the second-eye.

bensig and others added 3 commits April 27, 2026 03:12
…henated dirs (#1194)

`init` was recording `topics_by_wing[<raw-dirname>]` while `mempalace.yaml`
got the lower-cased separator-collapsed slug. At mine time the miner
read the slug from the yaml and missed the registry key, so
`_compute_topic_tunnels_for_wing` returned 0 silently for every project
whose folder contained a `-` or a space — the most common shape in the
wild.

Extracted the rule into `config.normalize_wing_name()` and routed both
`cli.cmd_init` (registry write) and `room_detector_local.detect_rooms_local`
(yaml write) through it. Added a regression test in `test_cli.py`
asserting the registry call uses the normalized slug, plus four direct
unit tests for the helper.

Refs #1180.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…1194)

Two follow-ups against the review on this PR:

1. ``miner.load_config`` no-yaml fallback was returning the raw dirname
   as the wing, while ``cmd_init`` writes ``topics_by_wing`` under the
   normalized slug. A hyphenated project mined without a ``mempalace.yaml``
   file silently lost every topic tunnel — same key-miss class as #1194,
   just down the no-yaml branch (raised by Qodo on this PR).

2. ``convo_miner`` was applying the lower/replace rule inline at one
   call site. Now folded through ``normalize_wing_name`` so all wing-slug
   producers — ``cmd_init``, ``room_detector_local``, ``miner.load_config``
   fallback, ``convo_miner`` — share a single source of truth. No
   behavior change for any input; pure consolidation.

Added ``test_load_config_no_yaml_normalizes_hyphenated_wing`` to lock
the fallback path to the normalized slug — fails on develop without
the miner change.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
cmd_init now invokes ``_run_pass_zero`` unconditionally (#1221, #1223
landed on develop after this PR's branch point). The pass reads sample
content via ``builtins.open``; with that mocked to MagicMock, the
downstream ``"\\n\\n".join(samples)`` in
``corpus_origin.detect_origin_heuristic`` raises
``TypeError: expected str instance, MagicMock found``.

This test only cares about the wing-slug write to the registry, so
stub the pass-zero call directly rather than try to satisfy its full
sample-gathering contract.
@igorls igorls force-pushed the fix/wing-name-normalization-tunnels branch from c3ee090 to cfca40c Compare April 27, 2026 06:14
@igorls igorls merged commit f80c9ff into develop Apr 27, 2026
6 checks passed
@igorls igorls mentioned this pull request Apr 27, 2026
jphein added a commit to jphein/mempalace that referenced this pull request Apr 27, 2026
22 commits from upstream/develop, including:

- HNSW capacity divergence detection + BM25-only sqlite fallback when
  vector layer is unloadable (MemPalace#1222 / MemPalace#1227 — vector_disabled kwarg
  routes search through chromadb-bypass path on segfault risk).
- HNSW index bloat prevention via batch_size + sync_threshold metadata
  on collection create (MemPalace#1191).
- Hooks: always mine the active transcript as convos, additive to
  MEMPAL_DIR (MemPalace#1230 / MemPalace#1231). Restructured into `_get_mine_targets()`
  list approach + separate `_ingest_transcript()`. Daemon-strict gate
  preserved on each entry point.
- Wing-name normalization for hyphenated dirs across miner / convo_miner
  / palace_graph (MemPalace#1194 / MemPalace#1195 / MemPalace#1197).
- `narrow _fix_blob_seq_ids` shim + `repair --mode max-seq-id` for
  legacy 0.6.x BLOB-poisoned palaces (MemPalace#1135).

Conflict resolution:
- README.md: kept fork-shaped narrative, dropped upstream's sweep tip
  injection (per fork-readme handling memory).
- hooks/README.md: adopted upstream's accurate `MEMPAL_DIR` additive
  description; kept `MEMPAL_PYTHON` env-var name (matches actual
  `hooks/*.sh` scripts in both forks).
- mempalace/cli.py: consolidated duplicate `--mode` argparse declarations
  into one with all 4 choices (rebuild/legacy/reorganize/max-seq-id).
- mempalace/hooks_cli.py: adopted upstream's `_get_mine_targets()` +
  `_ingest_transcript()` shape; added `_daemon_strict()` guard at entry
  of `_maybe_auto_ingest`, `_mine_sync`, and `_ingest_transcript` so
  the daemon-strict architecture still skips local writes.
- mempalace/mcp_server.py: kept both `kind=` and `vector_disabled=`
  kwargs on the `search_memories` call.
- mempalace/searcher.py: kept fork's `_count_in_scope`,
  `_sqlite_fallback_and_scope`, `_apply_kind_text_filter` AND upstream's
  `_bm25_only_via_sqlite`. Both `kind` and `vector_disabled` parameters
  on `search_memories`.

Tests: 1510 passing (up from 1366 — upstream brought new test suites).
xcarbo added a commit to xcarbo/mempalace that referenced this pull request May 4, 2026
Catches up on a month of upstream work. Highlights pulled in:

- MemPalace#1306 searcher: hybrid candidate union (vector ∪ BM25 reranking pool)
- MemPalace#1325 mcp security: omit absolute paths from tool_get_drawer / tool_status
- MemPalace#1322 chroma: wire quarantine_stale_hnsw to prevent SIGSEGV on stale HNSW
- MemPalace#1320 mcp: forward valid_to / source params in kg_add / kg_invalidate
- MemPalace#1321 cli: honor --palace flag in cmd_init
- MemPalace#1314 kg temporal params fix
- MemPalace#1244 cli: cmd_compress writes to mempalace_closets so palace can read
- MemPalace#1243 mcp: case-insensitive agent name in diary_write/diary_read
- MemPalace#1303 mcp_server: pass embedding_function= on collection reopen
- MemPalace#1076/MemPalace#1077 hooks: quote CLAUDE_PLUGIN_ROOT / CODEX_PLUGIN_ROOT in hooks.json
- Various ruff format passes on touched files

Conflict resolution (CHANGELOG.md only — code files all auto-merged):

- 3.3.5 unreleased section header from upstream kept above 3.3.4
- 3.3.4 section: kept our 2026-04-30 release date; merged upstream's new
  MemPalace#1299 SIGSEGV-on-default-EF entry in alongside our existing topic-tunnels
  (MemPalace#1194/MemPalace#1195/MemPalace#1197), HNSW-bloat (MemPalace#1191), max_seq_id (MemPalace#1135), and
  auto-ingest (MemPalace#1230/MemPalace#1231) entries. Kept our richer topic-tunnels detail
  (upstream's version was a strict subset).

xdev patches preserved (still on this branch, untouched by merge):
- 6ef44cb fix(hooks): route CC transcripts via convo_miner with cwd-based wings
- 3fad61d fix(config): allow leading dash in wing names

Not pushed to origin — run tests locally and decide when to push.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Topic tunnels silently skipped for wings with hyphenated dir names

3 participants