bug: embedding model mismatch — MCP server uses MiniLM (384-dim) while ingest can use mpnet (768-dim) #903

@bensig

Problem

The MCP server relies on ChromaDB's built-in default embedding function (all-MiniLM-L6-v2, 384 dimensions). There is no centralized embedding model configuration — each code path makes its own choice.

If a user ingests their palace with all-mpnet-base-v2 (768-dim) — which is what the ingest pipeline documentation and several internal paths assume — then every MCP query silently fails or returns garbage results because the query embeddings (384-dim) don't match the stored embeddings (768-dim).
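A guard along the following lines would turn the silent mismatch into an explicit error. This is only a sketch: `KNOWN_DIMS`, `EmbeddingDimensionMismatch`, and `assert_compatible` are hypothetical names that do not exist in the codebase, and the two dimensions are the ones quoted above.

```python
# Hypothetical lookup table; the dimensions are the ones quoted in this issue.
KNOWN_DIMS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


class EmbeddingDimensionMismatch(ValueError):
    """Raised when the query model cannot match the stored vectors."""


def assert_compatible(query_model: str, stored_dim: int) -> None:
    """Fail loudly if query embeddings would not match the stored dimension."""
    query_dim = KNOWN_DIMS.get(query_model)
    if query_dim is None:
        raise KeyError(f"unknown embedding model: {query_model!r}")
    if query_dim != stored_dim:
        raise EmbeddingDimensionMismatch(
            f"{query_model} produces {query_dim}-dim vectors, "
            f"but the palace stores {stored_dim}-dim vectors"
        )
```

Calling `assert_compatible("all-MiniLM-L6-v2", 768)` raises, which is the situation described above: a MiniLM query against an mpnet-built palace.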

This also affects anyone following issue #515 (GPU-accelerated embeddings via sentence-transformers) or #756 (OpenAI embeddings) — any non-default model used at ingest time will break MCP search.

Expected behavior

A single source of truth for which embedding model the palace was built with. The MCP server, CLI search, and all ingest paths should resolve to the same model.

Proposed fix

Centralized embedding config with 3-tier resolution:

  1. Palace-level: {palace_path}/palace_meta.json stores the model used at build time
  2. Global config: ~/.mempalace/config.json for user preference
  3. Default: all-mpnet-base-v2 (768-dim)
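The three tiers above could be resolved roughly as follows. This is a minimal sketch, not the actual implementation: the function names and the `embedding_model` JSON key are assumptions, and only the two file locations and the default model come from this issue.

```python
import json
from pathlib import Path
from typing import Optional

DEFAULT_MODEL = "all-mpnet-base-v2"  # tier 3, as proposed in this issue
MODEL_KEY = "embedding_model"        # assumed key name, not from the codebase


def _read_model(path: Path) -> Optional[str]:
    """Return the model name stored in a JSON file, or None if unavailable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: fall through to the next tier.
        return None
    if not isinstance(data, dict):
        return None
    value = data.get(MODEL_KEY)
    return value if isinstance(value, str) and value else None


def resolve_embedding_model(palace_path, global_config=None) -> str:
    """Resolve the model: palace metadata, then global config, then default."""
    if global_config is None:
        global_config = Path.home() / ".mempalace" / "config.json"
    candidates = [Path(palace_path) / "palace_meta.json", Path(global_config)]
    for candidate in candidates:
        model = _read_model(candidate)
        if model:
            return model
    return DEFAULT_MODEL
```

Checking the palace-level file first means an existing palace keeps using the model it was built with, even if the user later changes their global preference.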

mcp_server.py reads from this config instead of relying on ChromaDB's default. Ingest pipelines stamp the model into palace_meta.json at build time so the palace is self-describing.
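As an illustration of the stamping step, the ingest side might write the model into palace_meta.json like this, and the server side might read it back before constructing its embedding function. The helper names and the `embedding_model` / `embedding_dim` keys are hypothetical; the real schema is still to be decided.

```python
import json
import os
from pathlib import Path

META_FILE = "palace_meta.json"


def stamp_palace_meta(palace_path, model: str, dim: int) -> Path:
    """Record the embedding model in palace_meta.json, keeping other keys."""
    meta_path = Path(palace_path) / META_FILE
    meta = {}
    if meta_path.exists():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    meta["embedding_model"] = model  # assumed key name
    meta["embedding_dim"] = dim      # assumed key name
    # Write to a temp file first so an interrupted ingest cannot leave
    # a half-written metadata file behind.
    tmp = meta_path.with_name(meta_path.name + ".tmp")
    tmp.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    os.replace(tmp, meta_path)
    return meta_path


def load_stamped_model(palace_path) -> str:
    """Read back the model name that ingest stamped into the palace."""
    meta_path = Path(palace_path) / META_FILE
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    return meta["embedding_model"]
```

The server would then pass the returned name to an explicitly constructed embedding function rather than leaving the collection on ChromaDB's default.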

This also resolves #860 (poor cross-room search with MiniLM), since mpnet produces significantly better embeddings for this use case. Benchmarks show +3.5pp on LoCoMo R@10 with bge-large-en-v1.5 over MiniLM, and mpnet sits between the two.

Related issues

Labels: area/mcp (MCP server and tools), area/search (Search and retrieval), bug (Something isn't working)
