mempalace_list_wings / mempalace_list_rooms / mempalace_get_taxonomy silently return empty results on large collections #338

@wuxiwei

Labels: bug

Description

When a palace contains a large number of drawers (~180k+), the MCP tools mempalace_list_wings, mempalace_list_rooms, and mempalace_get_taxonomy all return empty {} results. The same issue affects the wings and rooms fields in mempalace_status.

Meanwhile, mempalace_search works correctly and returns drawers with valid wing and room metadata.

Root Cause

All three functions (and the wings/rooms portion of tool_status) use the same pattern in mcp_server.py:

def tool_list_wings():
    col = _get_collection()
    wings = {}
    try:
        all_meta = col.get(include=["metadatas"])["metadatas"]  # loads ALL records at once
        for m in all_meta:
            w = m.get("wing", "unknown")
            wings[w] = wings.get(w, 0) + 1
    except Exception:
        pass  # silently swallows any error
    return {"wings": wings}

Two problems:

  1. col.get() without a limit attempts to load all records into memory at once. On large collections (180k+ drawers), this either triggers a ChromaDB HNSW internal error or causes excessive memory usage / timeout.

  2. except Exception: pass silently swallows the error, making it appear as if the palace simply has no wings/rooms, with no indication that an error occurred.

Steps to Reproduce

  1. Mine a large codebase into the palace (e.g., mempalace mine ~/large-project resulting in 100k+ drawers)
  2. Call mempalace_list_wings via MCP → returns {"wings": {}}
  3. Call mempalace_list_rooms via MCP → returns {"wing": "all", "rooms": {}}
  4. Call mempalace_get_taxonomy via MCP → returns {"taxonomy": {}}
  5. Call mempalace_status via MCP → returns total_drawers: 183895 but wings: {} and rooms: {}
  6. Call mempalace_search with any query → returns results with correct wing and room metadata

Environment

  • mempalace version: 3.0.0
  • chromadb version: latest via pip
  • OS: macOS (Darwin arm64)
  • Python: 3.9
  • Collection size: ~183,895 drawers

Observed Error (when running manually outside the try/except)

chromadb.errors.InternalError: Error executing plan: Error sending backfill request 
to compactor: Failed to apply logs to the hnsw segment writer

This error is completely hidden by the except Exception: pass in the current code.

Suggested Fix

Option A — Pagination: Use col.get() with limit and offset to paginate through records in batches:

def tool_list_wings():
    col = _get_collection()
    if not col:
        return _no_palace()
    wings = {}
    batch_size = 5000
    offset = 0
    total = col.count()
    while offset < total:
        try:
            batch = col.get(include=["metadatas"], limit=batch_size, offset=offset)
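            for m in batch["metadatas"]:
                # A record stored without metadata can come back as None
                w = (m or {}).get("wing", "unknown")
                wings[w] = wings.get(w, 0) + 1
            offset += batch_size
        except Exception as e:
            # Report the failure to the caller instead of hiding it
            return {"wings": wings, "error": f"Partial result, failed at offset {offset}: {e}"}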
    return {"wings": wings}
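
Because the same pattern appears in all three tools and in tool_status, the pagination could live in one shared helper. The following is a minimal sketch, not the project's implementation. It assumes the collection accepts get(include=..., limit=..., offset=...) as in Option A, and that room metadata is stored under a "room" key with the same "unknown" fallback as "wing". The names iter_metadatas, aggregate_taxonomy, and FakeCollection are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict


def iter_metadatas(col, batch_size=5000):
    """Yield metadata dicts batch by batch instead of loading them all at once."""
    offset = 0
    while True:
        batch = col.get(include=["metadatas"], limit=batch_size, offset=offset)
        metas = batch.get("metadatas") or []
        if not metas:
            break  # an empty page means we have walked past the last record
        for m in metas:
            yield m or {}  # a record may carry no metadata at all
        offset += len(metas)


def aggregate_taxonomy(col, batch_size=5000):
    """One pass that returns wing counts, room counts, and wing -> room counts."""
    wings, rooms = Counter(), Counter()
    taxonomy = defaultdict(Counter)
    for m in iter_metadatas(col, batch_size):
        w = m.get("wing", "unknown")
        r = m.get("room", "unknown")  # "room" key is an assumption
        wings[w] += 1
        rooms[r] += 1
        taxonomy[w][r] += 1
    return dict(wings), dict(rooms), {w: dict(rs) for w, rs in taxonomy.items()}


class FakeCollection:
    """In-memory stand-in for a ChromaDB collection (illustration only)."""

    def __init__(self, metadatas):
        self._metas = list(metadatas)

    def count(self):
        return len(self._metas)

    def get(self, include=None, limit=None, offset=0):
        end = None if limit is None else offset + limit
        return {"metadatas": self._metas[offset:end]}
```

With a helper like this, tool_list_wings, tool_list_rooms, tool_get_taxonomy, and tool_status would each read the part of the result they need, and a fix to the paging logic would only have to be made once. Errors are deliberately not caught here, so the caller decides how to report them.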

Option B — Metadata-only aggregation: If ChromaDB supports metadata-only queries without triggering HNSW, use that path instead of loading full records.

Additionally: Replace except Exception: pass with at minimum a logged warning, or return an "error" field in the response so callers know something went wrong (see problem 2 under Root Cause).
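
One way to apply this uniformly is a small decorator around each tool function. This is a sketch under assumptions: the decorator name report_errors, the logger name, and the exact wording of the "error" string are hypothetical, and the real tools may need a different response shape.

```python
import functools
import logging

logger = logging.getLogger("mempalace.mcp_server")  # logger name is an assumption


def report_errors(default):
    """Log any exception and return `default` plus an "error" field.

    `default` is the empty response the tool would otherwise return,
    e.g. {"wings": {}} for a wings listing.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                # Keep the traceback in the log rather than discarding it
                logger.warning("%s failed: %s", fn.__name__, exc, exc_info=True)
                result = dict(default)
                result["error"] = f"{type(exc).__name__}: {exc}"
                return result
        return wrapper
    return decorator


@report_errors(default={"wings": {}})
def failing_tool():
    # Stand-in for a tool whose collection read raises
    raise RuntimeError("Error executing plan")


@report_errors(default={"wings": {}})
def working_tool():
    return {"wings": {"docs": 3}}
```

A caller that receives {"wings": {}, "error": "..."} can tell a failed read apart from a palace that really has no wings, which is the distinction the current code loses.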

