[Bug] Memory indexer doesn't detect embedding dimension mismatch between stored vectors and active provider #32277

@Asclepeion

Problem

When the memory search provider changes (e.g., from the local embeddinggemma-300M fallback to Gemini gemini-embedding-001), the existing vector database still holds embeddings of the old dimension (300-dim stored vs. 3072-dim produced by the new provider). The indexer:

  1. Doesn't detect the dimension mismatch
  2. Doesn't warn the user
  3. Silently returns zero or garbage results: memory_search appears to work but finds nothing relevant

This is a silent data integrity failure that's very hard to diagnose.

Expected Behavior

When the active embedding provider produces vectors with a different dimension than those stored in the database, the system should:

  1. Detect the mismatch on startup or first query
  2. Warn with a clear log message: [memory] Dimension mismatch: stored vectors are 300-dim but active provider (gemini-embedding-001) produces 3072-dim. Re-index required.
  3. Optionally auto-re-index or provide a CLI command: clawdbot memory reindex
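The detection and warning steps above could be sketched roughly as follows. This is only an illustration, not Clawdbot's actual code: the index_meta table, its embedding_dim key, and the function names are all hypothetical, and it assumes the dimension is recorded once when the index is first built.

```python
import sqlite3
from typing import Optional

# Hypothetical metadata table; the real main.sqlite schema may differ.
META_DDL = "CREATE TABLE IF NOT EXISTS index_meta (key TEXT PRIMARY KEY, value TEXT)"


def stored_dimension(conn: sqlite3.Connection) -> Optional[int]:
    """Return the dimension recorded at index time, or None if none is recorded."""
    conn.execute(META_DDL)
    row = conn.execute(
        "SELECT value FROM index_meta WHERE key = 'embedding_dim'"
    ).fetchone()
    return int(row[0]) if row else None


def record_dimension(conn: sqlite3.Connection, dim: int) -> None:
    """Persist the dimension of the provider that built the index."""
    conn.execute(META_DDL)
    conn.execute(
        "INSERT OR REPLACE INTO index_meta (key, value) VALUES ('embedding_dim', ?)",
        (str(dim),),
    )
    conn.commit()


def check_dimension(conn: sqlite3.Connection, model: str, active_dim: int) -> Optional[str]:
    """Return a warning string on mismatch, or None if the index is usable."""
    stored = stored_dimension(conn)
    if stored is None:
        # First run against this database: remember what the index was built with.
        record_dimension(conn, active_dim)
        return None
    if stored != active_dim:
        return (
            f"[memory] Dimension mismatch: stored vectors are {stored}-dim but "
            f"active provider ({model}) produces {active_dim}-dim. Re-index required."
        )
    return None
```

Calling check_dimension on startup or before the first query would let the caller log the warning and either refuse to search or trigger a re-index, instead of comparing incompatible vectors.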

Steps to Reproduce

  1. Configure memorySearch.provider: 'gemini' with remote.apiKey
  2. Let the system fall back to local provider (e.g., API unavailable)
  3. Accumulate memory entries with local embedding dimensions
  4. Restore Gemini access, so that queries now use 3072-dim vectors against 300-dim stored vectors
  5. memory_search returns results with very low scores or mismatched content
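To confirm that a database is in the state these steps produce, one option is to count the stored vectors per dimension. The sketch below is a diagnostic under stated assumptions: it assumes embeddings are stored as packed float32 BLOBs, and the chunks table and embedding column are hypothetical names that would need to be adjusted to the real schema.

```python
import sqlite3
from collections import Counter

FLOAT32_BYTES = 4  # assumption: vectors are stored as packed float32 BLOBs


def dimension_histogram(conn: sqlite3.Connection,
                        table: str = "chunks",
                        column: str = "embedding") -> dict:
    """Count stored vectors per dimension, inferred from BLOB byte length."""
    # table/column are hypothetical names; SQL identifiers cannot be bound
    # as parameters, so they are quoted and interpolated here.
    query = f'SELECT length("{column}") FROM "{table}" WHERE "{column}" IS NOT NULL'
    counts = Counter()
    for (nbytes,) in conn.execute(query):
        counts[nbytes // FLOAT32_BYTES] += 1
    return dict(counts)
```

A result with more than one key, or with a single key that differs from the active provider's dimension, would indicate a stale index.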

Environment

  • Clawdbot v2026.1.24-3
  • SQLite memory backend (main.sqlite)
  • Gemini gemini-embedding-001 (3072-dim) vs local embeddinggemma-300M (300-dim)
  • Discovered while running 5 agents in production, 3 of which had stale indexes

Workaround

Delete main.sqlite and let the system re-index all memory files from scratch. This works but loses any entries not backed by source files.
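Because the workaround can lose entries that have no source file, it may be worth keeping a copy of the database before deleting it. A minimal sketch, assuming the agent is stopped first and that the path passed in points at the agent's main.sqlite (the function name and the backup naming scheme are illustrative only):

```python
import shutil
import time
from pathlib import Path


def backup_and_remove(db_path) -> Path:
    """Copy the index next to itself with a timestamp suffix, then delete the original."""
    src = Path(db_path)
    backup = src.with_name(f"{src.name}.{time.strftime('%Y%m%d-%H%M%S')}.bak")
    shutil.copy2(src, backup)  # copy2 also preserves file timestamps
    src.unlink()               # the system then re-indexes from scratch
    return backup
```

If the database runs in WAL mode, SQLite also keeps -wal and -shm sidecar files next to it; this sketch does not handle them, which is one reason to stop the agent before running it.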
