
memory-lancedb: support OpenAI-compatible local embeddings (baseUrl + dimensions) and fix vector-size mismatch #17564

@lumenradley

Description


Issue Prompt: memory-lancedb local embeddings support (LM Studio)

Hello!
I want to use my local LM Studio embeddings model on my Mac, so I asked an LLM to solve this and write it up as a prompt, so you can generate the solution yourselves too. The fix was tested and works; everything below this point is AI-written. Let me know what you think of this approach; I'm happy to participate more.


Problem

memory-lancedb is hard to use with local OpenAI-compatible embedding endpoints (for example, LM Studio or Ollama proxies).

Before fix:

  • Config only allowed OpenAI model enum values.
  • No embedding.baseUrl for custom endpoints.
  • No embedding.dimensions override for non-OpenAI models.
  • Using the OpenAI SDK for embedding calls against LM Studio could, in practice, yield vectors whose size did not match the LanceDB schema.

Reproduction

  1. Configure plugins.slots.memory = "memory-lancedb".
  2. Set embedding model to local nomic GGUF (text-embedding-nomic-embed-text-v1.5@q8_0) on LM Studio.
  3. Run memory tools (memory_store then memory_recall).
  4. Observe LanceDB query errors due to a vector dimension mismatch.
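A configuration along these lines reproduces the setup (the exact field shape is illustrative; `http://localhost:1234/v1` is LM Studio's default endpoint, and the nomic v1.5 model produces 768-dimensional vectors):

```json
{
  "plugins": {
    "slots": { "memory": "memory-lancedb" },
    "memory-lancedb": {
      "embedding": {
        "model": "text-embedding-nomic-embed-text-v1.5@q8_0",
        "baseUrl": "http://localhost:1234/v1",
        "dimensions": 768
      }
    }
  }
}
```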

Expected behavior

  • Plugin accepts any OpenAI-compatible model id.
  • Plugin can target custom embedding endpoint via embedding.baseUrl.
  • Plugin can pin vector size via embedding.dimensions.
  • Stored/search vectors use consistent dimension and work end-to-end with local providers.

Proposed solution

  1. Extend plugin config schema:
    • embedding.baseUrl?: string
    • embedding.dimensions?: integer
    • remove hardcoded embedding.model enum restriction
  2. Use configured dimensions when initializing LanceDB vector schema.
  3. Replace the openai SDK with raw HTTP fetch() to /embeddings for broader local-provider compatibility. Add AbortSignal.timeout(30_000) to prevent indefinite hangs when a local server is unresponsive. Remove the openai dependency from package.json.
  4. Split embedding dimensions into two concerns:
    • requestDimensions: only sent in the API request body when the user explicitly sets embedding.dimensions in config. This avoids breaking local providers that reject unknown request fields.
    • expectedDimensions: always validated against the response vector length (derived from LanceDB schema). This catches dimension mismatches early instead of surfacing as cryptic LanceDB errors.
  5. Support ${ENV_VAR} substitution in embedding.baseUrl (same pattern as embedding.apiKey).
  6. Validate embedding.dimensions range (1..32768) and reject empty/whitespace baseUrl.
  7. Update plugin manifest (openclaw.plugin.json) UI hints and JSON schema.
  8. Add tests for:
    • custom model + baseUrl + dimensions
    • env var resolution for baseUrl
    • unknown model requiring dimensions
    • invalid dimensions (type, range: 0, fractional)
    • empty baseUrl rejection
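A minimal sketch of steps 3 and 4, assuming the shape described above; names like `embedText` and `EmbedOptions` are illustrative, not the plugin's actual API:

```typescript
// Sketch of replacing the openai SDK with a raw fetch() call (steps 3-4).

interface EmbedOptions {
  baseUrl: string;             // e.g. "http://localhost:1234/v1"
  apiKey?: string;
  model: string;
  requestDimensions?: number;  // sent only when embedding.dimensions is configured
  expectedDimensions: number;  // derived from the LanceDB schema, always checked
}

/** Build the /embeddings request body; `dimensions` is omitted unless explicitly
 *  configured, so local providers that reject unknown fields keep working. */
function buildEmbeddingBody(text: string, opts: EmbedOptions): Record<string, unknown> {
  const body: Record<string, unknown> = { model: opts.model, input: text };
  if (opts.requestDimensions !== undefined) body.dimensions = opts.requestDimensions;
  return body;
}

async function embedText(text: string, opts: EmbedOptions): Promise<number[]> {
  const res = await fetch(`${opts.baseUrl}/embeddings`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      ...(opts.apiKey ? { Authorization: `Bearer ${opts.apiKey}` } : {}),
    },
    body: JSON.stringify(buildEmbeddingBody(text, opts)),
    // Fail fast instead of hanging when a local server is unresponsive.
    signal: AbortSignal.timeout(30_000),
  });
  if (!res.ok) throw new Error(`/embeddings request failed: ${res.status}`);

  const json = (await res.json()) as { data: { embedding: number[] }[] };
  const vector = json.data[0].embedding;
  // Catch misconfiguration here rather than as a cryptic LanceDB error.
  if (vector.length !== opts.expectedDimensions) {
    throw new Error(
      `embedding dimension mismatch: got ${vector.length}, expected ${opts.expectedDimensions}`,
    );
  }
  return vector;
}
```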
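Steps 5 and 6 amount to a few small helpers; a sketch under the same assumptions (function names are illustrative):

```typescript
// Illustrative helpers for env-var substitution and config validation (steps 5-6).

/** Substitute ${VAR} references from process.env, e.g. "http://${LM_HOST}:1234/v1". */
function resolveEnvVars(value: string): string {
  return value.replace(/\$\{(\w+)\}/g, (_, name: string) => process.env[name] ?? "");
}

/** embedding.dimensions must be an integer in 1..32768. */
function isValidDimensions(d: unknown): boolean {
  return typeof d === "number" && Number.isInteger(d) && d >= 1 && d <= 32768;
}

/** embedding.baseUrl must not be empty or whitespace-only. */
function isValidBaseUrl(url: string): boolean {
  return url.trim().length > 0;
}
```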

Files touched (fork reference)

  • extensions/memory-lancedb/config.ts
  • extensions/memory-lancedb/index.ts
  • extensions/memory-lancedb/openclaw.plugin.json
  • extensions/memory-lancedb/index.test.ts
  • extensions/memory-lancedb/package.json (remove openai dependency)
  • pnpm-lock.yaml (lockfile cleanup)

Acceptance criteria

  • memory-lancedb can run with LM Studio embeddings.
  • memory_store, memory_recall, memory_forget work with local endpoint.
  • Plugin config validates correctly for both OpenAI hosted and local OpenAI-compatible providers.
  • Existing OpenAI usage remains backward compatible (no dimensions sent in request body for default models).

Validation performed (fork)

  • Unit tests: pnpm vitest run extensions/memory-lancedb/index.test.ts — 15/15 passed
  • Live smoke (LM Studio on Tailscale, text-embedding-nomic-embed-text-v1.5@q8_0, 768 dims):
    • memory_store -> created
    • memory_recall -> found stored memory
    • memory_forget -> deleted stored memory
    • Auto-recall hook: memory-lancedb: injecting 3 memories into context
    • Graceful degradation: app kept running when embedding server had no model loaded, auto-recovered once model was available
  • Full suite: pnpm build && pnpm check && pnpm test — pass (8 pre-existing upstream failures only)

Risk notes

  • Local providers may vary in /embeddings behavior; dimension enforcement surfaces misconfiguration early.
  • For custom models, embedding.dimensions should be explicitly set.
  • The openai SDK included automatic retry logic for transient errors (429, 500). The raw fetch replacement does not retry, but for a memory plugin making ~1 call per message, rate limiting is unlikely. Failures degrade gracefully via the existing try/catch in lifecycle hooks.
