[Feature] Support standalone ChromaDB HTTP Client to prevent SQLite concurrency crashes #832

@TengJoe

🐛 Problem / Issue

When running mcp_server and mempalace mine simultaneously (or running multiple concurrent miners), users frequently encounter fatal crashes such as:
chromadb.errors.InternalError: Error in compaction: Failed to apply logs to the hnsw segment writer

This occurs because ChromaBackend is currently hardcoded to instantiate chromadb.PersistentClient() unconditionally. ChromaDB's local persistence layer (the embedded SQLite database plus the HNSW segment writers) is not safe for concurrent writers, whether they are threads or separate processes. As a result, running the MCP server and the mining/sync routines concurrently lets several writers contend for the same segments, which can corrupt the index data.

💡 Solution / Changes

To support advanced workflows, MemPalace needs the ability to connect to a standalone Chroma Server via HTTP (chroma run), which handles concurrent queries and index serialization safely.

Changes introduced in this PR:

  1. Modified config.py to add chroma_server_host and chroma_server_port properties, read from environment variables (MEMPALACE_CHROMA_HOST, MEMPALACE_CHROMA_PORT) or from config.json.
  2. Modified ChromaBackend.get_collection() in backends/chroma.py. If a host is configured, the backend skips the local filesystem setup (directory creation, chmod, _fix_blob_seq_ids) and uses chromadb.HttpClient instead of chromadb.PersistentClient.
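The branch described in step 2 can be sketched in isolation. This is a minimal illustration rather than the PR's code: `select_client_kind` is a hypothetical helper, and it only reports which client type and arguments would be used, so it runs without ChromaDB installed.

```python
from typing import Optional, Tuple


def select_client_kind(
    host: Optional[str], port: int, palace_path: str
) -> Tuple[str, dict]:
    """Return the client type name and constructor kwargs the backend would use.

    Hypothetical helper for illustration only; the PR performs this branch
    inline inside ChromaBackend.get_collection().
    """
    if host:
        # Remote mode: no local directory checks, chmod, or SQLite fix-ups.
        return "HttpClient", {"host": host, "port": port}
    # Embedded mode: unchanged behaviour, a local persistent store.
    return "PersistentClient", {"path": palace_path}


if __name__ == "__main__":
    print(select_client_kind("localhost", 8000, "/tmp/palace"))
    print(select_client_kind(None, 8000, "/tmp/palace"))
```

With chromadb installed, `getattr(chromadb, kind)(**kwargs)` would build the corresponding client. An empty host string is treated the same as no host, matching the `if host:` check in the diff.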

Important usage note for anyone testing this PR:
If you use a global web proxy (e.g. HTTP_PROXY), set NO_PROXY=localhost,127.0.0.1 in your local environment. Otherwise the httpx client underlying HttpClient may route localhost traffic through the proxy, resulting in 502 Bad Gateway errors.
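For setups where the environment cannot easily be changed in the shell, the same effect can be approximated from Python before the client is created. The sketch below assumes the HTTP library reads the standard proxy environment variables at client construction time; `ensure_no_proxy` is a hypothetical helper, not part of MemPalace or ChromaDB.

```python
import os


def ensure_no_proxy(hosts=("localhost", "127.0.0.1")) -> str:
    """Add the given hosts to NO_PROXY without dropping existing entries."""
    current = os.environ.get("NO_PROXY") or os.environ.get("no_proxy") or ""
    entries = [e.strip() for e in current.split(",") if e.strip()]
    for host in hosts:
        if host not in entries:
            entries.append(host)
    value = ",".join(entries)
    # Set both spellings; tools differ in which one they read.
    os.environ["NO_PROXY"] = value
    os.environ["no_proxy"] = value
    return value
```

The helper is idempotent, so calling it at the top of each entry point (server and miner) should be harmless. It must run before the HTTP client is constructed to have any effect.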

💻 Code Changes (Files)

1. mempalace/config.py

@@ -174,6 +174,22 @@
         return self._file_config.get("hall_keywords", DEFAULT_HALL_KEYWORDS)
 
+    @property
+    def chroma_server_host(self):
+        """Host for ChromaDB standalone server. If set, HttpClient is used."""
+        env_val = os.environ.get("MEMPALACE_CHROMA_HOST")
+        if env_val:
+            return env_val
+        return self._file_config.get("chroma_server_host")
+
+    @property
+    def chroma_server_port(self):
+        """Port for ChromaDB standalone server."""
+        env_val = os.environ.get("MEMPALACE_CHROMA_PORT")
+        if env_val:
+            return int(env_val)
+        return self._file_config.get("chroma_server_port", 8000)
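The precedence implemented by these two properties (environment variable first, then config.json, then the default port 8000) can be exercised on its own. The following is a sketch under assumptions: `resolve_chroma_server` is a hypothetical standalone function, and unlike the diff above it also converts a config-file port to `int` and rejects out-of-range values, which the PR does not do.

```python
import os
from typing import Optional

DEFAULT_CHROMA_PORT = 8000


def resolve_chroma_server(file_config: dict, env=None):
    """Return (host, port); host is None when embedded mode should be used."""
    env = os.environ if env is None else env
    # Environment variables win over config.json, as in the PR.
    host: Optional[str] = env.get("MEMPALACE_CHROMA_HOST") or file_config.get(
        "chroma_server_host"
    )
    raw_port = env.get("MEMPALACE_CHROMA_PORT") or file_config.get(
        "chroma_server_port", DEFAULT_CHROMA_PORT
    )
    try:
        port = int(raw_port)
    except (TypeError, ValueError):
        # Extra validation added for this sketch; not part of the PR.
        raise ValueError(f"Invalid Chroma port: {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"Chroma port out of range: {port}")
    return host, port
```

Passing `env` explicitly keeps the function easy to test without mutating the process environment. Failing early on a malformed port gives a clearer error than a connection failure later on.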

2. mempalace/backends/chroma.py

@@ -7,6 +7,7 @@
 
 import chromadb
 
+from ..config import MempalaceConfig
 from .base import BaseCollection
 
 logger = logging.getLogger(__name__)
@@ -71,18 +72,24 @@
     """Factory for MemPalace's default ChromaDB backend."""
 
     def get_collection(self, palace_path: str, collection_name: str, create: bool = False):
-        if not create and not os.path.isdir(palace_path):
-            raise FileNotFoundError(palace_path)
-
-        if create:
-            os.makedirs(palace_path, exist_ok=True)
-            try:
-                os.chmod(palace_path, 0o700)
-            except (OSError, NotImplementedError):
-                pass
-
-        _fix_blob_seq_ids(palace_path)
-        client = chromadb.PersistentClient(path=palace_path)
+        cfg = MempalaceConfig()
+        host = cfg.chroma_server_host
+
+        if host:
+            client = chromadb.HttpClient(host=host, port=cfg.chroma_server_port)
+        else:
+            if not create and not os.path.isdir(palace_path):
+                raise FileNotFoundError(palace_path)
+
+            if create:
+                os.makedirs(palace_path, exist_ok=True)
+                try:
+                    os.chmod(palace_path, 0o700)
+                except (OSError, NotImplementedError):
+                    pass
+
+            _fix_blob_seq_ids(palace_path)
+            client = chromadb.PersistentClient(path=palace_path)
         if create:
             collection = client.get_or_create_collection(collection_name)
