What happened?
tool_check_duplicate in mcp_server.py and Layer3.search() in layers.py can return negative similarity scores for very dissimilar content.
Both use round(1 - dist, 3) to convert a ChromaDB cosine distance into a similarity score. With hnsw:space=cosine, ChromaDB distances are in the range [0, 2], not [0, 1]. A distance of 1.0 corresponds to unrelated (orthogonal) embeddings, and 2.0 to embeddings pointing in opposite directions. For very dissimilar content the distance can therefore slightly exceed 1.0, making 1 - dist negative.
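To illustrate the range, here is a minimal sketch that computes cosine distance by hand, assuming the usual definition (1 minus cosine similarity). The cosine_distance helper and the toy 2-D vectors are made up for this example; real embeddings have far more dimensions, but the arithmetic is the same.

```python
import math

def cosine_distance(a, b):
    """Cosine distance as 1 - cosine similarity; ranges over [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy vectors (illustrative only, not real embeddings).
same = cosine_distance([1.0, 0.0], [1.0, 0.0])                # same direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])          # unrelated
slightly_opposed = cosine_distance([1.0, 0.0], [-0.05, 1.0])  # just past orthogonal
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])           # opposite direction

print(same, orthogonal, opposite)
print(slightly_opposed, round(1 - slightly_opposed, 3))  # distance > 1, similarity < 0
```

Any pair of vectors whose angle is even slightly greater than 90 degrees produces a distance above 1.0, which is why an unclamped 1 - dist can go below zero.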
The rest of the codebase already has the correct pattern: searcher.py line 285 uses round(max(0.0, 1 - dist), 3). The two affected sites are missing the max(0.0, ...) clamp.
Affected lines:
- mempalace/mcp_server.py, tool_check_duplicate(): similarity = round(1 - dist, 3)
- mempalace/layers.py, Layer3.search(): similarity = round(1 - dist, 3)
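A possible shape for the fix, sketched under the assumption that both sites can share one small helper. The name distance_to_similarity is hypothetical and does not exist in the codebase; the expression inside it is the clamp already used in searcher.py.

```python
def distance_to_similarity(dist: float) -> float:
    """Hypothetical helper: convert a cosine distance to a non-negative similarity."""
    # Same clamp as searcher.py: distances above 1.0 map to 0.0 instead of going negative.
    return round(max(0.0, 1 - dist), 3)

print(distance_to_similarity(1.0035))  # 0.0
print(distance_to_similarity(0.25))    # 0.75
```

Inlining round(max(0.0, 1 - dist), 3) at each of the two sites would work equally well; a shared helper just keeps the three call sites from drifting apart again.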
What did you expect?
Similarity scores should always be in [0.0, 1.0]. A score of -0.004 is meaningless and could confuse AI clients that read or display the value (e.g. tool_check_duplicate returns the similarity in its JSON response, and Layer3 renders it as (sim=-0.004) in the memory context block).
How to reproduce:
- Install mempalace and run the snippet below (no palace required; it uses an in-memory ChromaDB client).
- Run:
import chromadb
client = chromadb.Client()
col = client.get_or_create_collection("bug_demo", metadata={"hnsw:space": "cosine"})
col.add(
    ids=["drawer_1"],
    documents=["The Pythagorean theorem states that a^2 + b^2 = c^2 in a right triangle."],
    metadatas=[{"wing": "math", "room": "geometry"}],
)
results = col.query(
    query_texts=["Chocolate cake recipe with vanilla frosting and strawberries."],
    n_results=1,
    include=["distances"],
)
dist = results["distances"][0][0]
print(f"distance : {dist:.4f}")  # e.g. 1.0035
print(f"similarity: {round(1 - dist, 3)}")  # e.g. -0.004 ← negative!
- Observe that similarity is negative.
Output:
distance : 1.0035
similarity: -0.004
Environment:
- OS: macOS
- Python version: 3.11.9
- MemPal version: git SHA 2792ce8 (develop)