
fix: replace hardcoded limit=10000 with dynamic col.count() in navigation tools#563

Open
rolldav wants to merge 1 commit into MemPalace:develop from rolldav:fix/navigation-limit-10000

Conversation


@rolldav rolldav commented Apr 10, 2026

Problem

Navigation tools (tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy, tool_diary_read) use col.get(include=["metadatas"], limit=10000) to scan metadata. Palaces with more than 10,000 drawers silently lose wings and rooms from the response.

Observed behavior: A palace with 16,003 drawers across 3 wings returns only the first 10,000 entries (1 wing) from mempalace_status and mempalace_list_wings. The other 2 wings (6,003 drawers) are invisible in navigation tools.

Search is NOT affected: mempalace_search uses ChromaDB's vector query, which has no such limit.

Fix

Replace limit=10000 with limit=max(col.count(), 1) in all 5 affected functions. This scales the limit dynamically with the actual collection size, following the ChromaDB batch iteration pattern.

The max(..., 1) guard handles the edge case of an empty collection where col.count() returns 0.
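The pattern can be sketched with a stub collection standing in for a real ChromaDB client (StubCollection and list_wings are hypothetical names; only the count()/get() calls mirror the actual fix):

```python
# Sketch of the dynamic-limit fix, using a stub instead of ChromaDB.

class StubCollection:
    """Mimics the two ChromaDB calls the fix relies on."""
    def __init__(self, metadatas):
        self._metadatas = metadatas

    def count(self):
        return len(self._metadatas)

    def get(self, include=None, limit=None):
        # ChromaDB truncates results at `limit`, which is why a
        # hardcoded limit=10000 silently drops drawers.
        return {"metadatas": self._metadatas[:limit]}

def list_wings(col):
    # limit scales with collection size; max(..., 1) guards the
    # empty-collection case where count() returns 0.
    all_meta = col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"]
    wings = {}
    for m in all_meta:
        w = m.get("wing", "unknown")
        wings[w] = wings.get(w, 0) + 1
    return wings

col = StubCollection([{"wing": "a"}] * 12000 + [{"wing": "b"}] * 4003)
print(list_wings(col))  # → {'a': 12000, 'b': 4003}
```

With a hardcoded limit=10000, the same stub would report only {"a": 10000}, which is the truncation described above.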

Testing

Verified on a production palace with 16,003 drawers across 3 wings:

  • Before: mempalace_list_wings → {"sos_app": 10000}
  • After: mempalace_list_wings → {"sos_app": 12585, "sos_medical_ai": 2553, "sos_playbooks": 865}

Files Changed

  • mempalace/mcp_server.py — 5 occurrences of limit=10000 replaced with limit=max(col.count(), 1)

Navigation tools (tool_status, tool_list_wings, tool_list_rooms,
tool_get_taxonomy, tool_diary_read) use col.get(limit=10000) to
scan metadata. Palaces with >10,000 drawers silently lose wings
and rooms from the response.

Replace with max(col.count(), 1) so the limit scales with the
actual collection size.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copilot AI review requested due to automatic review settings April 10, 2026 19:32
Contributor

Copilot AI left a comment


Pull request overview

This PR fixes navigation tools that previously truncated metadata scans at 10,000 ChromaDB records, which caused wings/rooms to be missing for larger “palaces”.

Changes:

  • Replaced limit=10000 with limit=max(col.count(), 1) in multiple metadata-scanning tools.
  • Updated tool_list_rooms to compute the limit dynamically via kwargs.
  • Updated tool_diary_read to use a dynamic limit when fetching diary entries.


Comment thread mempalace/mcp_server.py
Comment on lines 143 to +147

      count = col.count()
      wings = {}
      rooms = {}
      try:
-         all_meta = col.get(include=["metadatas"], limit=10000)["metadatas"]
+         all_meta = col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"]

Comment thread mempalace/mcp_server.py
Comment on lines 200 to 206

      if not col:
          return _no_palace()
      wings = {}
      try:
-         all_meta = col.get(include=["metadatas"], limit=10000)["metadatas"]
+         all_meta = col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"]
          for m in all_meta:
              w = m.get("wing", "unknown")

Comment thread mempalace/mcp_server.py
Comment on lines 218 to 222

      try:
-         kwargs = {"include": ["metadatas"], "limit": 10000}
+         kwargs = {"include": ["metadatas"], "limit": max(col.count(), 1)}
          if wing:
              kwargs["where"] = {"wing": wing}
          all_meta = col.get(**kwargs)["metadatas"]

Comment thread mempalace/mcp_server.py
Comment on lines 233 to 238

      if not col:
          return _no_palace()
      taxonomy = {}
      try:
-         all_meta = col.get(include=["metadatas"], limit=10000)["metadatas"]
+         all_meta = col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"]
          for m in all_meta:

Comment thread mempalace/mcp_server.py
Comment on lines 553 to 558

      try:
          results = col.get(
              where={"$and": [{"wing": wing}, {"room": "diary"}]},
              include=["documents", "metadatas"],
-             limit=10000,
+             limit=max(col.count(), 1),
          )

Comment thread mempalace/mcp_server.py
Comment on lines 203 to 205

      try:
-         all_meta = col.get(include=["metadatas"], limit=10000)["metadatas"]
+         all_meta = col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"]
          for m in all_meta:

@web3guru888 web3guru888 left a comment


This is a real bug with real consequences — 16,003 drawers across 3 wings returning only 1 wing from mempalace_list_wings is a silent data loss failure. The before/after in the PR description makes the impact clear.

The fix is correct. limit=max(col.count(), 1) scales dynamically with actual collection size, and the max(..., 1) guard correctly handles the empty-collection edge case where col.count() returns 0 (ChromaDB returns an error on limit=0).

One observation: calling col.count() before col.get() means two round-trips to ChromaDB per navigation call instead of one. For large collections that's probably fine (the get itself is the expensive part), but worth noting if this ever becomes a hot path. An alternative would be to fetch with a large-but-bounded limit and paginate, but that's a more complex change and the current fix is the right call for now.

All 5 affected functions are covered. LGTM.


[MemPalace-AGI integration — production stats at https://milla-jovovich.github.io/mempalace/integrations/mempalace-agi/]

@web3guru888

This fix is long overdue — we hit the 10K limit ourselves on a palace with ~16K drawers and spent a while debugging why tool_list_wings was silently returning an incomplete set. Not a fun bug to diagnose.

The limit=max(col.count(), 1) pattern is the right approach. One thing worth confirming in the implementation: does iter_metadatas() (from PR #482, if that's what you're building on) use offset-based pagination consistently across all five navigation tools, or does each tool still implement its own loop?

The 10K case is the most visible failure mode, but there's a subtler one: if col.count() itself is called on every page of a paginated scan, you're adding a small but consistent overhead per batch. For the navigation tools this is fine (infrequent calls), but worth noting for diary-read which can be called in tight loops.

Also — has this been tested against palaces that are just over the limit (e.g. 10,001 drawers)? The off-by-one at the exact boundary is worth a test case.
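The boundary case suggested here can be sketched as a test against a stub that truncates at `limit` the way ChromaDB's col.get() does (FakeCol, scan_old, and scan_new are hypothetical names, not part of this PR):

```python
# Hypothetical boundary test for the 10,001-drawer case.

class FakeCol:
    def __init__(self, n):
        self._meta = [{"wing": "w"}] * n

    def count(self):
        return len(self._meta)

    def get(self, include=None, limit=None):
        # Truncate at `limit`, as ChromaDB does.
        return {"metadatas": self._meta[:limit]}

def scan_old(col):  # pre-fix behaviour: hardcoded limit
    return len(col.get(include=["metadatas"], limit=10000)["metadatas"])

def scan_new(col):  # post-fix behaviour: dynamic limit
    return len(col.get(include=["metadatas"], limit=max(col.count(), 1))["metadatas"])

col = FakeCol(10001)
assert scan_old(col) == 10000  # one drawer silently dropped at the boundary
assert scan_new(col) == 10001  # boundary case covered by the fix
```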

@Formatted

This conflicts with #482, which takes a different approach to the same problem. Rather than scaling the limit dynamically, #482 extracts a shared iter_metadatas(col, where=None, batch=500) generator into palace.py that pages through the collection in 500-item batches — and consolidates mcp_server.py, palace_graph.py, and miner.py onto it. The limit=max(col.count(), 1) fix works for correctness but on a 125k-drawer palace it issues one col.get() returning 125k records at once, whereas pagination keeps memory flat regardless of palace size.

@jphein
Collaborator

jphein commented Apr 10, 2026

Note: #562 includes the same fix — col.count() + paginated metadata fetch with thousands-separator formatting. Our implementation paginates in batches of 10K with offset rather than a single limit=max(col.count(), 1) call, which avoids loading the entire collection into memory at once for very large palaces.

@web3guru888

@jphein — good point on the batch-with-offset approach. The limit=max(col.count(), 1) pattern in this PR does fix the correctness issue but materialises the entire collection in one call — on a 125K-drawer palace that's a significant memory spike.

The paginated-batches approach in #562 (10K offsets) is architecturally sounder for large palaces: bounded peak memory, avoids ChromaDB's serialisation overhead on very large response payloads. The trade-off is a bit more code complexity (N round trips vs one), but for a production memory tool the memory bound matters more.

Given that #562 supersedes this fix and adds pagination on top, I'd say this PR's value is as a minimal standalone patch for anyone who needs just this one fix without the full #562 scope — but if #562 merges, this becomes redundant. Worth noting that in the PR description for clarity.

@jphein
Collaborator

jphein commented Apr 11, 2026

Already fixed in #562 — status uses col.count() + paginated metadata fetch instead of the hardcoded limit=10000. Same approach.

@bensig bensig changed the base branch from main to develop April 11, 2026 22:21
@igorls igorls added area/mcp MCP server and tools bug Something isn't working labels Apr 14, 2026