fix: batch ChromaDB reads to prevent silent truncation on large palaces #132
Closed
SoundMindsAI wants to merge 1 commit into MemPalace:main from
col.get() without an explicit limit applies ChromaDB's small internal default, silently dropping drawers from status, Layer1, MCP read tools, miner status, and diary reads. Replace all unbounded col.get() calls with a new get_all() helper that paginates in safe batches. Also fix tilde expansion in config.py palace_path resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator
Duplicate of #66, which covers batched ChromaDB reads. Closing.
Author
@bensig Thanks for the quick review. I took a closer look at #66 and I believe this PR covers significantly more ground. Would you mind reconsidering?
What #66 fixes: Inline batching in
What this PR fixes in addition:
Happy to adjust anything if needed. I just wanted to make sure the additional coverage wasn't lost.
Author
Summary
Closes #40.
ChromaDB's `col.get()` without an explicit `limit` applies a small internal default (currently 10). On palaces with more drawers than that default, every read-only tool (`mempalace status`, Layer1 wake-up, and all MCP read tools: status, list_wings, list_rooms, get_taxonomy, diary_read) silently returns incomplete results. Users see missing wings, undercounted rooms, and truncated Layer1 context with no error or warning. The `miner.py` status function had a `limit=10000` that would similarly break once a palace exceeded 10k drawers.

PR #120 attempted to address these issues alongside several other fixes but was closed. PR #114 and #119 fixed related problems (shell injection, ChromaDB pin) but did not address the unbounded read truncation.
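As an illustration of the batched read this PR describes, here is a minimal sketch of offset-based pagination. It is not the code from `chromadb_utils.py`: the signature, the `batch_size` parameter, and the `FakeCollection` stand-in are assumptions made for this example. The sketch relies only on the collection's `get()` accepting `limit`, `offset`, `where`, and `include`.

```python
from typing import Any, Dict, List, Optional


def get_all(col: Any, where: Optional[dict] = None,
            include: Optional[List[str]] = None,
            batch_size: int = 500) -> Dict[str, list]:
    """Read every matching record by paging through the collection.

    Hypothetical signature; the helper in the PR may differ.
    """
    include = include or ["documents", "metadatas"]
    merged: Dict[str, list] = {"ids": []}
    for key in include:
        merged[key] = []

    offset = 0
    while True:
        kwargs: Dict[str, Any] = {
            "limit": batch_size, "offset": offset, "include": include,
        }
        if where:
            kwargs["where"] = where
        page = col.get(**kwargs)
        ids = page.get("ids") or []
        if not ids:
            break  # nothing left to read
        merged["ids"].extend(ids)
        for key in include:
            merged[key].extend(page.get(key) or [])
        if len(ids) < batch_size:
            break  # a short page means this was the last one
        offset += len(ids)
    return merged


class FakeCollection:
    """In-memory stand-in (assumption) that mimics a limit/offset get()."""

    def __init__(self, n: int) -> None:
        self._rows = [
            (f"d{i}", f"doc {i}", {"wing": "a" if i % 2 == 0 else "b"})
            for i in range(n)
        ]

    def get(self, limit: int = 10, offset: int = 0,
            where: Optional[dict] = None,
            include: Optional[List[str]] = None) -> Dict[str, list]:
        rows = self._rows
        if where:
            rows = [r for r in rows
                    if all(r[2].get(k) == v for k, v in where.items())]
        page = rows[offset:offset + limit]
        return {
            "ids": [r[0] for r in page],
            "documents": [r[1] for r in page],
            "metadatas": [r[2] for r in page],
        }


everything = get_all(FakeCollection(25), batch_size=10)
wing_a = get_all(FakeCollection(25), where={"wing": "a"}, batch_size=10)
```

With 25 records and a batch size of 10, the loop issues three reads and stops on the short final page, so no record is dropped or duplicated. The `where` filter is passed to every page, so filtered reads paginate the same way.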
- Add `chromadb_utils.get_all()` helper that paginates in safe batches, and replace all unbounded `col.get()` call sites
- Fix tilde expansion in `config.py` palace_path resolution (paths like `~/.mempalace/palace` from config file or env var were passed unexpanded to ChromaDB, causing "palace not found" errors)

Changes

New file:
- `mempalace/chromadb_utils.py`: batched `get_all()` utility

Fixed call sites:
- `cli.py`: status command
- `layers.py`: Layer1 generation
- `mcp_server.py`: `tool_status`, `tool_list_wings`, `tool_list_rooms`, `tool_get_taxonomy`, `tool_diary_read`
- `miner.py`: `status()` (previously hardcoded `limit=10000`)

Config fix:
- `config.py`: `os.path.expanduser()` on palace_path from file config and env var

Documentation: Added `chromadb_utils.py` to file reference tables in `README.md` and `mempalace/README.md`

Test plan
- `test_chromadb_utils.py`: 7 tests (all records, docs+meta, where filter, empty collection, batching, no duplicates, filtered pagination)
- `test_layers.py`: 2 tests (Layer1 pulls from all rooms, wing filter excludes other wings)
- `test_mcp_server_reads.py`: 9 tests (status, list_wings, list_rooms, taxonomy, diary read with all entries and empty)
- `test_miner_status.py`: 2 tests (accurate drawer count, multiple wings reported)
- `test_config.py`: 2 new tests (tilde expansion from file config and env var)
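The config fix can be sketched as follows. This is a hedged illustration, not the code in `config.py`: the function name `resolve_palace_path`, the environment variable name `EXAMPLE_PALACE_PATH`, and the rule that the env var takes precedence over the file value are all assumptions made for this example. Only the use of `os.path.expanduser()` comes from the PR text.

```python
import os
from typing import Optional

# Path shape quoted in the PR description; used here as a fallback (assumption).
DEFAULT_PALACE_PATH = "~/.mempalace/palace"


def resolve_palace_path(file_value: Optional[str] = None,
                        env_var: str = "EXAMPLE_PALACE_PATH") -> str:
    """Pick a palace path and expand a leading ~ before it is used.

    The env var name and the env-over-file precedence are hypothetical.
    """
    raw = os.environ.get(env_var) or file_value or DEFAULT_PALACE_PATH
    # Without this call a literal "~" would be handed to the storage layer.
    return os.path.expanduser(raw)
```

Expanding the path once, at the point where configuration is resolved, means every caller receives the same absolute path regardless of whether it came from the file or the environment.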