fix: centralize embedding model config to prevent query/ingest mismatch#912
Open
OmkarKirpan wants to merge 6 commits intoMemPalace:developfrom
Open
fix: centralize embedding model config to prevent query/ingest mismatch#912OmkarKirpan wants to merge 6 commits intoMemPalace:developfrom
OmkarKirpan wants to merge 6 commits intoMemPalace:developfrom
Conversation
Single source of truth for embedding model resolution. Resolves from collection metadata, falls back to MiniLM for legacy palaces. New palaces default to all-mpnet-base-v2 (768-dim). Part of MemPalace#903
Resolves from config.json or MEMPALACE_EMBEDDING_MODEL env var. Used for new palace creation only; existing palaces read from collection metadata. Part of MemPalace#903
get_collection() and get_or_create_collection() now accept optional embedding_function and embedding_model_name params. Model name is stamped into collection metadata on create. Fully backwards compatible. Part of MemPalace#903
On create: stamps new_palace_model() (mpnet) into collection metadata. On read: resolves model from metadata, falls back to MiniLM for legacy. All collection access now uses the correct embedding function. Also fixes tests that opened bare PersistentClient instances without the correct embedding function, causing dimension mismatches (768 vs 384). Part of MemPalace#903
_get_collection() resolves the model from collection metadata and passes the correct embedding_function to ChromaDB. tool_status() reports the active embedding_model. Closes MemPalace#903
- ChromaBackend.create_collection() now accepts embedding_function and embedding_model_name params - cli.py repair, repair.py rebuild_index: read embedding model from existing collection before delete/recreate, preserve it - migrate.py: stamp new_palace_model() on migrated palaces - palace.get_collection(): accept optional config param so CLI mining respects config.json embedding_model setting - Update test_rebuild_index_success to verify new embedding args Addresses code review findings MemPalace#4, MemPalace#5, MemPalace#7 for MemPalace#903
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #903
Adds centralized embedding model configuration so the MCP server, CLI search, and all ingest paths use the same model — fixing silent query failures when models mismatch.
Design Decisions
Storage approach: Embedding model name stored in ChromaDB collection metadata (not a separate file). Atomic with the collection, can't desync. Absence of the key = legacy palace.
Default for new palaces:
all-mpnet-base-v2(768-dim) — better search quality (+3.5pp on LoCoMo R@10 benchmarks over MiniLM).Default for existing palaces:
all-MiniLM-L6-v2(384-dim) — backwards compatible, no re-mining required. Detected by absence ofembedding_modelkey in collection metadata.Resolution chain: Collection metadata (authoritative) > config file / env var (new palaces only) > built-in default. This means once a palace is created, its model is locked in and self-describing.
No migration tool in this PR: Re-embedding existing palaces from MiniLM to mpnet is a separate concern. This PR prevents the mismatch; migrating existing palaces is a follow-up.
All create paths stamp the model: Repair, rebuild, and migrate operations preserve the original model through the delete/recreate cycle.
Resolution chain
Files changed
mempalace/embedding.py— model registry, resolution, embedding function factorymempalace/config.py—embedding_modelproperty on MempalaceConfigmempalace/backends/chroma.py—get_collection(),get_or_create_collection(),create_collection()acceptembedding_function+embedding_model_namemempalace/palace.py— resolves model from metadata on read, stamps on createmempalace/mcp_server.py—_get_collection()uses correct embedding function,tool_status()reports active modelmempalace/cli.py,mempalace/repair.py,mempalace/migrate.py— all collection-create paths now stamp embedding modelKnown follow-ups (not in scope)
get_collectioncall on read paths for metadata resolution (~ms overhead, within budget)Test plan
mempalace_statusreports active embedding modelruff checkandruff formatclean