fix: centralize embedding model config to prevent query/ingest mismatch #912

Open
OmkarKirpan wants to merge 6 commits into MemPalace:develop from OmkarKirpan:fix/embedding-model-mismatch

OmkarKirpan commented Apr 15, 2026

Summary

Closes #903

Adds centralized embedding model configuration so the MCP server, CLI search, and all ingest paths use the same model — fixing silent query failures when models mismatch.
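
To make the failure mode concrete, here is a minimal sketch of a pre-query guard. The names `KNOWN_DIMS` and `check_query_model` are hypothetical and are not taken from `mempalace/embedding.py`; only the two model names and their dimensions come from this PR.

```python
# Illustrative only: these names are hypothetical, not the PR's actual API.
KNOWN_DIMS = {
    "all-MiniLM-L6-v2": 384,
    "all-mpnet-base-v2": 768,
}


def check_query_model(ingest_model: str, query_model: str) -> None:
    """Raise if a query would be embedded with a different model than ingest."""
    if ingest_model == query_model:
        return
    ingest_dim = KNOWN_DIMS.get(ingest_model)
    query_dim = KNOWN_DIMS.get(query_model)
    raise ValueError(
        f"embedding model mismatch: collection built with {ingest_model} "
        f"({ingest_dim}-dim), query uses {query_model} ({query_dim}-dim)"
    )
```

A guard like this turns a silent mismatch into an explicit error. The PR itself goes further and removes the possibility by always resolving the model from the collection.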

Design Decisions

  1. Storage approach: Embedding model name stored in ChromaDB collection metadata (not a separate file). Atomic with the collection, can't desync. Absence of the key = legacy palace.

  2. Default for new palaces: all-mpnet-base-v2 (768-dim) — better search quality (+3.5pp on LoCoMo R@10 benchmarks over MiniLM).

  3. Default for existing palaces: all-MiniLM-L6-v2 (384-dim) — backwards compatible, no re-mining required. Detected by absence of embedding_model key in collection metadata.

  4. Resolution chain: Collection metadata (authoritative) > config file / env var (new palaces only) > built-in default. This means once a palace is created, its model is locked in and self-describing.

  5. No migration tool in this PR: Re-embedding existing palaces from MiniLM to mpnet is a separate concern. This PR prevents the mismatch; migrating existing palaces is a follow-up.

  6. All create paths stamp the model: Repair, rebuild, and migrate operations preserve the original model through the delete/recreate cycle.
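
Decision 6 can be sketched with an in-memory stand-in for the vector store. `FakeStore` and `rebuild` are hypothetical names for illustration, not the code in `repair.py` or `cli.py`. Stamping MiniLM onto a rebuilt legacy palace is an assumption of this sketch; the PR only states that the original model is preserved.

```python
from typing import Dict, Optional

LEGACY_MODEL = "all-MiniLM-L6-v2"  # assumed when the metadata key is absent


class FakeStore:
    """In-memory stand-in for a vector store: collection name -> metadata."""

    def __init__(self) -> None:
        self.collections: Dict[str, Dict[str, str]] = {}

    def create(self, name: str, embedding_model: Optional[str] = None) -> None:
        # Stamp the model name into metadata only when one is given.
        metadata = {"embedding_model": embedding_model} if embedding_model else {}
        self.collections[name] = metadata

    def delete(self, name: str) -> None:
        del self.collections[name]


def rebuild(store: FakeStore, name: str) -> str:
    """Delete and recreate a collection, carrying its model name across."""
    # Read the model BEFORE deleting; afterwards the metadata is gone.
    model = store.collections[name].get("embedding_model", LEGACY_MODEL)
    store.delete(name)
    store.create(name, embedding_model=model)
    return model
```

The ordering is the point: the model name has to be read before the delete, otherwise the recreated collection would silently pick up the new-palace default.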

Resolution chain

1. Collection metadata "embedding_model" key (authoritative, stamped at build)
2. If absent → legacy palace → all-MiniLM-L6-v2
3. config.json "embedding_model" or MEMPALACE_EMBEDDING_MODEL env var → new palace creation only
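
The chain above can be sketched as two small functions. This is a simplified illustration, not the contents of `mempalace/embedding.py`: the function names are hypothetical, and letting the env var win over `config.json` is an assumption, since the PR does not state their relative precedence.

```python
import os
from typing import Mapping, Optional

LEGACY_MODEL = "all-MiniLM-L6-v2"       # palaces without the metadata key
NEW_PALACE_MODEL = "all-mpnet-base-v2"  # built-in default for new palaces


def resolve_existing(metadata: Optional[Mapping[str, str]]) -> str:
    """Existing collection: metadata is authoritative, absence means legacy."""
    return (metadata or {}).get("embedding_model") or LEGACY_MODEL


def resolve_for_new(
    config: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """New palace: env var, then config file, then the built-in default."""
    env = os.environ if env is None else env
    config = config or {}
    # Assumption: the env var takes precedence over config.json.
    return (
        env.get("MEMPALACE_EMBEDDING_MODEL")
        or config.get("embedding_model")
        or NEW_PALACE_MODEL
    )
```

Note that `resolve_existing` never consults config or environment, which is what makes an existing palace's model locked in.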

Files changed

  • New: mempalace/embedding.py - model registry, resolution, embedding function factory
  • Modified: mempalace/config.py - embedding_model property on MempalaceConfig
  • Modified: mempalace/backends/chroma.py - get_collection(), get_or_create_collection(), create_collection() accept embedding_function + embedding_model_name
  • Modified: mempalace/palace.py - resolves model from metadata on read, stamps on create
  • Modified: mempalace/mcp_server.py - _get_collection() uses correct embedding function, tool_status() reports active model
  • Modified: mempalace/cli.py, mempalace/repair.py, mempalace/migrate.py - all collection-create paths now stamp embedding model

Known follow-ups (not in scope)

Test plan

  • 936/936 tests pass locally
  • Legacy palace (no metadata key) resolves to MiniLM
  • New palace creation stamps mpnet in collection metadata
  • Config override and env var override work
  • Repair/migrate preserve embedding model through rebuild
  • mempalace_status reports active embedding model
  • ruff check and ruff format clean

Single source of truth for embedding model resolution.
Resolves from collection metadata, falls back to MiniLM for legacy palaces.
New palaces default to all-mpnet-base-v2 (768-dim).

Part of MemPalace#903

Resolves from config.json or MEMPALACE_EMBEDDING_MODEL env var.
Used for new palace creation only; existing palaces read from
collection metadata.

Part of MemPalace#903

get_collection() and get_or_create_collection() now accept optional
embedding_function and embedding_model_name params. Model name is
stamped into collection metadata on create. Fully backwards compatible.

Part of MemPalace#903

On create: stamps new_palace_model() (mpnet) into collection metadata.
On read: resolves model from metadata, falls back to MiniLM for legacy.
All collection access now uses the correct embedding function.

Also fixes tests that opened bare PersistentClient instances without
the correct embedding function, causing dimension mismatches (768 vs 384).

Part of MemPalace#903

_get_collection() resolves the model from collection metadata and
passes the correct embedding_function to ChromaDB. tool_status()
reports the active embedding_model.

Closes MemPalace#903

- ChromaBackend.create_collection() now accepts embedding_function
  and embedding_model_name params
- cli.py repair, repair.py rebuild_index: read embedding model from
  existing collection before delete/recreate, preserve it
- migrate.py: stamp new_palace_model() on migrated palaces
- palace.get_collection(): accept optional config param so CLI mining
  respects config.json embedding_model setting
- Update test_rebuild_index_success to verify new embedding args

Addresses code review findings MemPalace#4, MemPalace#5, MemPalace#7 for MemPalace#903
Labels

area/mcp (MCP server and tools), area/search (Search and retrieval), bug (Something isn't working)

Development

Successfully merging this pull request may close these issues.

bug: embedding model mismatch — MCP server uses MiniLM (384-dim) while ingest can use mpnet (768-dim)
