docs: RFC 001 — storage backend plugin specification#743
Formalizes the BaseCollection/BaseBackend contract introduced as a seam in #413 into an interchangeability spec that third-party backends can build to. Driven by six in-flight backend PRs (#574, #643, #665, #697, #700, #381), each implementing the interface differently. Key decisions captured: entry-point distribution, typed QueryResult/GetResult replacing the Chroma dict shape, daemon-first multi-palace model via PalaceRef, required where-clause subset (incl. $contains), mandatory embedder injection with model-identity validation, capability tokens, shared pytest conformance suite, and a backend-neutral migrate/verify CLI.
Pull request overview
This PR introduces RFC 001, a draft specification for MemPalace storage backend plugins. It formalizes a stable contract so third-party backends can be distributed as mempalace-<name> Python packages and loaded into MemPalace via entry-point discovery, enabling multi-backend / multi-palace (daemon-first) deployments.
Changes:
- Adds a full draft spec defining the `BaseCollection`/backend lifecycle contract, including typed `QueryResult`/`GetResult` shapes.
- Specifies backend discovery/selection (entry points + registry), configuration shape, and capability-token conventions.
- Defines required filter dialect behavior, migration/verification expectations, and a shared backend conformance test suite concept.
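The "unknown operators MUST raise" rule from the filter dialect can be sketched as a small validator. A hedged sketch: the operator set and the `UnsupportedFilterError` name come from this PR's summary, while the nested where-clause shape and the traversal are assumptions modeled on Chroma-style filters.

```python
# Sketch of the "silent drop forbidden" rule for the required filter subset.
# The nested where-clause shape is an assumption, not the spec's exact grammar.
SUPPORTED_OPS = {"$eq", "$ne", "$in", "$nin", "$and", "$or", "$contains"}

class UnsupportedFilterError(Exception):
    pass

def validate_where(where):
    if not isinstance(where, dict):
        return
    for key, value in where.items():
        if key.startswith("$") and key not in SUPPORTED_OPS:
            raise UnsupportedFilterError(key)  # silent drop forbidden
        if isinstance(value, list):
            for item in value:
                validate_where(item)
        else:
            validate_where(value)
```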
Copilot review flagged back-references in §1.4 and §6 that still used the pre-skuznetsov-rename names (`$contains_fast`, `sync_capable`, `change_feed`). Updated to the `supports_*` prefix used in the §2.1 capability table.
Incorporates review feedback from skuznetsov (Postgres, #665) and dekoza (Lance, #574) on issue #737:

- §1.5: split "accepts `embeddings=`" (signature compliance) from "persists embeddings as-is" (correctness). Adds a `supports_embeddings_passthrough` capability; the former is universal, the latter is required to label a migration lossless.
- §1.5: the model identity check becomes a three-state machine (known_match / known_mismatch / unknown) so legacy palaces without recorded identity don't hard-fail on upgrade.
- §1.4: makes explicit that `supports_contains_fast` is the ONLY performance floor the spec promises; without it, callers MUST assume O(n). `$contains` is a correctness requirement, not a performance one.
- §3.3: clarifies that auto-detect is an upgrade-compat path only, never the selection mechanism for new palaces.
- §8.2: the migrate CLI refuses to run against a target lacking `supports_embeddings_passthrough` unless `--accept-re-embed` is passed; the migration record now captures lossless status and model identities.
dekoza
left a comment
Reviewed from the LanceDB (#574) perspective. No show-stoppers — our implementation is already close to the spec's target shape. Notes below.
§1.1 — kwargs-only signatures
Our backends/lance.py already uses kwargs-only on add/upsert, and our BaseCollection ABC in backends/base.py does the same (work done in #575, easily backported). query/get/delete currently use **kwargs catch-all — those need explicit parameter lists to match the spec.
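For illustration, the explicit kwargs-only shapes we'd move `query`/`get`/`delete` to might look like the sketch below. Parameter names here are assumptions modeled on the Chroma-style API discussed in this thread, not the spec's exact lists.

```python
import abc

class BaseCollection(abc.ABC):
    # Illustrative kwargs-only signatures; the spec's actual parameter
    # lists may differ. The point is: no **kwargs catch-all.
    @abc.abstractmethod
    def query(self, *, query_texts=None, query_embeddings=None,
              n_results=10, where=None, where_document=None):
        ...

    @abc.abstractmethod
    def get(self, *, ids=None, where=None, where_document=None, limit=None):
        ...

    @abc.abstractmethod
    def delete(self, *, ids=None, where=None):
        ...
```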
§1.1 — query_embeddings and where_document
Our query() doesn't accept query_embeddings and neither query() nor get() accepts where_document. Both are straightforward for LanceDB:
- `query_embeddings`: skip the embed step, search the vector directly
- `where_document`: same `$contains` path (Tantivy FTS or `LIKE` fallback)
§1.1 — update() method
Our BaseCollection and LanceCollection already have an update() method (fetches existing records, merges changes, re-upserts). The spec's BaseCollection does not include update(), though supports_update exists as a capability token. Is update() intentionally excluded from the required interface, or an oversight? If it stays out, we'd keep it as a LanceDB-specific extension — but it seems generally useful.
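Roughly, the get+merge+upsert pattern our `update()` uses, as a toy sketch: storage here is a plain dict and the method names mirror this thread, so treat it as illustrative rather than our actual LanceCollection code.

```python
# Toy sketch of the fetch-existing / merge / re-upsert update() pattern.
class ToyCollection:
    def __init__(self):
        self._rows = {}  # id -> (document, metadata)

    def upsert(self, *, ids, documents, metadatas):
        for i, doc, meta in zip(ids, documents, metadatas):
            self._rows[i] = (doc, meta)

    def get(self, *, ids):
        return {i: self._rows[i] for i in ids if i in self._rows}

    def update(self, *, ids, documents=None, metadatas=None):
        # fetch existing records, merge the changed fields, re-upsert
        existing = self.get(ids=ids)
        new_docs, new_metas = [], []
        for n, i in enumerate(ids):
            doc, meta = existing[i]
            new_docs.append(documents[n] if documents else doc)
            merged = dict(meta)
            if metadatas:
                merged.update(metadatas[n])
            new_metas.append(merged)
        self.upsert(ids=ids, documents=new_docs, metadatas=new_metas)
```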
§1.3 — Typed results
Our return shapes already match QueryResult/GetResult field structure. Migration is wrapping dict construction in dataclass constructors. No concern.
§1.5 — Three-state model identity
We persist embedding_model per-record in metadata_json but not at collection level. Spec wants collection-level persistence + check on open. LanceDB supports table-level metadata via Arrow schema metadata — clean fit. The unknown state for legacy palaces is the right call; hard-failing on upgrade would break existing users.
§2.3 — BaseBackend class
We already have LanceBackend in backends/lance.py with a get_collection() factory method (from #575, easily backported to #574). It currently takes palace_path: str — adapting to PalaceRef is straightforward. It does not yet subclass a BaseBackend ABC (spec §2.3), but the method shape is close.
§3.3 — Auto-detection
Our backends/__init__.py already has detect_backend() sniffing for .lance directories. Adapting to a BaseBackend.detect(path) -> bool classmethod is trivial. Agree with the spec's framing that auto-detection is a migration compatibility path, not a selection mechanism.
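A hedged sketch of the classmethod shape; the actual sniffing logic in our `backends/__init__.py` may differ, and the class name here is illustrative.

```python
import pathlib

class LanceBackendSketch:
    # Illustrative detect() shape: a Lance palace is recognized by the
    # presence of *.lance table directories under the palace path.
    @classmethod
    def detect(cls, path) -> bool:
        p = pathlib.Path(path)
        return p.is_dir() and any(p.glob("*.lance"))
```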
§6 — Sync compatibility note
Our sync implementation uses a _raw=True flag on upsert() to skip sync metadata injection during sync apply. The spec's BaseCollection.upsert() has no such parameter, which is correct — sync concerns should not leak into the collection contract. This means we need to refactor sync injection out of LanceCollection into a wrapper layer, which aligns with §6's "sync is a separate subsystem" stance. Wanted to flag this since other backends implementing sync will face the same design question.
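The wrapper layer we'd refactor toward might look like the sketch below; sync apply would then call the inner collection directly instead of passing `_raw=True`. All names here are illustrative, not from the spec.

```python
import time

# Sketch: sync-metadata injection lives in a wrapper, so the collection's
# upsert() contract stays clean. The _sync_ts field name is made up.
class SyncInjectingCollection:
    def __init__(self, inner):
        self._inner = inner

    def upsert(self, *, ids, documents, metadatas):
        stamped = [dict(m, _sync_ts=time.time()) for m in metadatas]
        self._inner.upsert(ids=ids, documents=documents, metadatas=stamped)

    def __getattr__(self, name):
        # everything else passes through untouched
        return getattr(self._inner, name)
```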
§10 — Cleanup gating
+1 on landing the Chroma direct-import cleanup before backend PRs rebase. The seven files importing chromadb directly would break any pure-Lance deployment. This cleanup benefits all backend authors equally.
§11 — Alignment effort for #574
Assessment is accurate. We already have backends/ with BaseCollection ABC, LanceBackend, and kwargs-only signatures (work from #575, easily backported). Remaining work: explicit param lists on query/get/delete (instead of **kwargs), typed results, $contains/where_document, collection-level model identity, PalaceRef adoption, conformance test subclass, sync injection refactor, and package extraction to mempalace-lance. All small-to-medium. No architectural conflicts.
§12 — Open questions
- `changes_since` with collection filter — yes, useful. Our sync tracks changes per-table; filtering by collection name is natural.
- Per-palace capabilities — worth having. `supports_contains_fast` depending on whether an FTS index exists is a real case for us.
- `run_maintenance` return shape — structured return preferred. LanceDB compaction can report fragment count before/after and bytes reclaimed, useful for operator dashboards and benchmark reporting.
Reviewed the actual RFC diff. From the Postgres / pg_sorted_heap (#665) side, three small spec-clarity points I would consider before backend authors start rebasing:

- The Goals section says migration happens "without data loss," but §8.2 itself only promises lossless migration when the target's capabilities allow it, with re-embedding as the fallback. The goal wording overpromises relative to the body of the spec.
- §1.5 says backends MUST NOT hardcode models and must validate model identity/dimension, but §2.1 / §5 allow server-side embedders to sidestep that validation entirely. They should be bound by the same rules, via some notion of an effective identity.
- §7.3 says benchmarks should run explicit maintenance, but backends have no way to advertise which maintenance kinds they support, so a harness (or operator) is left guessing kind names.

None of these change the shape of the RFC. They are mostly guardrails to keep the spec precise once implementations begin targeting it.
Reviewed from the perspective of a 134K-drawer production deployment on the Chroma backend, with experience evaluating LanceDB (via Karta's embedded LanceDB + SQLite architecture).
No show-stoppers. This is the right time to formalize the backend contract — the six-way divergence is already causing friction. Notes below.
§1.3 — Typed results (migration cost for existing forks/consumers)
This is a breaking change for anyone with code touching mcp_server.py or searcher.py that assumes dict returns. My fork has ~20 changes across those files — all dict-shaped. The migration path is clear (wrap in dataclass constructors, same as dekoza noted), but the blast radius is worth calling out: it's not just backend authors who need to update, it's everyone downstream who consumes query results. A compatibility shim (.to_dict() on the result types) during the transition would make the upgrade gentler for plugin authors and forks.
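A fork-local shim along these lines would cover the transition even if the spec itself doesn't adopt a compatibility method; the field names below are assumptions mirroring Chroma's dict keys, not the RFC's definitions.

```python
from dataclasses import dataclass, asdict

# Hypothetical typed result + fork-side adapter back to the old dict shape.
@dataclass
class QueryResult:
    ids: list
    documents: list
    metadatas: list
    distances: list

def as_chroma_dict(result) -> dict:
    # one-liner bridge for downstream code that still expects dicts
    return asdict(result)
```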
§1.5 — Three-state model identity is exactly right
My palace has 134K drawers with no recorded embedder identity. Hard-failing on upgrade would brick every pre-v1 palace. The unknown → warn → record-on-next-write path is the correct design. One detail: the spec says the identity is recorded "on the next successful write, reindex, or migration." For read-heavy deployments that rarely write new drawers, this means the palace could stay in unknown state indefinitely. Consider also recording on explicit mempalace verify or mempalace status (read-only operations that already touch the collection) so operators can resolve it without forcing a write.
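The three-state check is small enough to sketch; the state names follow this thread, but the function shape is illustrative, not the spec's API.

```python
from typing import Optional

# Sketch of the §1.5 three-state model-identity check on collection open.
def check_model_identity(recorded: Optional[str], configured: str) -> str:
    if recorded is None:
        return "unknown"        # legacy palace: warn, record later
    if recorded == configured:
        return "known_match"
    return "known_mismatch"     # hard error: embeddings are incomparable
```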
§2.5 — Concurrency guarantee matches real usage
The spec says backends can assume single-thread access per collection, with core serializing per palace. This matches how the MCP server actually works today — good call. My fork added threading.Lock to the graph cache (#661) because build_graph() was the one place concurrent access was possible. The per-palace serialization guarantee would have made that unnecessary. Worth documenting this guarantee prominently so backend authors don't over-synchronize.
§7.3 — Benchmark methodology is the most underrated section
At 134K drawers on Chroma, I've hit the HNSW pathology firsthand — the graph's on-disk size and query latency behave very differently before and after compaction, and after external writes that invalidate the in-memory index (the exact problem my stale HNSW mtime detection fix addresses in #663). The three-phase benchmark requirement (post-bulk-load, post-background-maintenance, post-explicit-maintenance) would have caught the performance cliff I found empirically. Strong +1 on maintenance_state() being part of the published numbers.
§8 — Migration at scale
At 134K drawers, re-embedding is not a minor cost — it's hours of compute depending on the model. The lossless vs re-embed distinction and the --accept-re-embed explicit opt-in are the right design. One request: mempalace migrate should report progress (rows migrated / total, ETA) for large palaces. A 134K-drawer migration that runs silently for hours will get killed by impatient operators.
§10 — Cleanup prerequisite will affect fork contributors
My fork's mcp_server.py imports chromadb directly for the BLOB seq_id repair (#664) and mtime-based cache invalidation (#663). Both are deeply Chroma-specific. The cleanup PR routing all callers through BaseCollection is the right gating decision, but it will require Chroma-specific fixups to move behind the backend boundary. Backend-specific maintenance operations (like BLOB repair) might need a BaseBackend.repair() or similar escape hatch — otherwise they end up as out-of-band scripts that bypass the abstraction.
§12 — Open questions
- Per-palace capabilities — yes. `supports_contains_fast` depending on whether an FTS index exists is a real case. I'd add `supports_mtime_detection` or similar for backends where external-write detection is possible (Chroma via SQLite mtime) vs not (server-mode backends where the filesystem isn't local). My #663 fix is Chroma-specific precisely because mtime detection only makes sense for local file-backed stores.
Addresses the actual spec defects flagged in #743 review, ignoring operator-UX asks that are not plugin-contract concerns.

- Goal #3: "without data loss" → mirrors §8.2's capability-conditional lossless-vs-reembed framing. No more overpromise.
- §1.5: `server_embedder` is no longer an implicit escape hatch from identity/dimension rules. Such backends MUST expose an effective identity via `effective_embedder_identity()` and are bound by the same three-state check.
- §7.3: adds a `maintenance_kinds: ClassVar[frozenset[str]]` advertisement mechanism. `run_maintenance(kind)` must raise `UnsupportedMaintenanceKindError` for unadvertised kinds. The benchmark harness reads this set rather than guessing kind names. Reserves `analyze`/`compact`/`reindex` as well-defined names.
- §1.2: adds `update()` as an optional method with a default get+merge+upsert implementation. §2.1: `supports_update` is redefined to gate atomic single-round-trip semantics (not mere capability), since the default impl already supports partial updates.

Operator asks explicitly NOT adopted (diplomatic shims, not contract defects): `.to_dict()` compat on typed results, migration progress reporting, `BaseBackend.repair()` separate from `run_maintenance`, per-palace capability variance, identity recording on read-only ops.
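A minimal sketch of the §7.3 advertisement mechanism: backends publish their supported kinds, and `run_maintenance` refuses anything unadvertised. The maintenance body below is a placeholder, not real backend logic.

```python
from typing import ClassVar

class UnsupportedMaintenanceKindError(Exception):
    pass

class BackendSketch:
    # Advertised set: the harness reads this instead of guessing names.
    maintenance_kinds: ClassVar[frozenset] = frozenset({"compact", "analyze"})

    def run_maintenance(self, kind: str) -> dict:
        if kind not in self.maintenance_kinds:
            raise UnsupportedMaintenanceKindError(kind)
        # structured return so dashboards/benchmarks can consume it
        return {"kind": kind, "bytes_reclaimed": 0}
```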
Thanks @skuznetsov @dekoza @jphein — substantive reviews on all sides. Commit 922aa99 closes the four spec defects the reviews surfaced. A number of reasonable operator asks are explicitly not being adopted; explaining both below, so the boundary is clear for anyone implementing against this.

Adopted — spec defects

1. Goals wording (skuznetsov): Goal 3 now mirrors §8.2 instead of overpromising "without data loss." The spec delivers lossless migration when capabilities allow, re-embedding otherwise. Both are explicit; neither is hidden.
2. Server-side embedders (skuznetsov): `server_embedder` backends must expose an effective identity and are bound by the same three-state identity check as everyone else.
3. Maintenance kinds (skuznetsov): backends advertise `maintenance_kinds`; `run_maintenance` raises on anything unadvertised, so the benchmark harness no longer guesses kind names.
4. `update()` (dekoza): now an optional method with a default get+merge+upsert implementation; `supports_update` gates atomic single-round-trip semantics.

Not adopted — operator comfort, not plugin-contract concerns

These are all reasonable asks, and I want to be explicit that they're not dismissed because they're wrong — they're not adopted because they don't belong in this spec. A well-engineered plugin contract has a small, stable surface. Layering operator-UX conveniences into the core interface makes it worse for every future backend author.

Net: four spec edits, small and targeted. The contract is tighter for it. The rejected items are good ideas for the implementations — CLI progress, fork-friendly shims, richer operator tooling — they just don't live in the plugin contract. #743 is ready for another look.
Thanks, this addresses my Postgres / pg_sorted_heap points.
#757 landed mtime/inode cache invalidation and mempalace_reconnect in mcp_server._get_client(). Both are Chroma-specific (stat of chroma.sqlite3). They should migrate into ChromaBackend.get_collection and ChromaBackend.close_palace during the §10 cleanup so the freshness contract lives inside the backend, not in the caller.
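The freshness check that would move inside the backend can be sketched as below. Only the stat-of-`chroma.sqlite3` idea comes from this thread; the function names and snapshot shape are illustrative.

```python
import os

# Sketch: record (mtime_ns, inode) of chroma.sqlite3 when a client is
# opened; a mismatch later means an external write or file replacement.
def snapshot(db_path):
    st = os.stat(db_path)
    return (st.st_mtime_ns, st.st_ino)

def is_stale(db_path, recorded) -> bool:
    return snapshot(db_path) != recorded
```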
Prerequisite for RFC 001 (plugin spec, #743). Removes every direct `import chromadb` outside the ChromaDB backend itself so the core modules depend only on the backend abstraction layer. Extends ChromaBackend with make_client, get_or_create_collection, delete_collection, create_collection, and backend_version. Adds update() to the BaseCollection contract. Non-backend callers (mcp_server, dedup, repair, migrate, cli) now go through the abstraction; tests patch ChromaBackend instead of chromadb. With this landed, the RFC 001 spec can be enforced and PalaceStore (#643) can ship as a plugin without touching core modules.
Quick coordination question from the PostgreSQL / pg_sorted_heap backend side: Is RFC 001 now stable enough to use as the target contract for reworking #665, or would you prefer that implementation PRs wait until #743 is merged and the §10 backend cleanup follow-up starts/lands? From my side, the expected #665 rewrite would target the RFC shape directly:
I’m asking to avoid rebasing the current conflicting branch into an intermediate shape if the intended path is now RFC-first.
Read the full spec. From the multi-tenant hosted side (#697), this looks solid. Works for us.

Questions

1. Namespace isolation — security boundary or naming convention? For us, namespace is a tenant isolation boundary. When a user searches across personal + team vaults, the sidecar resolves which namespaces to query based on a gateway-authorized team_ids list. A team namespace that isn't in that list gets rejected. §4.4 says "the backend uses it as given" — should backends also enforce namespace isolation, or is that purely the caller's job?
2. PalaceRef.id uniqueness scope. Unique globally or per backend instance?
3. Typed results (§1.3) — hard cutover? Is there a transition period, or do all callers need to migrate from Chroma dicts in one go? We have 30 MCP tool functions consuming dict shapes.
4. §7.4 NAMESPACE_MEMPALACE. Worth assigning now — the placeholder blocks Qdrant PRs.
5. §10 mcp_server caching. We hit a related bug (collection cache bleed between tenants). Moving caching behind the backend boundary, per §10, would address that class of bug.

#697 alignment: our `collection_prefix` concern dissolves into `PalaceRef.namespace`, as §11 notes.
Summary
- Third-party backends ship as `pip install mempalace-<name>` packages that drop into core without patches.

Tracking issue: #737 (see discussion and follow-up comments for the design rationale).
Key decisions in the draft
- Entry-point discovery via `mempalace.backends`; `pip install` is sufficient to add a backend.
- Typed `QueryResult`/`GetResult` dataclasses replace Chroma's dict shape from day one. `ChromaCollection` gets a thin adapter.
- `PalaceRef(id, local_path?, namespace?)` replaces `palace_path: str`. Backend instances are long-lived and multi-palace; thread-safe across palaces.
- Required filter dialect: `$eq, $ne, $in, $nin, $and, $or, $contains`; unknown operators MUST raise `UnsupportedFilterError` (silent drop forbidden).
- Backends persist `model_name` and raise `EmbedderIdentityMismatchError` on swap (not just dimension).
- Capability tokens: `supports_*` naming, free-form strings, extensible by third parties.
- `AbstractBackendContractSuite` pytest mixin + optional parametrized run of the core suite across all backends under `MEMPALACE_TEST_ALL_BACKENDS=1`.
- `maintenance_state()`/`run_maintenance(kind)`; published numbers must cover three phases (post-load, post-native-maintenance, post-explicit-maintenance).
- `mempalace migrate` + `mempalace verify` operating through `BaseCollection` only — no per-backend migration code.

Impact on in-flight PRs
Each of #574, #643, #665, #697, #700, #381 is called out in §11 with the specific alignment work required — #574 is closest to the final shape, #697's `collection_prefix` concern dissolves into `PalaceRef.namespace`, and #700 and #381 need the canonical UUID5 namespace.

Gating cleanup (not in this PR)
Seven files still import `chromadb` directly (`repair.py`, `dedup.py`, `cli.py` ×2, `mcp_server.py`, `migrate.py`, plus an instructions doc). Combined with the dict-to-typed-result migration, this needs its own PR landing before the spec can be enforced. Called out in §10.

Test plan
- Open questions tracked (`changes_since` filter, per-palace capability query, `run_maintenance` return shape)
- `NAMESPACE_MEMPALACE` UUID assigned (§7.4)
- `AbstractBackendContractSuite` + entry-point discovery + `PalaceRouter` + `PalaceRef`