feat(synapse): Advanced Retrieval — MMR, Pinned Memory, Query Expansion, Supersede Detection, Consolidation Engine (#595) by matrix9neonebuchadnezzar2199-sketch · Pull Request #596 · MemPalace/mempalace

matrix9neonebuchadnezzar2199-sketch · 2026-04-11T03:35:10Z

Implements RFC #595 — five new retrieval phases for Synapse.

What this adds

| Phase | Feature | What it does |
|---|---|---|
| 5 | MMR | Deduplicates results using Maximal Marginal Relevance, balancing relevance against diversity |
| 6 | Pinned Memory | LTP-based session context: surfaces the most-referenced drawers at session start |
| 7 | Query Expansion | Broadens search using past query logs; no LLM, fully local |
| 8 | Supersede Detection | Identifies outdated drawers via similarity plus time gap; filter or annotate mode |
| 9 | Consolidation Engine | Merges related drawers into caller-provided summaries; reversible soft-archive |

Pipeline

Session start:
  mempalace_session_context → Pinned + Supersede suggestions

Search:
  Query → [Expansion] → ChromaDB → Synapse scoring → [Supersede] → [Consolidation] → [MMR] → Results

Each [step] is independently toggleable via RetrievalProfile.

Design principles

Opt-in, default off. All features gated by profile flags. Existing behavior unchanged.
No external API. Zero network calls. Uses existing ChromaDB vectors and Synapse logs.
No data destruction. Drawers are never deleted. Consolidation is reversible.
Observable. Every step reports what it did in response metadata.

New MCP tools

mempalace_session_context — pinned memories + maintenance suggestions (session startup)
mempalace_supersede_check — palace-wide supersede candidate scan
mempalace_consolidate — merge drawers into summary (caller provides summary text)

Observability examples

MMR:

{"synapse_mmr": {"applied": true, "candidates_before_mmr": 25, "candidates_after_mmr": 10, "dropped_as_redundant": 15}}

Supersede:

{"synapse_supersede": {"checked": true, "superseded_filtered": 1, "detail": [{"filtered_id": "d_003", "replaced_by_id": "d_028", "similarity": 0.91, "age_gap_days": 33}]}}

Query Expansion:

{"synapse_query_expansion": {"applied": true, "original_query": "auth design", "expansion_terms": ["OAuth", "JWT", "PKCE"], "results_from_expansion": 4}}

Files changed

mempalace/synapse.py — MMR, pinned, expansion, supersede, consolidation logic
mempalace/synapse_profiles.py — new defaults, validation rules
mempalace/searcher.py — full pipeline integration
mempalace/mcp_server.py — 3 new MCP tools + schema
tests/test_synapse_advanced.py — 38 new tests

Tests

| Suite | Result |
|---|---|
| `test_synapse_advanced.py` | 38 passed |
| `test_synapse_profiles.py` + `test_synapse.py` | 71 passed |
| Full suite | 639 passed, 4 failed (known Windows cp932) |

Depends on

PR #451, "feat: Synapse Phase 1 — biologically-inspired memory scoring layer (#441)" (Synapse Phase 1-3): scoring layer
PR #519, "feat: RetrievalProfile — named Synapse config profiles with inheritance (#489)" (RetrievalProfile): profile system

Profile example

{
  "orient": {
    "mmr_enabled": true, "mmr_lambda": 0.5, "mmr_final_k": 10,
    "query_expansion_enabled": true, "query_expansion_max_terms": 5,
    "supersede_filter_enabled": true, "supersede_action": "filter"
  },
  "decide": {
    "mmr_enabled": true, "mmr_lambda": 0.85, "mmr_final_k": 3,
    "supersede_filter_enabled": true, "supersede_action": "filter",
    "pinned_max_tokens": 1000
  }
}
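The "opt-in, default off" principle can be pictured as a profile overlaid on all-off defaults. This is a minimal sketch, not the code in synapse_profiles.py: `DEFAULTS`, `PROFILES`, `resolve_profile`, the fallback for unknown profile names, and the range check on `mmr_lambda` are all hypothetical, and only the flag names are taken from the example above.

```python
# Hypothetical sketch: overlay a named profile on all-off defaults.
DEFAULTS = {
    "mmr_enabled": False,
    "query_expansion_enabled": False,
    "supersede_filter_enabled": False,
}

PROFILES = {
    "orient": {"mmr_enabled": True, "mmr_lambda": 0.5, "mmr_final_k": 10},
    "decide": {"mmr_enabled": True, "mmr_lambda": 0.85, "mmr_final_k": 3},
}


def resolve_profile(name: str) -> dict:
    """Return the defaults overlaid with the named profile's overrides."""
    # Assumption: an unknown profile name falls back to the all-off defaults.
    resolved = {**DEFAULTS, **PROFILES.get(name, {})}
    # Assumed validation rule: lambda must lie in [0, 1] when present.
    lam = resolved.get("mmr_lambda")
    if lam is not None and not 0.0 <= lam <= 1.0:
        raise ValueError(f"mmr_lambda out of range: {lam}")
    return resolved


orient = resolve_profile("orient")
unknown = resolve_profile("no_such_profile")
```

In this sketch a flag the profile does not mention stays off, so a caller that never selects a profile sees the existing behavior.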

… consolidation, pipeline trace (MemPalace#596)

- Pinned Memory: retrieval_spread factor boosts drawers referenced from diverse query contexts over single-context high-frequency drawers; pinning_score = ltp_score * (1 + 0.2 * spread)
- Consolidation: Evaluate mode nests source drawers as metadata inside the consolidated drawer instead of returning them as top-level results
- Pipeline trace: synapse_pipeline summary in every response with phases_applied, phases_skipped, total_candidates_in/out, elapsed_ms
- 10 new tests (48 total in test_synapse_advanced.py)
- Addresses @web3guru888 review feedback on MemPalace#595

Made-with: Cursor

web3guru888

Good implementation that delivers everything promised in RFC #595. The five phases are cleanly separated and all observability metadata is present. A few notes from reading the code:

MMR (Phase 5): The lambda_param >= 0.999 short-circuit for pure relevance ranking is a clean optimization — avoids the pairwise loop entirely. The _get_emb() fallback to _rel_score() when embeddings are unavailable is important for graceful degradation. Implementation matches the RFC spec exactly.
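To make the short-circuit concrete, here is a minimal greedy MMR sketch. It is not the code in synapse.py: `mmr_select`, `cosine`, and the candidate dict layout are hypothetical, and the embedding fallback mentioned above is left out. Only the `lambda_param >= 0.999` shortcut is taken from the review.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(candidates, lambda_param=0.5, final_k=10):
    """candidates: list of dicts with 'id', 'score', and 'embedding' keys."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    # Pure-relevance short-circuit: skip the pairwise loop entirely.
    if lambda_param >= 0.999:
        return ranked[:final_k]
    selected, remaining = [], list(ranked)
    while remaining and len(selected) < final_k:
        def mmr_score(c):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(c["embedding"], s["embedding"]) for s in selected),
                default=0.0,
            )
            return lambda_param * c["score"] - (1 - lambda_param) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


docs = [
    {"id": "a", "score": 0.95, "embedding": [1.0, 0.0]},
    {"id": "b", "score": 0.94, "embedding": [1.0, 0.01]},  # near-duplicate of a
    {"id": "c", "score": 0.80, "embedding": [0.0, 1.0]},
]
picked = [d["id"] for d in mmr_select(docs, lambda_param=0.5, final_k=2)]
```

With `lambda_param=0.5` the near-duplicate `b` loses to the less relevant but distinct `c`; with `lambda_param=1.0` the function returns the plain relevance order.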

Pinned Memory (Phase 6): The retrieval_spread factor incorporated here (pinning_score = ltp_score * (1 + 0.2 * spread)) was a post-RFC design update from the #595 thread — good to see it landed in the implementation. The pinned_reason field in observability includes spread value, which makes the pinning decision auditable.

Query Expansion (Phase 7): The implementation is correct but there's a scalability concern worth addressing before merge: expand_query() fetches the entire query_log table with SELECT query_text, query_embedding, result_ids, result_scores FROM query_log — no WHERE clause, no LIMIT. The idx_query_log_timestamp index exists but is unused. For palaces with months of usage this becomes a full table scan on every search. Suggest adding a configurable lookback:

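from datetime import datetime, timedelta, timezone

# Only scan rows newer than the lookback window so the timestamp index can be used.
cutoff = (datetime.now(timezone.utc) - timedelta(days=query_expansion_lookback_days)).isoformat()
cur = conn.execute(
    "SELECT query_text, query_embedding, result_ids, result_scores "
    "FROM query_log WHERE timestamp > ? ORDER BY timestamp DESC",
    (cutoff,),
)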

A 30-60 day default covers most useful co-occurrence data while keeping the scan O(recent queries) rather than O(all-time queries).
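The effect of the lookback window can be tried against an in-memory SQLite database. This is a sketch under stated assumptions: the two-column `query_log` schema is a simplification of the real table, timestamps are assumed to be stored as UTC ISO-8601 strings, and `recent_queries` is a hypothetical helper. Only the table name, the index name, and the WHERE/ORDER BY clause come from the review.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE query_log (query_text TEXT, timestamp TEXT)")
conn.execute("CREATE INDEX idx_query_log_timestamp ON query_log (timestamp)")

now = datetime.now(timezone.utc)
rows = [
    ("auth design", (now - timedelta(days=5)).isoformat()),
    ("jwt rotation", (now - timedelta(days=45)).isoformat()),
    ("old schema notes", (now - timedelta(days=400)).isoformat()),
]
conn.executemany("INSERT INTO query_log VALUES (?, ?)", rows)


def recent_queries(conn, lookback_days):
    """Return query texts newer than the cutoff, most recent first."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=lookback_days)).isoformat()
    cur = conn.execute(
        "SELECT query_text FROM query_log WHERE timestamp > ? ORDER BY timestamp DESC",
        (cutoff,),
    )
    return [r[0] for r in cur.fetchall()]


within_60 = recent_queries(conn, 60)
within_14 = recent_queries(conn, 14)

# Inspect the plan to see whether SQLite chose the timestamp index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT query_text FROM query_log WHERE timestamp > ?",
    ("2026-01-01",),
).fetchall()
```

Printing `plan` shows which access path SQLite picked. The string comparison is only valid because every timestamp is written in the same ISO format and time zone.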

Supersede Detection (Phase 8): The pairwise similarity computation in detect_superseded() is O(n²) over all drawers in the wing. For a wing of 500 drawers that is about 125,000 similarity computations (n(n-1)/2 = 124,750 pairs), and the count grows quadratically from there. The detect_superseded_palace_wide() variant would hit this harder because it covers the whole palace. The current approach is fine for most use cases; the scale constraint is just worth documenting.
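The pairwise scan and its cost can be sketched as follows. This is an illustration, not detect_superseded() itself: the function name, the thresholds `min_similarity` and `min_gap_days`, and the drawer dict layout are assumptions. The output keys mirror the `synapse_supersede` detail shown in the PR description.

```python
import math
from datetime import date
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def detect_superseded_sketch(drawers, min_similarity=0.9, min_gap_days=14):
    """Return (details, comparisons); thresholds are illustrative only."""
    details, comparisons = [], 0
    for x, y in combinations(drawers, 2):  # every unordered pair: n(n-1)/2
        comparisons += 1
        sim = cosine(x["embedding"], y["embedding"])
        older, newer = sorted((x, y), key=lambda d: d["date"])
        gap = (newer["date"] - older["date"]).days
        if sim >= min_similarity and gap >= min_gap_days:
            details.append({
                "filtered_id": older["id"],
                "replaced_by_id": newer["id"],
                "similarity": round(sim, 2),
                "age_gap_days": gap,
            })
    return details, comparisons


drawers = [
    {"id": "d_003", "date": date(2026, 2, 15), "embedding": [0.9, 0.1]},
    {"id": "d_028", "date": date(2026, 3, 20), "embedding": [0.88, 0.15]},
    {"id": "d_040", "date": date(2026, 3, 22), "embedding": [0.0, 1.0]},
]
details, comparisons = detect_superseded_sketch(drawers)
pairs_for_500 = math.comb(500, 2)  # pair count for a 500-drawer wing
```

Three drawers need three comparisons, while `math.comb(500, 2)` gives 124,750 for a 500-drawer wing, which is the figure behind the scale note above.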

Consolidation (Phase 9): The collection.add() for the consolidated drawer uses a uuid.uuid4() prefixed ID — good. The source_drawers field stores drawer IDs as JSON in metadata, which is the right approach for ChromaDB's metadata constraints. The bidirectional reference (consolidated_into on sources, source_drawers on consolidated) is present and complete.
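The bidirectional reference can be illustrated with plain dictionaries standing in for the collection. This is a sketch, not the PR's code: `consolidate_sketch`, the `consolidated_` ID prefix, and the `store` layout are hypothetical. The field names `source_drawers` and `consolidated_into` and the JSON encoding come from the review.

```python
import json
import uuid


def consolidate_sketch(store, source_ids, summary_text):
    """store: dict mapping drawer_id -> {'document': str, 'metadata': dict}."""
    # Hypothetical ID scheme: a fixed prefix plus a random UUID.
    new_id = f"consolidated_{uuid.uuid4().hex}"
    store[new_id] = {
        "document": summary_text,  # the summary text is supplied by the caller
        # The list is JSON-encoded so the metadata value stays a plain string.
        "metadata": {"source_drawers": json.dumps(source_ids)},
    }
    for sid in source_ids:
        # Soft-archive: mark the source with a back-reference, never delete it.
        store[sid]["metadata"]["consolidated_into"] = new_id
    return new_id


store = {
    "d_003": {"document": "Considering PostgreSQL", "metadata": {}},
    "d_028": {"document": "Decided on SQLite", "metadata": {}},
}
new_id = consolidate_sketch(store, ["d_003", "d_028"], "Chose SQLite over PostgreSQL")
sources = json.loads(store[new_id]["metadata"]["source_drawers"])
```

Because both directions are stored, the sources can be found from the summary and the summary from any source, and the original drawers remain in the store untouched.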

Pipeline trace: synapse_pipeline with phases_applied, phases_skipped, elapsed_ms, and profile_used is correctly wired in searcher.py and surfaced through mcp_server.py. This was the one cross-cutting ask from the RFC review thread and it's implemented cleanly.

Tests: 48 tests in test_synapse_advanced.py is solid coverage. The expansion query log unbounded scan concern above is the main actionable item. Everything else is production-ready.

matrix9neonebuchadnezzar2199-sketch · 2026-04-11T03:54:54Z

@web3guru888 All three feedback items implemented in the latest push (80df59a).

1. Pinned Memory — retrieval spread:

{
  "drawer_id": "d_00142",
  "ltp_score": 4.2,
  "retrieval_spread": 4,
  "pinning_score": 7.56,
  "pinned_reason": "ltp_score=4.20, spread=4, pinning_score=7.56 (rank 1 of 10)"
}

pinning_score = ltp_score × (1 + 0.2 × min(spread, 10)). A drawer retrieved from 4 different query contexts now outranks a drawer retrieved 37 times from the same context.
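The formula is small enough to check by hand. A minimal sketch, where the function name and the `cap` parameter are illustrative rather than taken from the codebase:

```python
def pinning_score(ltp_score: float, spread: int, cap: int = 10) -> float:
    """pinning_score = ltp_score * (1 + 0.2 * min(spread, cap))."""
    return ltp_score * (1 + 0.2 * min(spread, cap))


example = pinning_score(4.2, 4)          # the d_00142 values from the JSON above
single_context = pinning_score(4.2, 1)   # same LTP, one query context
capped = pinning_score(1.0, 50)          # spread beyond the cap adds nothing
```

For the example drawer this gives 4.2 × 1.8 = 7.56, matching the reported `pinning_score`. Whether one drawer outranks another still depends on both `ltp_score` values; the spread factor only scales them by a factor between 1 and 3.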

2. Consolidation — source nesting (Evaluate mode):

{
  "id": "d_consolidated_001",
  "synapse_consolidation": {
    "is_consolidated": true,
    "source_count": 2,
    "sources": [
      {"id": "d_003", "title": "Considering PostgreSQL", "date": "2026-02-15"},
      {"id": "d_028", "title": "Decided on SQLite", "date": "2026-03-20"}
    ]
  }
}

Sources are nested metadata, not competing top-level results. Orient sees the summary only; Evaluate sees the summary with expandable sources.
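One way to picture the per-mode contract is a small view function. This is a hedged sketch: `present_result`, the `summary` key, and the choice to keep `source_count` visible in Orient are assumptions, not behavior confirmed by the PR.

```python
def present_result(drawer: dict, mode: str) -> dict:
    """Build the result view for a profile mode ('orient' or 'evaluate')."""
    view = {"id": drawer["id"], "summary": drawer["summary"]}
    sources = drawer.get("sources", [])
    if sources:
        info = {"is_consolidated": True, "source_count": len(sources)}
        if mode == "evaluate":
            # Only Evaluate carries the expandable source list.
            info["sources"] = sources
        view["synapse_consolidation"] = info
    return view


drawer = {
    "id": "d_consolidated_001",
    "summary": "Chose SQLite over PostgreSQL",
    "sources": [
        {"id": "d_003", "title": "Considering PostgreSQL", "date": "2026-02-15"},
        {"id": "d_028", "title": "Decided on SQLite", "date": "2026-03-20"},
    ],
}
orient_view = present_result(drawer, "orient")
evaluate_view = present_result(drawer, "evaluate")
```

Either way the sources never appear as separate top-level results, so they cannot compete with the summary for ranking slots.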

3. Pipeline trace:

{
  "synapse_pipeline": {
    "phases_applied": ["query_expansion", "supersede_filter", "mmr"],
    "phases_skipped": ["consolidation"],
    "total_candidates_in": 42,
    "total_results_out": 8,
    "profile_used": "orient",
    "elapsed_ms": 12.3
  }
}

Top-level diagnostic before diving into per-phase detail. Not present when synapse_enabled=false.
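A trace like this falls out naturally when each phase is gated by a profile flag. The sketch below is an assumption-laden illustration, not the searcher.py implementation: `run_pipeline`, the phase tuples, the `consolidation_enabled` flag name, and the stand-in phase functions are hypothetical, and `top_k` is a plain truncation standing in for MMR.

```python
import time


def run_pipeline(candidates, profile, phases):
    """phases: ordered list of (name, flag_key, fn) tuples."""
    start = time.perf_counter()
    applied, skipped = [], []
    results = list(candidates)
    for name, flag_key, fn in phases:
        if profile.get(flag_key, False):  # opt-in: a missing flag means off
            results = fn(results, profile)
            applied.append(name)
        else:
            skipped.append(name)
    trace = {
        "phases_applied": applied,
        "phases_skipped": skipped,
        "total_candidates_in": len(candidates),
        "total_results_out": len(results),
        "profile_used": profile.get("name", "default"),
        "elapsed_ms": round((time.perf_counter() - start) * 1000, 3),
    }
    return results, {"synapse_pipeline": trace}


def drop_superseded(results, profile):
    return [r for r in results if not r.get("superseded")]


def top_k(results, profile):
    # Stand-in for MMR: simple truncation to mmr_final_k.
    return results[: profile.get("mmr_final_k", 10)]


phases = [
    ("supersede_filter", "supersede_filter_enabled", drop_superseded),
    ("consolidation", "consolidation_enabled", lambda r, p: r),
    ("mmr", "mmr_enabled", top_k),
]
profile = {"name": "orient", "supersede_filter_enabled": True,
           "mmr_enabled": True, "mmr_final_k": 2}
candidates = [{"id": "d_003", "superseded": True}, {"id": "d_028"},
              {"id": "d_030"}, {"id": "d_031"}]
results, meta = run_pipeline(candidates, profile, phases)
```

Here four candidates go in, the superseded drawer is filtered, truncation keeps two, and `consolidation` lands in `phases_skipped` because its flag is absent from the profile.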

48 advanced tests, 119 synapse total, 649 overall passing.

- query_expansion_lookback_days (default 60) limits query_log scan
- Uses idx_query_log_timestamp index via WHERE timestamp > cutoff
- Prevents unbounded table scan for users with months of search history
- 2 new tests (50 total in test_synapse_advanced.py)
- Addresses @web3guru888 performance observation

Made-with: Cursor

matrix9neonebuchadnezzar2199-sketch · 2026-04-11T04:03:39Z

@web3guru888 Good catch — fixed in f621534.

expand_query() now takes query_expansion_lookback_days (default 60, profile-configurable). The scan uses WHERE timestamp > ? ORDER BY timestamp DESC against idx_query_log_timestamp.

{
  "orient": { "query_expansion_lookback_days": 90 },
  "decide": { "query_expansion_lookback_days": 14 }
}

50 advanced tests, 651 total passing. Ready for your OODA workload test whenever you want to run it.

web3guru888 · 2026-04-11T04:03:47Z

@matrix9neonebuchadnezzar2199-sketch — all three items confirmed in 80df59a.

The query_expansion_lookback_days parameter as a configurable window is the right call. Bounded scan with the timestamp index makes the feature scale-safe from day one rather than needing a follow-up optimization after the first complaint from a power user.

A few checks on the updated implementation:

- The min(spread, 10) cap in the pinning score formula is good: it prevents runaway scores from extremely cross-referenced drawers.
- Source nesting in Evaluate mode (consolidated summary first, sources as expandable metadata) is the correct retrieval contract.
- elapsed_ms in the pipeline trace is a float, which gives good precision for short-circuit paths where MMR is skipped.

639 tests, 48 advanced — solid foundation for the five phases. Updating my review to APPROVE. The query log lookback was the last open item and it's resolved.

web3guru888

All feedback items from my initial review and the #595 RFC thread are implemented in 80df59a. The query expansion lookback bounds the scan, the retrieval spread factor is part of pinned memory scoring, sources are nested in Evaluate consolidation, and the pipeline trace is present throughout. 639 tests passing. LGTM.

…on, Supersede Detection, Consolidation Engine (MemPalace#595)

Phases 1-9 of Synapse scoring and retrieval pipeline:

- Phase 1-3: LTP scoring, synaptic tagging, association, consolidation
- Phase 4: RetrievalProfile with OODA profiles, source annotation, validation
- Phase 5: MMR post-processor for deduplication
- Phase 6: Pinned Memory with retrieval spread factor
- Phase 7: Query Expansion from search logs with lookback window
- Phase 8: Supersede Detection with per-profile thresholds
- Phase 9: Consolidation Engine with source nesting

New MCP tools: mempalace_session_context, mempalace_supersede_check, mempalace_consolidate
Pipeline trace observability (synapse_pipeline in response metadata)
All features opt-in via RetrievalProfile flags, default off
50 new tests in test_synapse_advanced.py, 119 synapse tests total

Made-with: Cursor

matrix9neonebuchadnezzar2199-sketch · 2026-04-14T11:47:14Z

Rebased and squashed onto current develop (post-#852 ChromaBackend refactor). 18 commits → 1 clean commit (ef298fb). All conflicts resolved. Tests: 121 Synapse tests passed, 984 total passed. Ready for review.

matrix9neonebuchadnezzar2199-sketch · 2026-04-14T12:10:26Z

Rebased and squashed onto current develop (post-#852 ChromaBackend refactor). 18 commits → clean history. All conflicts resolved — kept upstream ChromaBackend/WAL changes, integrated Synapse pipeline on top.

Tests: 121 Synapse tests passed (test_synapse_advanced.py 50, test_synapse_profiles.py + test_synapse.py 71), full suite green. CI: 6/6 checks passed.

@bensig @milla-jovovich Ready for review and merge.

matrix9neonebuchadnezzar2199-sketch requested review from bensig and milla-jovovich as code owners April 11, 2026 03:35

matrix9neonebuchadnezzar2199-sketch mentioned this pull request Apr 11, 2026

[RFC] Synapse Advanced Retrieval — MMR, Pinned Memory, Query Expansion, Supersede Detection, Consolidation Engine #595

Open

web3guru888 reviewed Apr 11, 2026

web3guru888 approved these changes Apr 11, 2026

bensig changed the base branch from main to develop April 11, 2026 22:21

bensig requested a review from igorls as a code owner April 11, 2026 22:21

Kesshite mentioned this pull request Apr 13, 2026

feat: mempalace_explain — entity-aware search with KG enrichment (decision archaeology) #822

Open

igorls added the labels area/cli (CLI commands), area/mcp (MCP server and tools), area/mining (File and conversation mining), area/search (Search and retrieval), and enhancement (New feature or request) Apr 14, 2026

matrix9neonebuchadnezzar2199-sketch force-pushed the feat/synapse-advanced-retrieval branch from f621534 to ef298fb April 14, 2026 11:44

matrix9neonebuchadnezzar2199-sketch added 3 commits April 14, 2026 20:50

- 14e8346 style: fix ruff lint (unused import, F841, C901 noqa for search_memories). Made-with: Cursor
- 9ec2f3a style: ruff format for CI (ruff format --check .). Made-with: Cursor
- 159fe28 style: ruff format remaining test files (ruff 0.4 CI). Made-with: Cursor

matrix9neonebuchadnezzar2199-sketch added 2 commits April 15, 2026 20:40

- 6b58055 merge: resolve conflicts with upstream develop (README, mcp_server, searcher). Made-with: Cursor
- 3b9353d docs: add Synapse tools to mcp-tools.md (session_context, supersede_check, consolidate). Made-with: Cursor

matrix9neonebuchadnezzar2199-sketch mentioned this pull request Apr 15, 2026

RFC: Synapse Phase 10–14 — Model Guard, Cross-Wing Balancing, Score Explainability, Adaptive Compaction, Paginated Scoring #914

Open

matrix9neonebuchadnezzar2199-sketch wants to merge 6 commits into MemPalace:develop from matrix9neonebuchadnezzar2199-sketch:feat/synapse-advanced-retrieval

3 participants