Skip to content

feat: Support for vector search with Qdrant#381

Open
Anush008 wants to merge 7 commits intoMemPalace:developfrom
Anush008:main
Open

feat: Support for vector search with Qdrant#381
Anush008 wants to merge 7 commits intoMemPalace:developfrom
Anush008:main

Conversation

@Anush008
Copy link
Copy Markdown

@Anush008 Anush008 commented Apr 9, 2026

Hi 👋

Description

This PR adds support for using Qdrant as a vector search provider in MemPalace.

Qdrant is an open-source vector search engine for high performance and massive scale.

Changes

  • Replaced the current inline Chroma usage with a common interface for vector search backends.
  • Added concrete implementations using Chroma and Qdrant.
  • The Qdrant client dependency is marked optional, keeping Chroma as the default backend (current behaviour remains unchanged).

Testing

I've unit tested this integration against a local Qdrant instance.

Setup

You can run Qdrant with

docker run -p 6333:6333 qdrant/qdrant

Set

MEMPALACE_VECTOR_BACKEND=qdrant
MEMPALACE_QDRANT_URL=http://localhost:6333/

The dashboard is accessible at http://localhost:6333/dashboard

Checklist

  • Tests pass (python -m pytest tests/ -v)
  • No hardcoded paths
  • Linter passes (ruff check .)

@DanyLeeTW
Copy link
Copy Markdown

Comprehensive Code Review Available

A detailed review of this PR has been conducted and is available at: docs/pr-reviews/pr-381-qdrant-vector-search.md

📊 Review Summary

Architecture: ⭐⭐⭐⭐½ (4.5/5)
Code Quality: ⭐⭐⭐⭐ (4/5)
Test Coverage: ⭐⭐ (2/5) - Needs improvement
Overall: ⭐⭐⭐½ (3.5/5) - Good with room for improvement

🎯 Key Findings

Strengths:

  • ✅ Clean Protocol-based architecture
  • ✅ Excellent backward compatibility
  • ✅ Flexible configuration system
  • ✅ Proper optional dependency handling

Critical Issues (Must Fix Before Merge):

  1. Missing Error Handling (HIGH priority)

    • No helpful ImportError when qdrant-client not installed
    • Users get cryptic errors instead of install guidance
  2. Global State (MEDIUM priority)

    • ChromaDB backend uses module-level globals
    • Causes thread safety issues and testing difficulties
  3. Missing Test Coverage (MEDIUM priority)

    • No tests for Qdrant backend
    • No backend switching tests
    • Missing configuration validation tests

📋 Recommended Actions

P0 - Must Fix (2-3 hours):

  • Add ImportError handling with install instructions
  • Add configuration validation (qdrant_url required)
  • Add basic backend tests

P1 - Should Fix (4-6 hours):

  • Refactor to class-based ChromaBackend
  • Add comprehensive test coverage
  • Document embedding model compatibility

🔗 Full Analysis

The complete 508-line review includes:

  • Detailed code analysis with line numbers
  • Implementation code examples
  • Test templates
  • Migration guides
  • Rollback procedures

Location: docs/pr-reviews/pr-381-qdrant-vector-search.md

Estimated Effort: 7-11 hours to merge-ready state


Review conducted using comprehensive PR analysis methodology. All findings documented with actionable recommendations.

@Anush008
Copy link
Copy Markdown
Author

Anush008 commented Apr 9, 2026

With regards to the AI recommended actions above,

  • The Chroma implementation is kept as is. It doesn't need to be wrapped in another class.
  • Import errors are indeed handled gracefully.
  • I've not added test code since there isn't a test suite set up for the other components. I've tested the vector search integration locally. Thoroughly.

Copy link
Copy Markdown

@web3guru888 web3guru888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice abstraction layer. The VectorCollection protocol in backends/__init__.py is the right approach — it lets integrators swap backends without touching application code.

One thing I noticed: the Qdrant path returns a proper QdrantCollection class that explicitly implements VectorCollection, but the ChromaDB path returns native ChromaDB collection objects that satisfy the protocol through duck typing. This works today, but it's fragile — if a future ChromaDB release renames or reorders parameters on .query() or .get(), the protocol match breaks silently (no isinstance check, no type error at construction time).

Might be worth wrapping Chroma in a thin adapter class too (like ChromaCollection(VectorCollection)), even if it's just delegation. That way both backends fail the same way if the contract drifts.

Also: The _scroll_cursor state on QdrantCollection makes .get() with pagination stateful at the instance level. If two callers paginate the same collection concurrently (e.g., MCP handler + CLI status), they'll clobber each other's cursor. The ChromaDB path doesn't have this issue because offset/limit are stateless. Consider making scroll state caller-managed or using a separate iterator pattern.

@Anush008
Copy link
Copy Markdown
Author

the ChromaDB path returns native ChromaDB collection objects that satisfy the protocol through duck typing. This works today, but it's fragile — if a future ChromaDB release renames or reorders parameters on .query() or .get(), the protocol match breaks

It's highly unlikely that Chroma would do this without a major release. If they did, even wrapping it in ChromaCollection(VectorCollection) won't save us. It'll break too.

@bensig bensig changed the base branch from main to develop April 11, 2026 22:22
igorls added a commit that referenced this pull request Apr 12, 2026
Formalizes the BaseCollection/BaseBackend contract introduced as a seam
in #413 into an interchangeability spec that third-party backends can
build to. Driven by six in-flight backend PRs (#574, #643, #665, #697,
#700, #381) each implementing the interface differently.

Key decisions captured: entry-point distribution, typed QueryResult/
GetResult replacing Chroma dict shape, daemon-first multi-palace model
via PalaceRef, required where-clause subset (incl. $contains),
mandatory embedder injection with model-identity validation, capability
tokens, shared pytest conformance suite, and a backend-neutral
migrate/verify CLI.
@igorls igorls added area/cli CLI commands area/install pip/uv/pipx/plugin install and packaging area/mcp MCP server and tools area/mining File and conversation mining area/search Search and retrieval enhancement New feature or request storage labels Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cli CLI commands area/install pip/uv/pipx/plugin install and packaging area/mcp MCP server and tools area/mining File and conversation mining area/search Search and retrieval enhancement New feature or request storage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants