PyVector - Lightweight MCP Vector Database

A lightweight vector database implementation with Model Context Protocol (MCP) server support, designed for local LLM applications. PyVector works with minimal dependencies and provides fallback implementations when optional dependencies are not available.

Features

Fast similarity search using FAISS indexing (with NumPy fallback)
Text embedding generation with sentence transformers (with hash-based fallback)
HTTP server for easy integration with local applications
MCP server integration for seamless LLM tool use (optional)
Multiple index types (flat, IVF, HNSW) for different performance needs
Persistent storage with save/load functionality
Metadata support for rich document storage
Minimal dependencies - works with just NumPy and Pydantic

Installation

# Activate your conda environment
conda activate pyvector

# Install minimal version (NumPy + Pydantic only)
pip install -e .

# Install with full features (recommended)
# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install -e ".[full]"

# Install specific features
pip install -e ".[embeddings]"  # Add sentence-transformers
pip install -e ".[faiss]"       # Add FAISS indexing

Quick Start

MCP Server (Recommended)

Start PyVector as an MCP server for LLM integration:

# Start the MCP server
python start_mcp_server.py

# The script will display connection details like:
# Add this to your MCP client configuration:
# {
#   "mcpServers": {
#     "pyvector": {
#       "command": "python",
#       "args": ["start_mcp_server.py"],
#       "cwd": "/path/to/pyvector"
#     }
#   }
# }
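
The "cwd" value in the configuration above has to point at your local checkout. As a minimal sketch, the snippet below prints the same configuration with an absolute path filled in. The helper name build_mcp_config is hypothetical and not part of PyVector; it only mirrors the block that start_mcp_server.py is described as displaying.

```python
import json
from pathlib import Path


def build_mcp_config(repo_dir: str) -> dict:
    """Return an MCP client entry mirroring the one shown in this README.

    build_mcp_config is a hypothetical helper for illustration only.
    """
    return {
        "mcpServers": {
            "pyvector": {
                "command": "python",
                "args": ["start_mcp_server.py"],
                # Resolve to an absolute path so the client can launch the
                # server regardless of its own working directory.
                "cwd": str(Path(repo_dir).resolve()),
            }
        }
    }


if __name__ == "__main__":
    # Run from the repository root; "." is resolved to its absolute path.
    print(json.dumps(build_mcp_config("."), indent=2))
```

Paste the printed JSON into your MCP client configuration.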

As a Python Library

from pyvector import VectorDatabase

# Create database (uses fallback implementations if needed)
db = VectorDatabase()

# Add some texts
db.add_text("The quick brown fox jumps over the lazy dog")
db.add_text("Machine learning is a subset of artificial intelligence")
db.add_text("Vector databases enable semantic search capabilities")

# Search for similar content
results = db.search_text("AI and machine learning", k=2)
for vector_id, score, metadata in results:
    print(f"Score: {score:.4f} - {metadata['text']}")

As an MCP Server

# Start MCP server (auto-detects capabilities)
python start_mcp_server.py

# Force HTTP server mode
python start_mcp_server.py --http

# Custom host/port
python start_mcp_server.py --http --host 0.0.0.0 --port 9000

The startup script will display connection details and available endpoints.

As an HTTP Server

from pyvector.simple_server import PyVectorHTTPServer

# Start HTTP server
server = PyVectorHTTPServer("localhost", 8080)
server.start()

# Server provides REST API endpoints:
# GET  /health - Health check
# GET  /info - Database information  
# POST /create_database - Create new database
# POST /add_text - Add text to database
# POST /search_text - Search for similar texts
# POST /save_database - Save database to disk
# POST /load_database - Load database from disk
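
The endpoints above can be called from any HTTP client. The sketch below uses only the standard library. This README does not document the request or response schema, so the JSON field names "text", "query" and "k" are assumptions, as is the expectation that responses are JSON. You may also need to call /create_database first.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # host and port from the server example above


def build_post(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for one of the endpoints listed above."""
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the response body as JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        # ASSUMPTION: the field names "text", "query" and "k" are guesses;
        # check pyvector/simple_server.py for the actual request schema.
        print(send(build_post("/add_text", {"text": "Vector databases enable semantic search"})))
        print(send(build_post("/search_text", {"query": "semantic search", "k": 2})))
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers non-JSON replies.
        print(f"Request to {BASE_URL} failed: {exc}")
```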

Fallback Implementations

PyVector gracefully handles missing optional dependencies:

No FAISS: Uses pure NumPy implementation for vector indexing
No sentence-transformers: Uses hash-based text embeddings
Warnings: PyVector displays a clear warning whenever a fallback is in use

This ensures PyVector works in minimal environments while providing better performance when full dependencies are available.
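
To check ahead of time which fallbacks would apply in your environment, you can probe for the optional packages yourself. This is a minimal sketch, not PyVector's own detection logic. The function name detect_capabilities is hypothetical; the module names are the import names of faiss-cpu and sentence-transformers.

```python
import importlib.util
import warnings

# Import names of two optional packages, mapped to the fallback this README
# describes for each.
OPTIONAL_MODULES = {
    "faiss": "NumPy-based indexing",
    "sentence_transformers": "hash-based embeddings",
}


def detect_capabilities(modules=OPTIONAL_MODULES) -> dict:
    """Map each optional module name to True if it can be imported."""
    available = {}
    for name, fallback in modules.items():
        # find_spec returns None when a top-level module is not installed.
        found = importlib.util.find_spec(name) is not None
        available[name] = found
        if not found:
            warnings.warn(f"{name} not installed; falling back to {fallback}")
    return available


if __name__ == "__main__":
    print(detect_capabilities())
```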

Examples

Run the included examples:

# Basic usage example
python examples/basic_usage.py

# HTTP server example (requires requests)
pip install requests
python examples/http_server_example.py

# Start MCP server with connection details
python start_mcp_server.py --verbose

Configuration

Index Types

flat - Exact search, best for small datasets (<10K vectors)
ivf - Inverted file index, good balance of speed/accuracy (requires FAISS)
hnsw - Hierarchical navigable small world, fastest approximate search (requires FAISS)
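
A flat index compares the query against every stored vector, so each search costs time proportional to the number of vectors. That is why it suits small datasets. The sketch below illustrates the idea in plain Python so it runs without dependencies. It is not PyVector's implementation, and the use of cosine similarity is an assumption, since this README does not state the distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flat_search(query, vectors, k=2):
    """Exact search: score the query against every stored vector.

    Returns up to k (index, score) pairs, best match first.
    """
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for index, score in flat_search([1.0, 0.1], stored, k=2):
        print(f"vector {index}: {score:.4f}")
```

Approximate indexes such as ivf and hnsw avoid scanning every vector, trading some accuracy for speed.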

Embedding Models

When sentence-transformers is available:

all-MiniLM-L6-v2 - Default lightweight model (384 dimensions)
all-mpnet-base-v2 - Higher quality (768 dimensions)
paraphrase-multilingual-MiniLM-L12-v2 - Multilingual support

When using fallback: Hash-based embeddings (384 dimensions)
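
This README does not describe how the hash-based fallback is computed. One common approach is feature hashing, sketched below under that assumption. The function name hash_embedding is hypothetical and the code is not taken from PyVector. Embeddings built this way reflect shared words rather than meaning, which is why the sentence-transformers models give better search quality.

```python
import hashlib
import math

DIMENSIONS = 384  # matches the fallback dimension listed above


def hash_embedding(text: str, dimensions: int = DIMENSIONS) -> list:
    """Deterministic bag-of-words embedding via the hashing trick.

    Each lowercase token is hashed to a bucket and a sign; the vector is
    L2-normalized so dot products behave like cosine similarity.
    """
    vector = [0.0] * dimensions
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dimensions
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vector[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vector))
    # An empty text (or fully cancelled buckets) yields the zero vector.
    return [x / norm for x in vector] if norm else vector


if __name__ == "__main__":
    embedding = hash_embedding("Machine learning is a subset of artificial intelligence")
    print(len(embedding))
```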

Development

# Install development dependencies
pip install -e ".[dev]"

# Run tests
python test_basic.py

# Format code
black .

# Lint code
flake8 .

Architecture

Core: Vector database and embedding generation with fallbacks
Server: MCP server implementation (when MCP is available)
Simple Server: HTTP server for REST API access (pyvector.simple_server)
Utils: Validation and helper functions
Storage: FAISS-based or NumPy-based indexing with metadata persistence

Dependencies

Required

numpy>=1.21.0 - Core numerical operations
pydantic>=2.0.0 - Data validation

Optional

faiss-cpu>=1.7.0 - High-performance vector indexing
sentence-transformers>=2.2.0 - Quality text embeddings
mcp>=1.0.0 - Model Context Protocol server support

License

GNU General Public License v3.0 - see LICENSE file for details.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
