Knexa is an open-source, enterprise-grade Retrieval-Augmented Generation (RAG) search engine designed to provide "Perplexity-like" capabilities for internal enterprise knowledge bases. It features a scalable microservices-ready architecture, hybrid search (Dense + Sparse), and a modular pipeline for custom embedding and generation strategies.
- Hybrid Search: Combines semantic understanding (Dense Vector Search via FAISS) with keyword precision (Sparse BM25).
- Custom Pipeline: Flexible ingestion engine with intelligent chunking (Recursive Character Split).
- Enterprise Ready: Built-in Audit Logging, configurable Pydantic settings, and structured error handling.
- RAG Integration: Plug-and-play support for LLMs (OpenAI supported today; vLLM and Ollama integrations planned).
- High Performance: Optimized for low-latency retrieval, served through asynchronous FastAPI endpoints.
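To make the "Recursive Character Split" idea from the feature list concrete, here is a minimal sketch of that chunking strategy. It is not Knexa's actual implementation: the function name `recursive_split`, the default `chunk_size`, and the separator order are all assumptions chosen for illustration. The idea is to split on the coarsest separator first and fall back to finer ones only for pieces that are still too long.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper: tries paragraph breaks first, then lines,
    sentences, and words, and hard-cuts only as a last resort.
    """
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left to try: fall back to fixed-width cuts.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep packing pieces into the current chunk while they fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # This piece alone is too long: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return [c for c in (c.strip() for c in chunks) if c]


sample = "Knexa indexes documents. " * 20
for chunk in recursive_split(sample, chunk_size=60):
    print(len(chunk), repr(chunk))
```

Note that this sketch drops the separator at each chunk boundary and adds no overlap between chunks; a production splitter would likely make both configurable.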
- Backend: Python 3.9+, FastAPI, Uvicorn
- Vector Database: FAISS (Facebook AI Similarity Search)
- NLP & Embeddings: SentenceTransformers (HuggingFace), NumPy, Scikit-learn
- Storage: SQLite (Audit Logs), Local File Storage (Index)
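The stack above pairs a dense retriever (FAISS) with a sparse one (BM25), so the two result lists have to be merged at query time. How Knexa merges them is not specified here; the sketch below uses reciprocal rank fusion (RRF), one common option that needs only the rank positions and not comparable scores. The function name, the document IDs, and the constant `k = 60` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in;
    documents ranked highly by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: in a real pipeline these would come from a
# FAISS index lookup and a BM25 scorer respectively.
dense_hits = ["doc_b", "doc_a", "doc_c"]
sparse_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['doc_a', 'doc_b', 'doc_d', 'doc_c']
```

RRF is attractive for hybrid search because cosine similarities and BM25 scores live on different scales; a weighted sum of normalized scores is an alternative when finer control over the dense/sparse balance is needed.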
- Clone the repository:

      git clone https://github.com/yourusername/knexa.git
      cd knexa

- Set up a virtual environment:

      python3 -m venv venv
      source venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Environment Variables: Create a .env file in the root directory:

      OPENAI_API_KEY=your_key_here
      DEBUG=True

Run the API server:

    uvicorn app.main:app --reload

The API will be available at http://localhost:8000. You can access the automatic documentation at http://localhost:8000/docs.
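With the server running, the `/ingest` and `/search` endpoints can also be called from Python. The sketch below mirrors the curl requests in this README using only the standard library. The helper names are hypothetical, and the assumption that both endpoints return a JSON body is not confirmed by this document; check http://localhost:8000/docs for the actual response schema.

```python
import json
from typing import Any, Dict, List
from urllib import request

BASE_URL = "http://localhost:8000"  # default address from this README


def build_post(path: str, payload: Any) -> request.Request:
    """Build a JSON POST request without sending it."""
    return request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ingest_request(documents: List[Dict[str, Any]]) -> request.Request:
    """Request for POST /ingest: a list of {content, metadata} objects."""
    return build_post("/ingest", documents)


def search_request(query: str, top_k: int = 3) -> request.Request:
    """Request for POST /search with the query text and result count."""
    return build_post("/search", {"query": query, "top_k": top_k})


def send(req: request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the response (assumed to be JSON)."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, once the server is up:
#   docs = [{"content": "Knexa is a great tool for enterprise search.",
#            "metadata": {"source": "manual"}}]
#   print(send(ingest_request(docs)))
#   print(send(search_request("What is Knexa?", top_k=3)))
```

Building the request separately from sending it keeps the payload shape easy to inspect and test without a live server.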
curl -X POST "http://localhost:8000/ingest" \
  -H "Content-Type: application/json" \
  -d '[{"content": "Knexa is a great tool for enterprise search.", "metadata": {"source": "manual"}}]'

curl -X POST "http://localhost:8000/search" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is Knexa?", "top_k": 3}'

We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to help build Knexa.
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ by Kaftandev