GitHub - bazzi24/RAGEve: Local-first RAG platform — Ollama + Qdrant + Redis + MinIO. No cloud, no API keys, runs entirely on your machine.

Local-first RAG platform — Fast, private, no cloud required.

Get Started · Configuration · Develop · Community · Contributing

Table of Contents

What is RAGEve?
Demo
Latest Updates
Key Features
System Architecture
Get Started
Launch Service from Source for Development
- Backend only (technical users)
Community
Contributing

What is RAGEve?

RAGEve is a local-first RAG (Retrieval-Augmented Generation) platform built for developers and teams who want the power of RAG workflows without depending on external cloud services.

It combines Ollama for local LLM inference and embeddings, Qdrant as a high-performance vector database, and FastAPI + Next.js for a full-featured web interface. Everything runs on your own machine — no API keys, no data leaves your network.

Backend Architecture:

FastAPI for high-performance async APIs
Peewee ORM with a 27-table schema for persistent storage
MySQL (or SQLite for single-node) via Peewee + connection pooling (900 connections)
SQLAlchemy (temporary) for legacy chat history storage during migration
Qdrant for vector search with hybrid retrieval (dense + sparse)
Ollama for embeddings and LLM inference

RAGEve is designed for two audiences:

User	Experience
Non-technical users	`git clone && ./scripts/run.sh` — everything starts automatically
Developers	`./scripts/backend.sh` or manual `uvicorn` / `npm run dev` for full control

Demo

Start RAGEve locally and open http://localhost:3000:

git clone https://github.com/bazzi24/RAGEve.git
cd RAGEve
./scripts/run.sh

Tip: On first run, install.sh automatically installs uv, Ollama, pulls the required models (~8 GB), and starts Docker services. This takes about 5–10 minutes once, then subsequent starts are instant.

Latest Updates

2026-04-27 Peewee migration complete — 27-table schema, new /dialogs and /knowledgebases APIs, transitional support for legacy routes
2026-04-22 Enhanced PDF parsing — column detection, structured table extraction, hierarchical chunking, reading order optimization
2026-04-03 Evaluation matrix (16-cell benchmark) + Qdrant hybrid search fix
2026-04-01 9 production fixes: structured 500 handler, health checks, rate limiter proxy safety, request timeouts, streaming 404 fix, file upload limits, paginated datasets API
2026-04-01 Chat history with MySQL/SQLite, session panel, per-agent conversations
2026-03-28 Background HF dataset ingest with live progress tracking
2026-03-26 Real-time streaming upload with per-batch progress stages
2026-03-26 Cross-encoder reranking (sentence-transformers)
2026-03-25 E2E test suite, conversation persistence

Key Features

Deep Document Understanding

Ingest PDFs, Word docs, Excel, CSV, images, and more
Enhanced PDF parsing: column detection, structured table extraction (markdown), heading hierarchy, reading order optimization
Adaptive chunking with quality scoring per profile (clean text, OCR noisy, table-heavy, code)
Intelligent text column selection for multi-column datasets
Hierarchical chunking preserves section context for better semantic search

Grounded Answers with Citations

Exact chunk references from source documents
Quality scores exposed to the LLM via enriched context
Session history-aware chat with up to 6 prior turns in context

Multiple Retrieval Strategies

Dense vector search via Ollama embeddings
Sparse keyword search
Hybrid fusion combining both with configurable weights
Cross-encoder reranking for improved precision

Flexible LLM Support

Any Ollama model as the chat backend
Any Ollama embedding model
Configurable temperature, top-k, top-p, and context window size per dialog (agent)

HuggingFace Integration

Browse, preview, and search HuggingFace datasets directly from the UI
Download datasets to local storage
Background ingest with real-time progress
Multi-config and multi-split support

Persistent Conversations

Sessions stored in MySQL via Peewee ORM (or SQLite for single-node)
Full conversation history per dialog (agent)
Thumbs up/down feedback on individual messages
Conversation context automatically injected into subsequent turns

Production-Ready Backend

Request ID tracing and structured error responses
CORS and API key authentication
Circuit breaker and retry logic for Ollama calls
Dependency health checks (/health pings Ollama and Qdrant)

Developer-Friendly

scripts/run.sh — everything in one command
scripts/backend.sh — backend only for technical users
Docker Compose for infrastructure (Qdrant + MySQL)
Full E2E and stress test suites

Get Started

Prerequisites

Requirement	Version	Notes
Docker	>= 24.0.0	Install Docker
Docker Compose	>= v2.26.1	Usually bundled with Docker Desktop
macOS / Linux / WSL2	—	Windows native not supported; use WSL2
Disk	>= 50 GB	For models (~8 GB) and data
RAM	>= 16 GB	Recommended; CPU fallback is slower

Windows: Enable WSL2 and run all commands from inside the WSL shell. Do not run scripts from PowerShell or CMD.

Quick Start

One command for everything — auto-installs if needed:

git clone https://github.com/bazzi24/RAGEve.git
cd RAGEve
./scripts/run.sh

The first run will:

Install uv (Python package manager)
Install Ollama and pull models (nomic-embed-text + llama3.2)
Start Docker containers (Qdrant + MySQL)
Start the FastAPI backend and Next.js frontend

Open http://localhost:3000 when you see:


  
██████╗  █████╗  ██████╗ ███████╗██╗   ██╗███████╗    
██╔══██╗██╔══██╗██╔════╝ ██╔════╝██║   ██║██╔════╝    
██████╔╝███████║██║  ███╗█████╗  ██║   ██║█████╗      
██╔══██╗██╔══██║██║   ██║██╔══╝  ╚██╗ ██╔╝██╔══╝      
██║  ██║██║  ██║╚██████╔╝███████╗ ╚████╔╝ ███████╗    
╚═╝  ╚═╝╚═╝  ╚═╝ ╚═════╝ ╚══════╝  ╚═══╝  ╚══════╝    
                                                      

AI-powered RAG platform — Ollama · Qdrant · FastAPI · Next.js
https://github.com/bazzi24/RAGEve
[*] Starting FastAPI backend...
[*] Starting Next.js frontend...
[✓] RAGEve is running!

  Frontend  http://localhost:3000
  Backend   http://localhost:8000
  API docs  http://localhost:8000/docs

Press Ctrl+C to stop all services cleanly.

Configuration

RAGEve uses environment variables for configuration. Copy the example and customize:

# From the project root:
cp docker/.env.example .env  # Recommended for Docker deployments
# OR
cp .env.example .env        # If .env.example exists (legacy location)

Launch Service from Source for Development

For developers who want full control over startup and debugging.

Full Stack

# 1. Start infrastructure
docker compose -f docker/docker-compose.yml up -d qdrant mysql

# 2. Start Ollama (keep running in a terminal)
ollama serve

# 3. Pull required models (first time only)
ollama pull nomic-embed-text
ollama pull llama3.2:latest

# 4. Install Python dependencies
uv sync

# 5. Install frontend dependencies
cd frontend && npm install && cd ..

# 6. Start FastAPI backend (port 8000)
#    Do NOT use --reload — it crashes in-flight uploads
uv run uvicorn backend.main:app --host 0.0.0.0 --port 8000

# 7. Start Next.js frontend (port 3000) — in another terminal
cd frontend && npm run dev

Open:

Frontend: http://localhost:3000
API Docs: http://localhost:8000/docs
Qdrant Dashboard: http://localhost:6333/dashboard

Backend Only

For developers who run the frontend manually (e.g. in an IDE with hot reload):

./scripts/backend.sh

Starts: Docker (Qdrant + MySQL) → Ollama → FastAPI. No frontend.

Community

Bug Reports — report issues with clear reproduction steps
Feature Requests — open a discussion or issue
Contributing — see below

Contributing

RAGEve grows through open-source collaboration. Contributions of all kinds are welcome — bug fixes, features, docs, tests, and feedback.

Before contributing:

Fork the repository and create a feature branch from main
Make your changes — all code must pass bash -n scripts/*.sh (shell scripts) and cd frontend && npx tsc --noEmit (TypeScript)
Run the E2E test suite: uv run python test/_test_e2e.py
Submit a pull request with a clear description of what changed and why

Development setup:

git clone https://github.com/bazzi24/RAGEve.git
cd RAGEve
cp .env.example .env    # optional: fill in HF_TOKEN, API_KEY, etc.
./scripts/install.sh  # one-time setup
./scripts/backend.sh  # backend only for iterative development

Built with ❤️ for local-first AI — RAGEve

Name		Name	Last commit message	Last commit date
Latest commit History 255 Commits
.github		.github
backend		backend
core		core
docker		docker
docs/assets		docs/assets
frontend		frontend
rag		rag
scripts		scripts
test		test
.dockerignore		.dockerignore
.flake8		.flake8
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
backup.sh		backup.sh
main.py		main.py
pyproject.toml		pyproject.toml
pyrightconfig.json		pyrightconfig.json
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What is RAGEve?

Demo

Latest Updates

Key Features

Deep Document Understanding

Grounded Answers with Citations

Multiple Retrieval Strategies

Flexible LLM Support

HuggingFace Integration

Persistent Conversations

Production-Ready Backend

Developer-Friendly

Get Started

Prerequisites

Quick Start

Configuration

Launch Service from Source for Development

Full Stack

Backend Only

Community

Contributing

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

What is RAGEve?

Demo

Latest Updates

Key Features

Deep Document Understanding

Grounded Answers with Citations

Multiple Retrieval Strategies

Flexible LLM Support

HuggingFace Integration

Persistent Conversations

Production-Ready Backend

Developer-Friendly

Get Started

Prerequisites

Quick Start

Configuration

Launch Service from Source for Development

Full Stack

Backend Only

Community

Contributing

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages