Skip to content

nileshpatil6/LocalGPT-RAG

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LocalGPT

Chat with your PDFs — fully local, no API keys, no data leaves your machine

Upload a PDF. Ask questions. Get answers powered by a local LLM running on your own hardware.

Python 3.9+ Flask Ollama License: MIT


How It Works

Your PDF
    │
    ▼
Text extraction (PyPDF2)
    │
    ▼
Chunked into 500-word segments
    │
    ▼
Embedded with nomic-embed-text (via Ollama, runs locally)
    │
    ▼
Stored as embeddings on disk
    │
Your Question ──► embed question ──► cosine similarity ──► top 5 chunks
                                                                │
                                                                ▼
                                                     qwen2.5-coder:7b (local)
                                                     answers from those chunks

Everything runs on your machine. No OpenAI. No Gemini. No data sent anywhere.


Prerequisites

  • Python 3.9+
  • Ollama installed and running

Pull the required models once:

ollama pull nomic-embed-text
ollama pull qwen2.5-coder:7b

Note: qwen2.5-coder:7b is ~4.7 GB. Any other Ollama model works — just update CHAT_MODEL in .env.


Quickstart

git clone https://github.com/nileshpatil6/LocalGPT.git
cd LocalGPT

# Create virtual environment
python -m venv .venv
source .venv/bin/activate      # macOS/Linux
# .venv\Scripts\activate       # Windows PowerShell

# Install dependencies
pip install -r requirements.txt

# Configure (optional — defaults work out of the box)
cp .env.example .env

# Run
python app.py

Open http://localhost:5000 in your browser.


Configuration

Copy .env.example to .env and edit as needed:

Variable Default Description
SECRET_KEY random Flask session secret — set a fixed value to persist sessions across restarts
OLLAMA_BASE_URL http://localhost:11434 Ollama server URL
EMBEDDING_MODEL nomic-embed-text:latest Model used to embed text chunks
CHAT_MODEL qwen2.5-coder:7b Model used to generate answers
PORT 5000 Port for the Flask server

Features

  • Upload multiple PDFs — all are searched simultaneously
  • Cosine similarity retrieval across all uploaded documents
  • Conversation history — last 3 exchanges included in every prompt
  • Delete documents individually
  • Fully responsive UI (mobile-friendly)
  • 100% local — embeddings and answers never leave your machine

Project Structure

LocalGPT/
├── app.py              # Flask app — routes, RAG logic, Ollama calls
├── requirements.txt
├── .env.example
├── static/
│   ├── script.js       # Frontend logic (upload, chat, file list)
│   └── style.css       # Responsive styles
└── templates/
    └── index.html      # Two-panel layout: sidebar + chat

Swapping the LLM

Any model available in Ollama works. Pull it first, then set CHAT_MODEL in .env:

ollama pull llama3.2
# then in .env:
# CHAT_MODEL=llama3.2

License

MIT © 2026 nileshpatil6

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors