A Retrieval-Augmented Generation (RAG) system that uses OpenAI embeddings and ChromaDB for document question-answering.

## Features
- Document processing and chunking
- OpenAI embeddings for semantic search
- ChromaDB for vector storage
- Caching system for embeddings
- Question answering with context
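The embedding cache mentioned above can be sketched roughly as follows. The function names and the SHA-256 keying are assumptions for illustration; the actual logic in `app.py` may differ:

```python
import hashlib
import json
import os

def load_cache(path="embedding_cache.json"):
    """Load the JSON embedding cache from disk, or start empty."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

def cached_embedding(text, embed_fn, cache, path="embedding_cache.json"):
    """Return the embedding for `text`, calling `embed_fn` (e.g. a thin
    wrapper around the OpenAI embeddings API) only on a cache miss.
    Keys are SHA-256 hashes of the text, so long documents do not bloat
    the cache file's keys."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = embed_fn(text)
        with open(path, "w") as f:
            json.dump(cache, f)
    return cache[key]
```

Because the cache is keyed by content, re-running the app over unchanged documents costs no API calls.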
## Installation

- Clone the repository:

  ```bash
  git clone <your-repo-url>
  cd <repo-name>
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On macOS/Linux
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file with your OpenAI API key:

  ```
  OPENAI_API_KEY=your_api_key_here
  ```
- Place your text documents in the `news_articles` directory
- Run the application:

  ```bash
  python app.py
  ```

## Project Structure

- `app.py`: Main application code
- `requirements.txt`: Python dependencies
- `news_articles/`: Directory for text documents
- `chroma_db/`: ChromaDB storage (gitignored)
- `embedding_cache.json`: Embedding cache (gitignored)
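The `OPENAI_API_KEY` from the `.env` file created during setup has to reach the process environment. A minimal stdlib-only loader, sketching what a library like `python-dotenv` does (how `app.py` actually loads it is an assumption here), looks like:

```python
import os

def load_env_file(path=".env"):
    """Read KEY=value lines from a .env file into os.environ,
    skipping blank lines and comments. Variables already set in the
    environment take precedence over the file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Keeping the key in `.env` (and the file gitignored) avoids committing credentials to the repository.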
## Usage

- Add your text documents to the `news_articles` directory
- Run the application
- The system will:
  - Process and chunk documents
  - Generate embeddings
  - Store in ChromaDB
  - Allow question answering
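The chunking step above splits each document into overlapping pieces before embedding, so sentences cut at a chunk boundary still appear whole in at least one chunk. A simple character-based chunker might look like this (the chunk size and overlap are illustrative defaults, not necessarily the values `app.py` uses):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split `text` into chunks of at most `chunk_size` characters,
    with `overlap` characters shared between consecutive chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk is then embedded (via the cache) and stored in ChromaDB alongside an ID, so a question's embedding can be matched against chunks rather than whole documents.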
## Requirements

- Python 3.10+
- OpenAI API key
- ChromaDB
- Other dependencies listed in `requirements.txt`
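A `requirements.txt` covering the stack above might contain the following. The exact package set is an assumption (in particular, `python-dotenv` for `.env` loading); pin versions as appropriate for your environment:

```
openai
chromadb
python-dotenv
```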