This project demonstrates a Retrieval-Augmented Generation (RAG) system that answers questions about company reports using vector search.
Documents (PDF)
↓
Chunking
↓
Embeddings (Sentence Transformers)
↓
FAISS Vector Search
↓
Relevant Text Chunks
↓
Displayed in Streamlit Web App
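To make the flow above concrete, here is a minimal, dependency-free sketch of the retrieval idea. It is not the project's code: the `embed` function is a toy bag-of-words stand-in for Sentence Transformers, the brute-force loop in `retrieve` stands in for the FAISS index, and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a Sentence Transformers embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Brute-force similarity search, standing in for the FAISS index lookup."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks, as if produced by the chunking step.
chunks = [
    "The report lists supply chain disruption and currency risk as key risks.",
    "Sustainability initiatives include a solar rollout and reduced packaging.",
    "Revenue grew and operating margin improved during the year.",
]

top = retrieve("What sustainability initiatives are described?", chunks, k=1)
print(top[0])
```

A real embedding model captures meaning rather than shared words, so it can match a question to a chunk even when the two use different vocabulary; the toy version only illustrates the rank-by-similarity step.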
rag-chatbot
│
├── data
├── vectorstore
├── ingest.py
├── vectorstore.py
├── app.py
└── requirements.txt
- ingest.py
  - Loads PDFs
  - Splits text into chunks
- vectorstore.py
  - Converts chunks into embeddings
  - Creates FAISS vector index
- app.py
  - Streamlit interface
  - User asks question
  - System retrieves relevant text chunks
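The chunking, embedding, and search steps described above might look roughly like the sketch below. This is an illustration under stated assumptions, not the contents of `ingest.py` or `vectorstore.py`: the function names, the character-based chunk size and overlap, and the `all-MiniLM-L6-v2` model name are all hypothetical choices. The Sentence Transformers and FAISS calls are imported lazily so that `chunk_text` can be tried on its own.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def build_index(chunks: list[str], model_name: str = "all-MiniLM-L6-v2"):
    """Embed chunks and store them in a flat inner-product FAISS index."""
    # Lazy imports: only needed when an index is actually built.
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # assumed model, not confirmed by the project
    vectors = model.encode(
        chunks, convert_to_numpy=True, normalize_embeddings=True
    ).astype("float32")
    # With unit-length vectors, inner product equals cosine similarity.
    index = faiss.IndexFlatIP(vectors.shape[1])
    index.add(vectors)
    return model, index


def search(question: str, model, index, chunks: list[str], k: int = 3) -> list[str]:
    """Return up to k chunks most similar to the question."""
    query = model.encode(
        [question], convert_to_numpy=True, normalize_embeddings=True
    ).astype("float32")
    _scores, ids = index.search(query, k)
    # FAISS pads with -1 when fewer than k vectors are indexed.
    return [chunks[i] for i in ids[0] if i != -1]
```

Overlapping windows reduce the chance that a sentence relevant to a question is cut in half at a chunk boundary; the price is some duplicated text in the index.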
Install dependencies
pip install -r requirements.txt
Create vector database
python ingest.py
python vectorstore.py
Run the app
streamlit run app.py
Example questions
- What risks are mentioned in the report?
- What sustainability initiatives are described?
- What are the main financial highlights?