KagajatAI is a full-featured, portfolio-ready project that demonstrates the entire lifecycle of a modern AI application. It provides a web-based interface to "chat" with your documents, powered by a sophisticated Retrieval-Augmented Generation (RAG) pipeline.
This project goes beyond a simple proof-of-concept by incorporating advanced techniques such as model finetuning, rigorous benchmarking, and a flexible architecture that supports both local open-source models and powerful cloud APIs.
- Interactive Chat Interface: A user-friendly web application built with Streamlit to upload documents and ask questions in natural language.
- Advanced RAG Pipeline: Implements a robust RAG system using a high-performance embedding model (BAAI/bge-large-en-v1.5) and a persistent vector store (ChromaDB).
- Flexible LLM Backend: Seamlessly switch between Google's Gemini Pro API for top-tier performance or a local, finetuned Llama-3-8B model for privacy and customization.
- Efficient Finetuning (QLoRA): Includes a Jupyter notebook to finetune the local LLM on a custom dataset using QLoRA, a state-of-the-art technique for memory-efficient training.
- Synthetic Dataset Generation: Demonstrates how to use a powerful model (Gemini Pro) to programmatically create a high-quality instruction dataset for finetuning.
- Rigorous Benchmarking: Provides a dedicated notebook to compare the performance of the base model, the finetuned model, and Gemini Pro, allowing for quantitative and qualitative analysis of the finetuning process.
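To make the retrieval step concrete, here is a dependency-free sketch of the idea behind the vector search: document chunks and the user's question are embedded as vectors, and the chunk whose embedding is closest (by cosine similarity) to the query embedding is retrieved as context. The toy 3-dimensional vectors below stand in for real bge-large-en-v1.5 embeddings; ChromaDB performs this search at scale in the actual pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional embeddings stand in for bge-large-en-v1.5 vectors.
store = {
    "Invoices are due in 30 days.": [0.9, 0.1, 0.0],
    "The warranty lasts one year.": [0.1, 0.9, 0.0],
}

query_vec = [0.8, 0.2, 0.0]  # pretend embedding of "When is payment due?"
best = max(store, key=lambda doc: cosine(query_vec, store[doc]))
print(best)  # → Invoices are due in 30 days.
```

The real pipeline feeds the retrieved chunk into the LLM's prompt as grounding context; this sketch only shows the similarity search itself.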
Clone the repository. From inside the repo folder, install the dependencies:

```shell
python -m pip install --upgrade -r .\requirements.txt
```

Configure API keys:
- The application can use Google's Gemini API. Place the key directly in config/config.yaml.
- You can also download model checkpoints (in my case, I downloaded the lightweight yet powerful, open-source Llama-3.1-8B-Instruct model). Set `generation_model_name` in config/config.yaml to the absolute path of the downloaded model.
- Run the Streamlit application:

  ```shell
  streamlit run .\app\main_app.py
  ```

  The application should now be open and accessible in your web browser.
OR
- You can also run a specific file from the terminal as its own standalone script:

  ```shell
  python -u .\src\rag_pipeline.py
  ```

  Ensure you have some sample PDF documents in .\data\source_documents\
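For reference, the config/config.yaml used in the steps above might look roughly like this. Only `generation_model_name` is named in this README; the other keys are illustrative and should be checked against the shipped file:

```yaml
# Illustrative sketch of config/config.yaml -- only generation_model_name
# is confirmed by this README; the other key names may differ.
gemini_api_key: "YOUR_GEMINI_API_KEY"           # used when the Gemini backend is selected
embedding_model_name: "BAAI/bge-large-en-v1.5"  # embedding model for the vector store
# Absolute path to a locally downloaded checkpoint (or a Hugging Face model id):
generation_model_name: "C:/models/Llama-3.1-8B-Instruct"
```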
- To showcase the full capabilities of the project, run the Jupyter notebooks in the following order.
- notebooks/1_Dataset_Creation.ipynb: This notebook uses the Gemini API to generate a finetuning_data.jsonl file from your source document.
- notebooks/2_Finetuning_with_LoRA.ipynb: This notebook uses the generated dataset to finetune the Llama-3-8B model. This requires a CUDA-enabled GPU.
- notebooks/3_Benchmarking.ipynb: After finetuning, run this notebook to compare the responses from the base model, your new finetuned model, and Gemini Pro. Due to hardware limitations, I was unable to finetune and run inference on a large dataset, so I did not implement metrics like ROUGE or BLEU for a quantitative assessment of model performance. Once opened, the notebooks are self-explanatory.
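As an illustration of what notebooks/1_Dataset_Creation.ipynb produces, each line of finetuning_data.jsonl is one self-contained JSON object. The field names below are hypothetical; the actual schema depends on the prompt template used in the notebook:

```python
import json

# Hypothetical record schema; the notebook's actual fields may differ.
records = [
    {
        "instruction": "When is payment due according to the invoice?",
        "context": "Invoices are due within 30 days of issue.",
        "response": "Payment is due within 30 days of the invoice date.",
    }
]

# Write one JSON object per line (the "jsonl" format).
with open("finetuning_data.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading the file back line by line recovers each record independently,
# which is why jsonl is convenient for streaming large finetuning datasets.
with open("finetuning_data.jsonl", encoding="utf-8") as f:
    first = json.loads(f.readline())
print(first["response"])  # → Payment is due within 30 days of the invoice date.
```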
- Agentic Capabilities: Expand the system into a multi-tool agent that can not only read documents but also fetch real-time data from external APIs (e.g., stock prices).
- Broader Document Support: Enable compatibility with additional file types such as .docx, .txt, and .html. Incorporate Deep Document Understanding to more effectively parse complex formats like CVs, resumes, journal papers, novels, and presentations.
- UI Enhancements: Add features to the Streamlit app to manage multiple vector stores or highlight the source text in the original document.
- Multi-Model Support: Introduce the ability to seamlessly switch between multiple local models and proprietary LLM API providers with a single click.
*This README.md file has been improved for overall readability (grammar, sentence structure, and organization) using AI tools.*