
KagajatAI: An End-to-End AI Document Analysis System

Python License

KagajatAI is a full-featured, portfolio-ready project that demonstrates the entire lifecycle of a modern AI application. It provides a web-based interface to "chat" with your documents, powered by a sophisticated Retrieval-Augmented Generation (RAG) pipeline.

This project goes beyond a simple proof-of-concept by incorporating advanced techniques such as model finetuning, rigorous benchmarking, and a flexible architecture that supports both local open-source models and powerful cloud APIs.

Image: Landing Page

Key Features

  • Interactive Chat Interface: A user-friendly web application built with Streamlit to upload documents and ask questions in natural language.
  • Advanced RAG Pipeline: Implements a robust RAG system using a high-performance embedding model (BAAI/bge-large-en-v1.5) and a persistent vector store (ChromaDB).
  • Flexible LLM Backend: Seamlessly switch between Google's Gemini Pro API for top-tier performance or a local, finetuned Llama-3-8B model for privacy and customization.
  • Efficient Finetuning (QLoRA): Includes a Jupyter notebook to finetune the local LLM model on a custom dataset using QLoRA, the state-of-the-art technique for memory-efficient training.
  • Synthetic Dataset Generation: Demonstrates how to use a powerful model (Gemini Pro) to programmatically create a high-quality instruction dataset for finetuning.
  • Rigorous Benchmarking: Provides a dedicated notebook to compare the performance of the base model, the finetuned model, and Gemini Pro, allowing for quantitative and qualitative analysis of the finetuning process.
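In miniature, the chunk-embed-retrieve loop behind a RAG pipeline like this one can be sketched as follows. Note this is a toy illustration, not the project's actual code: the `embed()` function below is a letter-frequency placeholder standing in for BAAI/bge-large-en-v1.5, and a plain in-memory list stands in for the ChromaDB vector store.

```python
# Toy sketch of RAG retrieval: split a document into overlapping chunks,
# embed each chunk, then rank chunks by cosine similarity to the query.
import math

def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text: str) -> list[float]:
    # Placeholder embedding: a 26-dim letter-frequency vector.
    # Real code would call a sentence-transformer model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]
```

The retrieved chunks are then passed to the LLM as context alongside the user's question; swapping the placeholder `embed()` for a real embedding model and the list for a persistent ChromaDB collection yields the full pipeline.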

Requirements and Usage

Clone the repository. From inside the repo folder, install the dependencies:

python -m pip install --upgrade -r .\requirements.txt

Configure API keys and models:

  • The application can use Google's Gemini API. Place the key directly in config/config.yaml.
  • Alternatively, download a local model checkpoint (in my case, the lightweight yet capable open-source Llama-3.1-8B-Instruct) and set generation_model_name in config.yaml to the absolute path of the downloaded model.
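For illustration, config/config.yaml might look like the sketch below. Only generation_model_name is confirmed by this README; the other key name and the example values are assumptions.

```yaml
# config/config.yaml -- illustrative sketch, not the confirmed schema.
gemini_api_key: "YOUR_GEMINI_API_KEY"          # assumed key name
generation_model_name: "C:/models/Llama-3.1-8B-Instruct"  # absolute path to a local checkpoint
```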
Run the Streamlit application:
streamlit run .\app\main_app.py

The application should now be open and accessible in your web browser.

OR

Run an individual module from the terminal as a standalone script, for example:
python -u .\src\rag_pipeline.py

Ensure you have some sample PDF documents in .\data\source_documents\

To showcase the full capabilities of the project, run the Jupyter notebooks in the following order:
  • notebooks/1_Dataset_Creation.ipynb: This notebook uses the Gemini API to generate a finetuning_data.jsonl file from your source document.
  • notebooks/2_Finetuning_with_LoRA.ipynb: This notebook uses the generated dataset to finetune the Llama-3-8B model. This requires a CUDA-enabled GPU.
  • notebooks/3_Benchmarking.ipynb: After finetuning, run this notebook to compare responses from the base model, your new finetuned model, and Gemini Pro. Due to hardware limitations, I was unable to finetune and run inference on a large dataset; for this reason, I did not implement metrics such as ROUGE or BLEU for a quantitative assessment of model performance. The notebooks, once opened, are self-explanatory.
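The finetuning_data.jsonl file produced by the first notebook stores one JSON object per line, which is the standard format for instruction datasets. The "instruction"/"response" field names below are assumptions for illustration, not the notebook's confirmed schema:

```python
# Sketch of writing and reading a JSONL instruction dataset.
# Field names ("instruction", "response") are assumed, not confirmed.
import json

records = [
    {"instruction": "What does KagajatAI do?",
     "response": "It answers questions about uploaded documents via RAG."},
]

# Write one JSON object per line.
with open("finetuning_data.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Read the dataset back for finetuning.
with open("finetuning_data.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

One record per line keeps the file streamable, so the finetuning notebook can load it lazily without parsing the whole dataset at once.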

Future Improvements:

  1. Agentic Capabilities: Expand the system into a multi-tool agent that can not only read documents but also fetch real-time data from external APIs (e.g., stock prices).
  2. Broader Document Support: Enable compatibility with additional file types such as .docx, .txt, and .html. Incorporate Deep Document Understanding to more effectively parse complex formats like CVs, resumes, journal papers, novels, and presentations.
  3. UI Enhancements: Add features to the Streamlit app to manage multiple vector stores or highlight the source text in the original document.
  4. Multi-Model Support: Introduce the ability to seamlessly switch between multiple local models and proprietary LLM API providers with a single click.

Screenshots

Image: Mechanism
Image: Working
Image: llm interface
Image: data processing
Image: rag pipeline

*This README.md file has been improved for overall readability (grammar, sentence structure, and organization) using AI tools.
