A powerful AI-driven Multi-Modal Retrieval-Augmented Generation (RAG) System that extracts, embeds, and retrieves insights from documents, images, and videos—enhancing searchability and knowledge retrieval.
✅ Extracts text from PDFs and DOCX files.
✅ Uses OCR to extract text from images.
✅ Processes videos by extracting and transcribing audio.
✅ Stores and retrieves data efficiently using FAISS indexing.
✅ Generates AI-powered responses using Mistral-7B.
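Before the generator (here, Mistral-7B) is called, a retrieval-augmented prompt is typically assembled from the top-k retrieved chunks. A minimal sketch of such a prompt builder (the helper name and template are illustrative, not part of the project's API):

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a retrieval-augmented prompt for the generator (e.g. Mistral-7B)."""
    # Number each retrieved chunk so the model can ground its answer in them.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string is what gets passed to the LLM's `generate` call.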
```shell
git clone https://github.com/your-repo/Multi-Modal-RAG.git
cd Multi-Modal-RAG
```

```shell
pip install torch transformers faiss-cpu sentence-transformers opencv-python pytesseract PyPDF2 python-docx moviepy
```

⚠ Note: If using a GPU, install `faiss-gpu` instead of `faiss-cpu` for better performance.
- Install Tesseract-OCR (required for image text extraction).
- Download Tesseract and add its install location to your PATH, or set the path explicitly in your environment.
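If the Tesseract binary is not on your PATH, `pytesseract` lets you point at it directly via its `tesseract_cmd` setting. A configuration sketch (the path below is a typical Windows default; adjust to your install location):

```python
import pytesseract

# Point pytesseract at the Tesseract executable when it is not on PATH.
# Example Windows path — replace with wherever tesseract.exe was installed.
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
```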
| File Type | Processing Method |
|---|---|
| PDFs | Extracted using PyPDF2. |
| DOCX | Extracted using python-docx. |
| Images | Processed via OCR using pytesseract. |
| Videos | Audio extracted via MoviePy, then transcribed with a speech-to-text (STT) model. |
```python
folder_path = "your_data_folder"
texts = process_folder(folder_path)
index, embeddings, texts = embed_texts(texts)

if index is not None:
    while True:
        query = input("Ask a question (or type 'exit' to quit): ")
        if query.lower() == "exit":
            break
        response = retrieve_and_generate(index, embeddings, texts, query)
        print("AI Assistant:", response)
```

🚀 Integrate OpenAI Whisper for real-time speech transcription.
🚀 Improve document parsing with layout-aware models like LayoutLM.
🚀 Optimize FAISS for large-scale knowledge retrieval.