RAG Pipeline with LangChain and Google Generative AI

This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline using LangChain, ChromaDB, and Google Generative AI. It scrapes course content from a website, splits the text, embeds it, and enables question-answering over the retrieved context.

Features

  • Web scraping using LangChain Community's WebBaseLoader
  • Document splitting with RecursiveCharacterTextSplitter
  • Embedding generation via Google Generative AI
  • Vector storage and retrieval using ChromaDB
  • RAG pipeline for concise question answering
  • Prompt debugging with custom print function
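
As a rough illustration of how these pieces might fit together, here is a minimal sketch. It is not the notebook's actual code: the function names (`format_docs`, `print_prompt`, `build_rag_chain`), the chunk sizes, the prompt wording, and the model names are all assumptions chosen for illustration, and import paths may differ between LangChain versions. `WebBaseLoader` also needs `beautifulsoup4` installed. LangChain imports are kept inside `build_rag_chain` so the two helpers can be tried on their own.

```python
def format_docs(docs):
    """Join the text of retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def print_prompt(prompt):
    """Debug helper: print the rendered prompt, then pass it through unchanged."""
    print(prompt)
    return prompt


def build_rag_chain(url, chat_model="gemini-1.5-flash",
                    embedding_model="models/embedding-001"):
    """Assemble a RAG chain for one web page (hypothetical helper).

    The default model names are assumptions; substitute the ones
    available to your Google API key.
    """
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_community.vectorstores import Chroma
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough
    from langchain_google_genai import (
        ChatGoogleGenerativeAI,
        GoogleGenerativeAIEmbeddings,
    )
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # 1. Scrape the page and split it into overlapping chunks.
    docs = WebBaseLoader(web_paths=(url,)).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    splits = splitter.split_documents(docs)

    # 2. Embed the chunks and store them in Chroma.
    embeddings = GoogleGenerativeAIEmbeddings(model=embedding_model)
    vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
    retriever = vectorstore.as_retriever()

    # 3. A stand-in prompt; the notebook may use a different one.
    prompt = ChatPromptTemplate.from_template(
        "Answer concisely using only this context:\n{context}\n\n"
        "Question: {question}"
    )

    # 4. Retrieve -> fill prompt -> print it for debugging -> LLM -> plain text.
    llm = ChatGoogleGenerativeAI(model=chat_model)
    return (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | print_prompt
        | llm
        | StrOutputParser()
    )
```

Because `print_prompt` returns its input unchanged, it can be dropped from the chain once the prompt looks right without affecting the result.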

Setup

  1. Install dependencies:

    pip install langchain_community langchainhub chromadb langchain langchain-google-genai langchain-openai
  2. Set your Google API key:

    • If using Google Colab, store your key in Colab's userdata.
    • Otherwise, set GOOGLE_API_KEY in your environment.
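
One way to cover both cases in a single cell is sketched below. The helper name `load_google_api_key` is hypothetical, and the sketch assumes the Colab secret is stored under the same name, `GOOGLE_API_KEY`, as the environment variable.

```python
import os


def load_google_api_key():
    """Return the Google API key from Colab userdata if present, else the environment."""
    key = None
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            # Assumes the secret was saved under the name GOOGLE_API_KEY.
            key = userdata.get("GOOGLE_API_KEY")
        except Exception:
            # Secret missing or notebook access not granted; fall back below.
            key = None

    if not key:
        key = os.environ.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY not found in Colab userdata or the environment"
        )

    # Export it so libraries that read the environment variable can find it.
    os.environ["GOOGLE_API_KEY"] = key
    return key
```

Calling `load_google_api_key()` once near the top of the notebook, before any model or embedding objects are created, should be enough.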

Usage

Open RAG.ipynb and run the cells sequentially, from top to bottom.

Example

rag_chain.invoke("Are there any free courses?")

Project Structure

  • RAG.ipynb: Main notebook containing all code for scraping, embedding, and RAG pipeline.

License

This project is for