
RAGcoon🦝

Agentic RAG to help you build your startup

If you find RAGcoon useful, please consider supporting the project through GitHub Sponsors.

Install and launch🚀

The first step, common to both the Docker and the source-code setup, is to clone the repository and move into it:

git clone https://github.com/AstraBert/ragcoon.git
cd ragcoon

Once there, you can choose one of the two following approaches:

Docker (recommended)🐋

Required: Docker and docker compose

  • Add your groq_api_key to the .env.example file and rename the file to .env (you can create an API key in the Groq console):
mv .env.example .env
  • Launch the Docker services:
# If you are on Linux/macOS
bash start_services.sh
# If you are on Windows
.\start_services.ps1

Or, if you prefer:

docker compose up qdrant -d
docker compose up frontend -d
docker compose up backend -d

You will see the frontend application running on http://localhost:8001/ and you will be able to use it. Depending on your connection and your hardware, the setup might take some time (up to 30 minutes), but this only happens the first time you run it!

Source code🖥️

Required: Docker, docker compose and conda

  • Add your groq_api_key to the .env.example file, rename the file to .env and move it into the scripts folder (you can create an API key in the Groq console):
mv .env.example scripts/.env
  • Set up RAGcoon using the dedicated script:
# For macOS/Linux users
bash setup.sh
# For Windows users
.\setup.ps1
  • Or you can do it manually, if you prefer:
docker compose up qdrant -d
conda env create -f environment.yml
conda activate ragcoon
  • Now launch the frontend application:
gunicorn frontend:me --bind 0.0.0.0:8001
  • Then, from another terminal window, move into the scripts folder and launch the backend:
uvicorn main:app --host 0.0.0.0 --port 8000

You will see the application running on http://localhost:8001 and you will be able to use it.
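
To verify that the backend is up, you can send a test request to the chat endpoint. A minimal sketch, in which the payload field name is hypothetical; check scripts/main.py for the actual request schema:

# Hypothetical smoke test for the backend; the payload field name is an
# assumption - see scripts/main.py for the actual request schema.
import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={"message": "What should my MVP include?"},  # hypothetical field name
    timeout=120,
)
resp.raise_for_status()
print(resp.json())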

How it works

Workflow

The main workflow is handled by a Query Agent, built upon the ReAct architecture. As its base LLM, the agent uses QwQ-32B, the latest reasoning model by Qwen, provisioned through Groq.
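
To make this concrete, here is a minimal sketch of how such an agent can be assembled with LlamaIndex and Groq. The model id, the toy document and the tool wiring are illustrative assumptions, not the repository's exact code:

# Minimal sketch of a ReAct agent on Groq-served QwQ-32B via LlamaIndex.
# Model id, toy data and tool wiring are assumptions, not the repo's code.
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.groq import Groq

Settings.llm = Groq(model="qwen-qwq-32b")  # reads GROQ_API_KEY from the environment
Settings.embed_model = HuggingFaceEmbedding(
    model_name="Alibaba-NLP/gte-modernbert-base"  # the dense embedder used below
)

# A toy one-document index stands in for the real Qdrant-backed store
index = VectorStoreIndex.from_documents([Document(text="Notes on building startups.")])
retrieval_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="retrieve_context",  # illustrative tool name
    description="Retrieve startup-building context from the knowledge base",
)

agent = ReActAgent.from_tools([retrieval_tool], llm=Settings.llm, verbose=True)
print(agent.chat("How should I validate my startup idea?"))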

  1. The question coming from the frontend (developed with Mesop and running on http://localhost:8001) is sent in a POST request to the FastAPI-managed API endpoint on http://localhost:8000/chat.
  2. When the agent is prompted with the user's question, it tries to retrieve relevant context by routing the query to one of three query engines (see the first sketch after this list):
    • If the query is simple and specific, it goes for direct hybrid retrieval, exploiting both a dense (Alibaba-NLP/gte-modernbert-base) and a sparse (Qdrant/bm25) retriever
    • If the query is general and vague, it first creates a hypothetical document, which is embedded and used for retrieval
    • If the query is complex and involves searching for nested information, the query is decomposed into several sub-queries, retrieval is performed for each of them, and the results are summarized at the end
  3. The agent evaluates the retrieved context using llama-3.3-70b-versatile, also provisioned through Groq. If the context is deemed relevant, the agent proceeds; otherwise it goes back to retrieval and tries a different method.
  4. The agent produces a candidate answer.
  5. The agent evaluates the faithfulness and relevancy of the candidate answer against the retrieved context, using LlamaIndex evaluation methods (see the second sketch after this list).
  6. If the response is faithful and relevant, the agent returns it; otherwise it goes back to generating a new one.
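
For illustration, here is how the three retrieval paths could be wired with LlamaIndex on top of a hybrid Qdrant collection. This is a minimal sketch: the collection name and the exact wiring are assumptions, not the repository's code, and it assumes Settings.llm and Settings.embed_model are configured as in the sketch above.

# Sketch of the three retrieval paths (step 2); the collection name and
# the exact wiring are assumptions, not the repository's code.
from qdrant_client import QdrantClient
from llama_index.core import VectorStoreIndex
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import SubQuestionQueryEngine, TransformQueryEngine
from llama_index.core.tools import QueryEngineTool
from llama_index.vector_stores.qdrant import QdrantVectorStore

vector_store = QdrantVectorStore(
    client=QdrantClient(url="http://localhost:6333"),
    collection_name="startup_docs",        # illustrative collection name
    enable_hybrid=True,                    # dense + sparse in one collection
    fastembed_sparse_model="Qdrant/bm25",  # the sparse retriever from step 2
)
index = VectorStoreIndex.from_vector_store(vector_store)

# Simple, specific queries: direct hybrid (dense + sparse) retrieval
hybrid_engine = index.as_query_engine(vector_store_query_mode="hybrid")

# General, vague queries: embed a hypothetical document and retrieve with it
hyde_engine = TransformQueryEngine(
    index.as_query_engine(), HyDEQueryTransform(include_original=True)
)

# Complex, nested queries: decompose into sub-queries, then summarize
subq_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=[
        QueryEngineTool.from_defaults(
            query_engine=hybrid_engine,
            name="docs",  # illustrative tool name
            description="Startup knowledge base",
        )
    ]
)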
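
Steps 3 to 6 form a self-check loop. Here is a minimal sketch of the faithfulness and relevancy checks with LlamaIndex's built-in evaluators, reusing hybrid_engine from the sketch above; the single-retry logic is an assumption, since the agent may instead switch retrieval method:

# Sketch of the self-check loop (steps 5-6); the retry wiring is an
# assumption - the agent may also go back to retrieval (step 3) instead.
from llama_index.core.evaluation import FaithfulnessEvaluator, RelevancyEvaluator
from llama_index.llms.groq import Groq

judge = Groq(model="llama-3.3-70b-versatile")  # the evaluator model from step 3
faithfulness = FaithfulnessEvaluator(llm=judge)
relevancy = RelevancyEvaluator(llm=judge)

query = "How do I find product-market fit?"
response = hybrid_engine.query(query)
faithful = faithfulness.evaluate_response(response=response)
relevant = relevancy.evaluate_response(query=query, response=response)
if not (faithful.passing and relevant.passing):
    response = hybrid_engine.query(query)  # generate a new candidate answer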

Contributing

Contributions are always welcome! Follow the contribution guidelines reported here.

License and rights of usage

The software is provided under the MIT license.
