AI-Powered Q&A Assistant

Project demo: https://zingly-bot-891407508879.asia-southeast1.run.app


Enter a user_id and start chatting. Note that the UI only supports clicking the Send button; pressing the Enter key does not submit a message.

This project demonstrates an AI-powered Q&A assistant capable of answering user queries by leveraging both a pre-trained Large Language Model (LLM) and external APIs (referred to as "tools"). This README provides a comprehensive guide to setting up, running, and understanding the project.

Project Overview

The assistant is designed to handle two types of queries:

  1. General Knowledge Queries: Answered directly by the LLM (OpenAI's GPT-3.5-turbo in this implementation).
  2. Real-time/Specialized Data Queries: Answered by calling external APIs (OpenWeatherMap for weather and Alpha Vantage for stock prices).

The system also incorporates a memory mechanism using FAISS, allowing it to maintain context across multiple turns of a conversation.

Project Structure

The codebase is organized into modular components (illustrative sketches of the key pieces follow this list):

  • tools/: This directory houses the implementations for interacting with external APIs.

    • weather_tool.py: Contains the get_weather function, which fetches weather data from OpenWeatherMap based on a provided location.
    • stock_tool.py: Contains the get_stock_price function, which retrieves the current stock price from Alpha Vantage for a given ticker symbol.
    • tool_registry.py: A centralized registry that manages access to the available tools. This makes adding new tools easier.
  • llm_module/: This module is responsible for all interactions with the LLM.

    • llm_interaction.py: The core logic resides here. It handles:
      • Prompt engineering to guide the LLM's responses.
      • Making calls to the OpenAI API.
      • Determining when to use a tool based on the user's query and the LLM's initial response.
  • memory/: This module handles the conversational context.

    • context_manager.py: Implements a ContextManager class that uses FAISS (Facebook AI Similarity Search) to store and retrieve relevant conversation history. This enables the assistant to "remember" previous interactions within a session.
  • app.py: The main Flask application. This is the entry point for the API. It handles:

    • Receiving user requests (queries).
    • Retrieving and updating conversational context using memory/context_manager.py.
    • Interacting with the LLM module (llm_module/llm_interaction.py).
    • Returning responses to the user.
  • Dockerfile: Defines how to build a Docker image for the application, ensuring consistent and reproducible deployments.

  • .env: A file for storing sensitive information like API keys. Important: This file should never be committed to your version control system (e.g., Git). Add it to your .gitignore file.

  • requirements.txt: Specifies the Python dependencies required to run the project.
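
To make the tool layer concrete, here is a minimal sketch of what tools/weather_tool.py, tools/stock_tool.py, and tools/tool_registry.py might contain. It assumes the documented OpenWeatherMap current-weather endpoint and the Alpha Vantage GLOBAL_QUOTE endpoint; the function names match the modules above, but the bodies are illustrative rather than the repository's exact code.

import os
import requests

# tools/weather_tool.py
def get_weather(location: str) -> str:
    """Fetch the current weather for a location from OpenWeatherMap."""
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={
            "q": location,
            "appid": os.environ["OPENWEATHERMAP_API_KEY"],
            "units": "metric",
        },
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    desc = data["weather"][0]["description"]
    temp = data["main"]["temp"]
    return f"The weather in {location} is {desc} with a temperature of {temp}°C."

# tools/stock_tool.py
def get_stock_price(symbol: str) -> str:
    """Fetch the latest quote for a ticker symbol from Alpha Vantage."""
    resp = requests.get(
        "https://www.alphavantage.co/query",
        params={
            "function": "GLOBAL_QUOTE",
            "symbol": symbol,
            "apikey": os.environ["ALPHA_VANTAGE_API_KEY"],
        },
        timeout=10,
    )
    resp.raise_for_status()
    price = resp.json()["Global Quote"]["05. price"]
    return f"The current price of {symbol} is {price} USD."

# tools/tool_registry.py: a name -> callable mapping, so adding a
# new tool is a one-line change here.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "get_stock_price": get_stock_price,
}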
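
A minimal version of memory/context_manager.py could look like the following. It assumes OpenAI embeddings (text-embedding-ada-002, 1536 dimensions) and one flat L2 FAISS index per user; the repository's implementation may differ in both choices.

import numpy as np
import faiss
from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment
EMBED_DIM = 1536    # dimension of text-embedding-ada-002 vectors

def embed(text: str) -> np.ndarray:
    """Embed a piece of text as a float32 row vector for FAISS."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding, dtype="float32").reshape(1, -1)

class ContextManager:
    """Stores past turns per user and retrieves the most similar ones."""

    def __init__(self):
        self.indexes = {}   # user_id -> faiss.IndexFlatL2
        self.turns = {}     # user_id -> list of stored turn strings

    def add(self, user_id: str, text: str) -> None:
        if user_id not in self.indexes:
            self.indexes[user_id] = faiss.IndexFlatL2(EMBED_DIM)
            self.turns[user_id] = []
        self.indexes[user_id].add(embed(text))
        self.turns[user_id].append(text)

    def retrieve(self, user_id: str, query: str, k: int = 3) -> list:
        index = self.indexes.get(user_id)
        if index is None or index.ntotal == 0:
            return []
        _, ids = index.search(embed(query), min(k, index.ntotal))
        return [self.turns[user_id][i] for i in ids[0]]

Keeping one index per user_id is what isolates conversation contexts between different users.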
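
Finally, here is a rough sketch of how llm_module/llm_interaction.py and app.py fit together, building on the two sketches above. The keyword-based routing and the answer_query and extract_last_word helpers are hypothetical simplifications of the project's actual prompt-engineering logic.

from flask import Flask, jsonify, request
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer general-knowledge questions "
    "directly. When a tool result is provided, base your answer on it."
)

def extract_last_word(query: str) -> str:
    """Crude argument extraction for this sketch: take the last word."""
    return query.rstrip("?!. ").split()[-1]

def answer_query(query: str, history: list) -> str:
    """Route to a tool on keyword match, then ask the LLM for the final answer."""
    tool_result = ""
    lowered = query.lower()
    if "weather" in lowered:
        tool_result = get_weather(extract_last_word(query))       # tools sketch above
    elif "stock" in lowered or "price" in lowered:
        tool_result = get_stock_price(extract_last_word(query))   # tools sketch above
    context = "\n".join(history)
    user_content = f"Context:\n{context}\n\nTool result: {tool_result}\n\nQuestion: {query}"
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
    )
    return completion.choices[0].message.content

app = Flask(__name__)
context_manager = ContextManager()   # memory sketch above

@app.route("/chat", methods=["POST"])
def chat():
    payload = request.get_json(force=True)
    user_id, query = payload.get("user_id"), payload.get("query")
    if not user_id or not query:
        return jsonify({"error": "user_id and query are required"}), 400
    history = context_manager.retrieve(user_id, query)
    answer = answer_query(query, history)
    context_manager.add(user_id, f"Q: {query}\nA: {answer}")
    return jsonify({"response": answer})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)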

Prerequisites

Before running the application, you'll need the following:

  1. API Keys:

    • OpenAI API Key: Create an account and obtain an API key from https://platform.openai.com/. This is necessary for interacting with the LLM.
    • OpenWeatherMap API Key: Sign up for a free account and get an API key from https://openweathermap.org/. This is used for fetching weather data.
    • Alpha Vantage API Key: Obtain a free API key from https://www.alphavantage.co/. This is required for retrieving stock price information.
  2. Docker (Recommended): Install Docker Desktop or Docker Engine. Docker simplifies deployment and ensures that the application runs consistently across different environments. You can download it from https://www.docker.com/products/docker-desktop/.

Setup and Running (Using Docker - Preferred Method)

Docker provides the easiest and most reliable way to run the application.

  1. Clone the Repository:

    git clone https://github.com/hmishra2250/zingly-bot.git 
    cd zingly-bot 
  2. Create the .env File: Create a file named .env in the root directory of the project. Add your API keys to this file in the following format:

    OPENAI_API_KEY=your_actual_openai_api_key
    OPENWEATHERMAP_API_KEY=your_actual_openweathermap_api_key
    ALPHA_VANTAGE_API_KEY=your_actual_alpha_vantage_api_key
    

    Again, ensure this file is not committed to version control.

  3. Build the Docker Image: From the project's root directory, run the following command:

    docker build -t zingly-bot .

    This command builds a Docker image named zingly-bot based on the instructions in the Dockerfile. The -t flag tags the image with a name.

  4. Run the Docker Container: Once the image is built, run the container with:

    docker run -p 8080:8080 zingly-bot

    This command starts the container and maps port 8080 on your host machine to port 8080 inside the container. The -p flag publishes the port. The application will now be accessible at http://localhost:8080.
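
    Note: if the Dockerfile does not copy .env into the image, the keys can instead be supplied at container start with Docker's standard --env-file flag: docker run --env-file .env -p 8080:8080 zingly-bot.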

Setup and Running (Without Docker)

If you prefer not to use Docker, follow these steps:

  1. Clone the Repository: (Same as step 1 in the Docker instructions).

  2. Create a Virtual Environment: It's highly recommended to use a virtual environment to isolate project dependencies.

    python3 -m venv .venv         # Create a virtual environment named .venv
    source .venv/bin/activate    # Activate the environment (Linux/macOS)
    .venv\Scripts\activate        # Activate the environment (Windows)
  3. Install Dependencies: Install the required Python packages using pip:

    pip install -r requirements.txt
  4. Create the .env File: (Same as step 2 in the Docker instructions).

  5. Run the Flask Application: Start the Flask development server:

    export FLASK_APP=app.py     # Set the FLASK_APP environment variable
    flask run --host=0.0.0.0 --port=8080  # Run the server, making it accessible from other devices on your network

    The --host=0.0.0.0 option is important; it tells Flask to listen on all public IPs, not just localhost.
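
For reference, the keys in the .env file are typically loaded at startup with python-dotenv (assuming the project uses it; if so, it will be listed in requirements.txt):

import os
from dotenv import load_dotenv

load_dotenv()   # read key=value pairs from .env into the process environment
openai_key = os.environ["OPENAI_API_KEY"]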

Usage (Interacting with the API)

The Q&A assistant is accessed via a simple REST API. You can interact with it using tools like curl, Postman, or any programming language that can make HTTP requests.

The API endpoint is /chat. It accepts POST requests with a JSON payload. The JSON payload should have the following structure:

{
    "user_id": "a_unique_user_identifier",
    "query": "The user's question or statement"
}
  • user_id: A string that uniquely identifies the user. This is used to maintain separate conversation contexts for different users. You can use any string (e.g., "user123", "session-abc", etc.).
  • query: The user's input text.

Example using curl:

  • First interaction (getting weather):

    curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user123", "query": "What is the weather in London?"}' http://localhost:8080/chat
  • Follow-up question (leveraging context):

    curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user123", "query": "What about New York?"}' http://localhost:8080/chat

    Because we used the same user_id, the system remembers the previous question about London and infers that "New York" is also a weather query.

  • Asking about stock prices:

    curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user456", "query": "What is the stock price of AAPL?"}' http://localhost:8080/chat
  • Another stock query (building on context):

    curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user456", "query": "and GOOG?"}' http://localhost:8080/chat

The API will return a JSON response with a single field, response, containing the assistant's answer:

{
    "response": "The weather in London is overcast clouds with a temperature of 10.5°C."
}
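
For programmatic access, the same calls can be made from Python with the requests library (the URL assumes a local run on port 8080):

import requests

BASE_URL = "http://localhost:8080"

def chat(user_id: str, query: str) -> str:
    """Send one turn to the /chat endpoint and return the assistant's answer."""
    resp = requests.post(
        f"{BASE_URL}/chat",
        json={"user_id": user_id, "query": query},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(chat("user123", "What is the weather in London?"))
print(chat("user123", "What about New York?"))   # same user_id, so context carries over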

Areas for Improvement and Future Work

  • More Robust Tool Selection: The current implementation relies on simple keyword matching and basic prompt engineering to determine when to use a tool. This can be significantly improved using:

    • OpenAI Function Calling: A more structured way to define tools and have the LLM reliably call them. This is the recommended approach for production systems (see the first sketch after this list).
    • Few-Shot Examples: Include examples of queries and the corresponding tool calls within the system prompt provided to the LLM.
    • Fine-Tuning: Fine-tune the LLM on a dataset specifically designed for this task (queries paired with correct tool calls).
  • Enhanced Error Handling: The current error handling is basic. It should be expanded to:

    • Handle API rate limits gracefully (e.g., using retries with exponential backoff; see the second sketch after this list).
    • Provide more informative error messages to the user.
    • Implement fallback mechanisms (e.g., if one API is unavailable, try another).
  • Security Considerations: For a production environment, implement security best practices, including:

    • Input Validation: Sanitize all user inputs to prevent injection attacks.
    • Output Sanitization: Ensure that the assistant's responses do not contain sensitive information or harmful content.
    • Authentication and Authorization: Implement mechanisms to control access to the API.
    • Rate Limiting: Protect against denial-of-service attacks.
  • Scalability: To handle increased load, consider the following:

    • Caching: Cache LLM responses and API results to reduce latency and API usage (see the last sketch after this list).
    • Asynchronous Processing: Use a task queue (e.g., Celery or RQ) to handle API calls and LLM interactions asynchronously, improving responsiveness.
    • Load Balancing: Distribute traffic across multiple instances of the application using a load balancer.
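
For illustration, here is roughly what the function-calling approach looks like with the current OpenAI Python SDK. The schema below is a hypothetical definition for the existing get_weather tool, not code from this repository:

from openai import OpenAI

client = OpenAI()

# Hypothetical function-calling schema for the get_weather tool.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name, e.g. London"},
            },
            "required": ["location"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is the weather in London?"}],
    tools=tools,
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:   # the model decided a tool is needed; dispatch via the registry
    call = tool_calls[0]
    print(call.function.name, call.function.arguments)   # e.g. get_weather {"location": "London"}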
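
A retry helper with exponential backoff, as suggested above, can be as small as this (a generic sketch, not tied to any particular API client):

import random
import time

def with_backoff(fn, attempts: int = 5, base_delay: float = 0.5):
    """Call fn, retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise   # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Usage: result = with_backoff(lambda: get_weather("London"))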
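
And a simple time-based cache for tool results might look like the following; a production deployment would more likely reach for Redis or a purpose-built caching library:

import time

class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl
        self.store = {}   # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[0] > time.time():
            return entry[1]
        self.store.pop(key, None)   # drop expired or missing entries
        return None

    def set(self, key, value):
        self.store[key] = (time.time() + self.ttl, value)

# Usage: check the cache before hitting an external API.
weather_cache = TTLCache(ttl=60)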
