Project demo: https://zingly-bot-891407508879.asia-southeast1.run.app
Enter a `user_id` and start chatting. Note that the UI currently supports only the send button; pressing the Enter key does not submit a message.
This project demonstrates an AI-powered Q&A assistant capable of answering user queries by leveraging both a pre-trained Large Language Model (LLM) and external APIs (referred to as "tools"). This README provides a comprehensive guide to setting up, running, and understanding the project.
The assistant is designed to handle two types of queries:
- General Knowledge Queries: Answered directly by the LLM (OpenAI's GPT-3.5-turbo in this implementation).
- Real-time/Specialized Data Queries: Answered by calling external APIs (OpenWeatherMap for weather and Alpha Vantage for stock prices).
The system also incorporates a memory mechanism using FAISS, allowing it to maintain context across multiple turns of a conversation.
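For intuition, a FAISS-backed context manager along these lines might look like the sketch below. This is illustrative only: the real implementation lives in `memory/context_manager.py`, and the embedding backend (`sentence-transformers`) and method names here are assumptions, not the project's actual code.

```python
# Illustrative sketch of a FAISS-backed memory -- the real code is in memory/context_manager.py.
import faiss
from sentence_transformers import SentenceTransformer  # assumed embedding backend


class ContextManager:
    def __init__(self, dim: int = 384):
        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
        self.index = faiss.IndexFlatL2(dim)  # exact L2 nearest-neighbour index
        self.texts: list[str] = []           # raw turns, parallel to index rows

    def add_turn(self, text: str) -> None:
        """Embed one conversation turn and store it for later retrieval."""
        vec = self.encoder.encode([text]).astype("float32")
        self.index.add(vec)
        self.texts.append(text)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        """Return the k stored turns most similar to the query."""
        if self.index.ntotal == 0:
            return []
        vec = self.encoder.encode([query]).astype("float32")
        _, idx = self.index.search(vec, min(k, self.index.ntotal))
        return [self.texts[i] for i in idx[0]]
```

Retrieved turns can then be prepended to the prompt so the LLM sees the history relevant to that user's session.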
The codebase is organized into modular components:

- `tools/`: This directory houses the implementations for interacting with external APIs.
  - `weather_tool.py`: Contains the `get_weather` function, which fetches weather data from OpenWeatherMap based on a provided location.
  - `stock_tool.py`: Contains the `get_stock_price` function, which retrieves the current stock price from Alpha Vantage for a given ticker symbol.
  - `tool_registry.py`: A centralized registry that manages access to the available tools. This makes adding new tools easier (see the sketch after this list).
- `llm_module/`: This module is responsible for all interactions with the LLM.
  - `llm_interaction.py`: The core logic resides here. It handles:
    - Prompt engineering to guide the LLM's responses.
    - Making calls to the OpenAI API.
    - Determining when to use a tool based on the user's query and the LLM's initial response.
- `memory/`: This module handles the conversational context.
  - `context_manager.py`: Implements a `ContextManager` class that uses FAISS (Facebook AI Similarity Search) to store and retrieve relevant conversation history. This enables the assistant to "remember" previous interactions within a session (a rough sketch appears above).
- `app.py`: The main Flask application and the entry point for the API. It handles:
  - Receiving user requests (queries).
  - Retrieving and updating conversational context using `memory/context_manager.py`.
  - Interacting with the LLM module (`llm_module/llm_interaction.py`).
  - Returning responses to the user.
- `Dockerfile`: Defines how to build a Docker image for the application, ensuring consistent and reproducible deployments.
- `.env`: A file for storing sensitive information like API keys. Important: this file should never be committed to your version control system (e.g., Git); add it to your `.gitignore`.
- `requirements.txt`: Specifies the Python dependencies required to run the project.
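As a minimal sketch of the registry pattern described above (the `get_weather` and `get_stock_price` names come from the tool modules; the registry structure itself is an assumption, since the project's actual `tool_registry.py` may differ):

```python
# Minimal tool-registry sketch -- the project's actual tool_registry.py may differ.
from tools.weather_tool import get_weather
from tools.stock_tool import get_stock_price

# Map tool names to callables; adding a new tool is one new entry here.
TOOL_REGISTRY = {
    "get_weather": get_weather,
    "get_stock_price": get_stock_price,
}


def call_tool(name: str, **kwargs):
    """Look up a tool by name and invoke it with the given arguments."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        raise KeyError(f"Unknown tool: {name}")
    return tool(**kwargs)


# Example: call_tool("get_weather", location="London")
```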
Before running the application, you'll need the following:
- API Keys:
  - OpenAI API Key: Create an account and obtain an API key from https://platform.openai.com/. This is necessary for interacting with the LLM.
  - OpenWeatherMap API Key: Sign up for a free account and get an API key from https://openweathermap.org/. This is used for fetching weather data.
  - Alpha Vantage API Key: Obtain a free API key from https://www.alphavantage.co/. This is required for retrieving stock price information.
- Docker (Recommended): Install Docker Desktop or Docker Engine. Docker simplifies deployment and ensures that the application runs consistently across different environments. You can download it from https://www.docker.com/products/docker-desktop/.
Docker provides the easiest and most reliable way to run the application.
1. Clone the Repository:

   ```bash
   git clone https://github.com/hmishra2250/zingly-bot.git
   cd zingly-bot
   ```

2. Create the `.env` File: Create a file named `.env` in the root directory of the project and add your API keys in the following format:

   ```
   OPENAI_API_KEY=your_actual_openai_api_key
   OPENWEATHERMAP_API_KEY=your_actual_openweathermap_api_key
   ALPHA_VANTAGE_API_KEY=your_actual_alpha_vantage_api_key
   ```

   Again, ensure this file is not committed to version control.

3. Build the Docker Image: From the project's root directory, run:

   ```bash
   docker build -t zingly-bot .
   ```

   This builds a Docker image named `zingly-bot` based on the instructions in the `Dockerfile`; the `-t` flag tags the image with a name.

4. Run the Docker Container: Once the image is built, run:

   ```bash
   docker run -p 8080:8080 zingly-bot
   ```

   This starts the container and maps port 8080 on your host machine to port 8080 inside the container (the `-p` flag publishes the port). The application will now be accessible at `http://localhost:8080`.
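If your `Dockerfile` does not copy `.env` into the image, you can supply the keys at container start time instead with `docker run --env-file .env -p 8080:8080 zingly-bot`.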
If you prefer not to use Docker, follow these steps:
1. Clone the Repository: (Same as step 1 in the Docker instructions.)

2. Create a Virtual Environment: It's highly recommended to use a virtual environment to isolate project dependencies.

   ```bash
   python3 -m venv .venv        # Create a virtual environment named .venv
   source .venv/bin/activate    # Activate the environment (Linux/macOS)
   .venv\Scripts\activate       # Activate the environment (Windows)
   ```

3. Install Dependencies: Install the required Python packages using `pip`:

   ```bash
   pip install -r requirements.txt
   ```

4. Create the `.env` File: (Same as step 2 in the Docker instructions.)

5. Run the Flask Application: Start the Flask development server:

   ```bash
   export FLASK_APP=app.py                # Set the FLASK_APP environment variable
   flask run --host=0.0.0.0 --port=8080   # Run the server, reachable from other devices on your network
   ```

   The `--host=0.0.0.0` option is important; it tells Flask to listen on all network interfaces, not just localhost.
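Note that `flask run` starts Flask's built-in development server, which is fine for local testing but not intended for production traffic; prefer the Docker route above for deployment.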
The Q&A assistant is accessed via a simple REST API. You can interact with it using tools like `curl`, Postman, or any programming language that can make HTTP requests.

The API endpoint is `/chat`. It accepts `POST` requests with a JSON payload of the following structure:

```json
{
    "user_id": "a_unique_user_identifier",
    "query": "The user's question or statement"
}
```

- `user_id`: A string that uniquely identifies the user. This is used to maintain separate conversation contexts for different users. You can use any string (e.g., "user123", "session-abc").
- `query`: The user's input text.
Example requests using `curl`:

1. First interaction (getting weather):

   ```bash
   curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user123", "query": "What is the weather in London?"}' http://localhost:8080/chat
   ```

2. Follow-up question (leveraging context):

   ```bash
   curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user123", "query": "What about New York?"}' http://localhost:8080/chat
   ```

   Because we used the same `user_id`, the system remembers the previous question about London and understands that "New York" also refers to the weather.

3. Asking about stock prices:

   ```bash
   curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user456", "query": "What is the stock price of AAPL?"}' http://localhost:8080/chat
   ```

4. Another stock query, building on context:

   ```bash
   curl -X POST -H "Content-Type: application/json" -d '{"user_id": "user456", "query": "and GOOG?"}' http://localhost:8080/chat
   ```
The API returns a JSON response with a single field, `response`, containing the assistant's answer:

```json
{
    "response": "The weather in London is overcast clouds with a temperature of 10.5°C."
}
```
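Equivalently, here is a small Python client sketch using the `requests` library (the base URL assumes a local run on port 8080):

```python
# Minimal example client for the /chat endpoint, assuming the app runs locally.
import requests

BASE_URL = "http://localhost:8080"


def chat(user_id: str, query: str) -> str:
    """Send one query to the assistant and return its answer."""
    payload = {"user_id": user_id, "query": query}
    resp = requests.post(f"{BASE_URL}/chat", json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]


print(chat("user123", "What is the weather in London?"))
print(chat("user123", "What about New York?"))  # same user_id, so context carries over
```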
- More Robust Tool Selection: The current implementation relies on simple keyword matching and basic prompt engineering to determine when to use a tool. This could be significantly improved using:
  - OpenAI Function Calling: A more structured way to define tools and have the LLM reliably call them. This is the recommended approach for production systems (see the sketch after this list).
  - Few-Shot Examples: Include examples of queries and the corresponding tool calls within the system prompt provided to the LLM.
  - Fine-Tuning: Fine-tune the LLM on a dataset specifically designed for this task (queries paired with correct tool calls).
- Enhanced Error Handling: The current error handling is basic. It should be expanded to:
  - Handle API rate limits gracefully (e.g., using retries with exponential backoff).
  - Provide more informative error messages to the user.
  - Implement fallback mechanisms (e.g., if one API is unavailable, try another).
- Security Considerations: For a production environment, implement security best practices, including:
  - Input Validation: Sanitize all user inputs to prevent injection attacks.
  - Output Sanitization: Ensure that the assistant's responses do not contain sensitive information or harmful content.
  - Authentication and Authorization: Implement mechanisms to control access to the API.
  - Rate Limiting: Protect against denial-of-service attacks.
- Scalability: To handle increased load, consider the following:
  - Caching: Cache LLM responses and API results to reduce latency and API usage.
  - Asynchronous Processing: Use a task queue (e.g., Celery or RQ) to handle API calls and LLM interactions asynchronously, improving responsiveness.
  - Load Balancing: Distribute traffic across multiple instances of the application using a load balancer.
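To make the tool-selection suggestion concrete, here is a rough sketch of OpenAI function calling for the weather tool, using the current `openai` Python client. The tool schema below is an assumption for illustration, not the project's existing code:

```python
# Sketch of OpenAI function calling for tool selection -- not the project's current code.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Declare the weather tool so the model can emit a structured call to it.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is the weather in London?"}],
    tools=tools,
)

# If the model chose a tool, it returns a structured call instead of free text,
# so no keyword matching is needed to decide when to invoke the tool.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    call = tool_calls[0]
    args = json.loads(call.function.arguments)
    print(call.function.name, args)  # e.g. get_weather {'location': 'London'}
```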