LiteLLM Cache is a proxy server designed to cache your LLM requests, helping to reduce costs and improve efficiency.
Requirements:

- Docker Compose
- Docker
- Configure Settings:
  - Navigate to `./config.yaml` and update the configuration as per your requirements. For more information, visit the LiteLLM Documentation. (An example config sketch follows this list.)
- Prepare Environment Variables:
  - Create a `.env` file from the `.env.sample` file. Adjust the details in `.env` to match your `config.yaml` settings. (A sample `.env` sketch also follows this list.)
- Start the Docker Container:

  ```sh
  docker-compose up -d
  ```
- Update Your LLM Server URL:
  - Change the LLM calling server URL in your application to `http://0.0.0.0:4000`. For example, using the OpenAI Python SDK:

    ```python
    from openai import OpenAI

    llm = OpenAI(base_url='http://0.0.0.0:4000')
    ```

    (A fuller request example follows this list.)
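As a reference for the configuration step, here is a minimal sketch of what a caching-enabled LiteLLM `config.yaml` can look like. The model name, API key reference, and Redis settings are placeholders, and the fields in this repository's `./config.yaml` may differ; the LiteLLM Documentation is the authoritative source for the available options.

```yaml
# Sketch only -- adjust model names, keys, and cache backend to your setup.
model_list:
  - model_name: gpt-3.5-turbo             # name exposed to clients through the proxy
    litellm_params:
      model: openai/gpt-3.5-turbo         # underlying provider/model
      api_key: os.environ/OPENAI_API_KEY  # read from the environment (.env)

litellm_settings:
  cache: True                             # enable response caching
  cache_params:
    type: redis                           # assumed Redis backend; others are possible
    host: os.environ/REDIS_HOST
    port: os.environ/REDIS_PORT
```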
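A hypothetical `.env` matching the sketch above might look like the following; the real variable names should mirror whatever your `.env.sample` and `config.yaml` actually reference.

```
# Hypothetical variable names -- copy .env.sample and fill in real values.
OPENAI_API_KEY=sk-your-key-here
REDIS_HOST=redis
REDIS_PORT=6379
```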
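To go one step beyond client construction, here is a sketch of a full request through the proxy using the OpenAI Python SDK. The `api_key` value and the model name are assumptions that depend on how your proxy is configured.

```python
from openai import OpenAI

# Point the client at the LiteLLM Cache proxy instead of the default OpenAI endpoint.
# The api_key depends on your proxy setup (e.g. a LiteLLM master key); this is a placeholder.
llm = OpenAI(base_url="http://0.0.0.0:4000", api_key="anything")

# The model name must match one exposed by the proxy's config.yaml.
response = llm.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is LiteLLM Cache?"}],
)
print(response.choices[0].message.content)

# Sending the identical request again should be answered from the proxy's cache.
```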
With these steps, your LLM requests will be routed through the LiteLLM Cache proxy server, optimizing performance and reducing costs.