A lightweight, preemptive rate limiter and concurrency manager for OpenAI's API
- 🎯 Preemptive Token Estimation: Estimates token usage before each call so limits are respected ahead of time
- 🔄 Smart Rate Limiting: Paces requests and tokens per minute to stay under API limits (sketched below)
- ⚡ Concurrent Request Handling: Parallel processing bounded by a semaphore
- 💰 Built-in Cost Tracking: Real-time cost estimation for budget management
- 🎚️ Fine-tuned Control: Adjustable concurrency, rate, and pricing parameters
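Under the hood, a limiter like this typically pairs a concurrency semaphore with per-minute pacing. Below is a minimal sketch of that pattern; it is illustrative only, not this library's actual implementation, and every name in it is invented:

```python
import asyncio
import time

class PacedLimiter:
    """Concurrency semaphore plus request pacing (conceptual sketch only)."""

    def __init__(self, max_concurrent: int, requests_per_minute: int):
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._interval = 60.0 / requests_per_minute  # seconds between request slots
        self._next_slot = time.monotonic()
        self._lock = asyncio.Lock()

    async def __aenter__(self) -> "PacedLimiter":
        await self._semaphore.acquire()  # cap in-flight requests
        async with self._lock:  # hand out start times one at a time
            now = time.monotonic()
            wait = self._next_slot - now
            self._next_slot = max(now, self._next_slot) + self._interval
        if wait > 0:
            await asyncio.sleep(wait)  # wait for our slot
        return self

    async def __aexit__(self, *exc) -> None:
        self._semaphore.release()
```

Tokens per minute can be budgeted the same way, with estimated token counts in place of request slots.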
```bash
pip install concurrent-openai
```
- Set up your environment:

```bash
echo "OPENAI_API_KEY=your_api_key" >> .env
# OR
export OPENAI_API_KEY=your_api_key
```
Note: You can also pass the `api_key` directly to the `ConcurrentOpenAI` client.
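If you use a `.env` file and your application doesn't already load it, python-dotenv is one common way to do so (an assumption about your setup, not a requirement of this library):

```python
from dotenv import load_dotenv

load_dotenv()  # copies OPENAI_API_KEY from .env into the process environment
```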
- Start making requests:

```python
from concurrent_openai import ConcurrentOpenAI

client = ConcurrentOpenAI(
    api_key="your-api-key",  # not required if the OPENAI_API_KEY env var is set
    max_concurrent_requests=5,
    requests_per_minute=200,
    tokens_per_minute=40000,
)

response = await client.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="gpt-4o",
    temperature=0.7,
)
print(response.content)
```
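The `create` call is awaited, so it has to run inside an event loop; from a plain script, a minimal wrapper looks like this (`main` is just illustrative scaffolding):

```python
import asyncio

from concurrent_openai import ConcurrentOpenAI

async def main() -> None:
    client = ConcurrentOpenAI(max_concurrent_requests=5)  # api_key read from the env
    response = await client.create(
        messages=[{"role": "user", "content": "Hello!"}],
        model="gpt-4o",
    )
    print(response.content)

asyncio.run(main())
```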
You can also wrap an existing `AsyncOpenAI` client:

```python
from openai import AsyncOpenAI

from concurrent_openai import ConcurrentOpenAI

openai_client = AsyncOpenAI(api_key="your-api-key")
client = ConcurrentOpenAI(
    client=openai_client,
    max_concurrent_requests=5,
    requests_per_minute=200,
    tokens_per_minute=40000,
)
```
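Bringing your own client is useful when you need non-default transport settings. The options below are standard `openai` SDK parameters, independent of this library:

```python
from openai import AsyncOpenAI

openai_client = AsyncOpenAI(
    api_key="your-api-key",
    timeout=30.0,   # per-request timeout, in seconds
    max_retries=2,  # transient-error retries handled by the OpenAI SDK itself
)
```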
The library also integrates with Azure OpenAI; pass an `AsyncAzureOpenAI` client the same way:
```python
from openai import AsyncAzureOpenAI

from concurrent_openai import ConcurrentOpenAI

azure_client = AsyncAzureOpenAI(
    azure_endpoint="your-azure-endpoint",
    api_key="your-azure-api-key",
    api_version="2024-02-01",
)

client = ConcurrentOpenAI(
    client=azure_client,
    max_concurrent_requests=5,
    requests_per_minute=60,
    tokens_per_minute=10000,
)

response = await client.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="gpt-35-turbo",  # use your Azure deployment name here
    temperature=0.7,
)
```
To process many prompts concurrently in one call, use `create_many`:

```python
from concurrent_openai import ConcurrentOpenAI

messages_list = [
    [{"role": "user", "content": f"Process item {i}"}]
    for i in range(10)
]

client = ConcurrentOpenAI(api_key="your-api-key")
responses = await client.create_many(
    messages_list=messages_list,
    model="gpt-4o",
    temperature=0.7,
)

for resp in responses:
    if resp.is_success:
        print(resp.content)
```
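Because each result carries an `is_success` flag, failures can be inspected after the whole batch finishes rather than interrupting it. A simple tally, using only the attributes shown above:

```python
succeeded = [resp.content for resp in responses if resp.is_success]
failed = sum(1 for resp in responses if not resp.is_success)
print(f"{len(succeeded)} succeeded, {failed} failed")
```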
To enable cost tracking, pass your model's per-token prices:

```python
from concurrent_openai import ConcurrentOpenAI

client = ConcurrentOpenAI(
    api_key="your-api-key",
    input_token_cost=2.5 / 1_000_000,  # see https://openai.com/api/pricing/ for the latest prices
    output_token_cost=10 / 1_000_000,
)
```
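For reference, these prices imply a simple linear cost per request. A manual back-of-the-envelope check (the token counts are made-up examples; the library tracks this for you):

```python
input_token_cost = 2.5 / 1_000_000  # dollars per input token
output_token_cost = 10 / 1_000_000  # dollars per output token

prompt_tokens, completion_tokens = 1_200, 350  # hypothetical usage
cost = prompt_tokens * input_token_cost + completion_tokens * output_token_cost
print(f"${cost:.6f}")  # -> $0.006500
```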
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.