✅ Memory Optimized for Free Tier: This project now uses OpenAI API embeddings instead of local sentence-transformers, reducing memory usage from 600MB+ to ~200MB. Fully compatible with Render's 512MB free tier!
Steps:
- Go to render.com and sign up
- Click "New +" → "Web Service"
- Connect your GitHub account and select `FastAPI_RAG_Gateway`; Render will auto-detect `render.yaml`
- Add environment variables in the dashboard:
  - `OPENAI_API_KEY` = your OpenAI key
  - `OPENROUTER_API_KEY` = your OpenRouter key
- Click "Create Web Service"
- Wait 5-10 minutes for deployment

Your API will be live at: https://fastapi-rag-gateway.onrender.com
Note: Free tier sleeps after 15 minutes of inactivity. First request after sleep takes ~30 seconds.
Steps:
- Install the Railway CLI:

  ```bash
  npm i -g @railway/cli
  ```

- Login and initialize:

  ```bash
  railway login
  railway init
  ```

- Set environment variables:

  ```bash
  railway variables set OPENAI_API_KEY="your_key"
  railway variables set OPENROUTER_API_KEY="your_key"
  railway variables set RAG_LLM_MODEL="deepseek/deepseek-chat"
  railway variables set RAG_LLM_BASE_URL="https://openrouter.ai/api/v1"
  ```

- Deploy:

  ```bash
  railway up
  ```

- Get your URL:

  ```bash
  railway domain
  ```
Steps:
- Install the Fly CLI:

  ```bash
  curl -L https://fly.io/install.sh | sh
  ```

- Login:

  ```bash
  fly auth login
  ```

- Launch (from the project directory):

  ```bash
  fly launch
  ```

  - Choose a unique app name
  - Select the region closest to you
  - Don't deploy yet (answer No)

- Set secrets:

  ```bash
  fly secrets set OPENAI_API_KEY="your_key"
  fly secrets set OPENROUTER_API_KEY="your_key"
  ```

- Deploy:

  ```bash
  fly deploy
  ```

Your API will be live at: https://your-app-name.fly.dev
Build and run locally:

```bash
docker build -t fastapi-rag-gateway .
docker run -p 8000:8000 \
  -e OPENAI_API_KEY="your_key" \
  -e OPENROUTER_API_KEY="your_key" \
  -e RAG_LLM_MODEL="deepseek/deepseek-chat" \
  -e RAG_LLM_BASE_URL="https://openrouter.ai/api/v1" \
  fastapi-rag-gateway
```

Deploy to any cloud with Docker support (AWS ECS, Google Cloud Run, Azure Container Instances, DigitalOcean App Platform).
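The `-e` flags only take effect if the app reads them from its environment. A minimal sketch of that wiring, assuming the gateway pulls its settings via `os.environ` (the function name and defaults below are illustrative, mirroring the flags above, not the project's actual code):

```python
import os

def load_settings(env=os.environ) -> dict:
    """Collect the gateway's settings from environment variables.
    Defaults mirror the docker run flags above (assumed values)."""
    return {
        "openai_api_key": env.get("OPENAI_API_KEY"),
        "openrouter_api_key": env.get("OPENROUTER_API_KEY"),
        "model": env.get("RAG_LLM_MODEL", "deepseek/deepseek-chat"),
        "base_url": env.get("RAG_LLM_BASE_URL", "https://openrouter.ai/api/v1"),
    }
```

Passing the environment as a parameter keeps the function easy to test and makes the fallback defaults explicit.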
Once deployed, test with:

```bash
curl -X POST "https://your-deployment-url.com/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is machine learning?"}'
```

Or visit https://your-deployment-url.com/docs for interactive API documentation.
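The same request can be made from Python with only the standard library; a minimal client sketch (`BASE_URL` is a placeholder for your deployment URL, and the JSON response shape is assumed):

```python
import json
import urllib.request

BASE_URL = "https://your-deployment-url.com"  # replace with your deployment URL

def build_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request the /query endpoint expects."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question: str) -> dict:
    """Send the question and decode the JSON response."""
    with urllib.request.urlopen(build_request(question)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```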
| Platform | Free Tier | Paid | Pros | Cons |
|---|---|---|---|---|
| Render | 750hrs/mo | $7/mo | Easy, auto-deploy from GitHub | Sleeps after 15min inactivity |
| Railway | None | $5/mo | Fast, no sleep, great DX | No free tier |
| Fly.io | $5 credit | ~$3/mo | Fast, global edge network | Complex pricing |
| Docker | Self-hosted | Varies | Full control | Requires infrastructure |
- API Keys Security: Never commit `.env` to GitHub. Use platform secret managers.
- ChromaDB Persistence:
  - On first deployment, documents are indexed automatically
  - The vector store is ephemeral on free tiers (it rebuilds on restart)
  - For production, use persistent volumes or an external vector DB
- Cold Starts:
  - Free tiers may have 30-60s cold start times
  - Paid tiers wake instantly
- Memory Requirements:
  - Optimized for 512MB RAM (Render free tier compatible)
  - Uses OpenAI API embeddings instead of local models
  - Memory footprint: ~150-250MB, well within free tier limits
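To sanity-check the footprint claim on a running instance, peak resident memory can be read with the standard-library `resource` module. A sketch (Linux/macOS only; note `ru_maxrss` is kilobytes on Linux but bytes on macOS):

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB."""
    raw = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024  # bytes vs KB
    return raw / divisor

print(f"peak RSS: {peak_rss_mb():.1f} MB")
```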
Deployment fails with "Out of Memory":
- ✅ FIXED: Now uses OpenAI embeddings (API-based, minimal memory)
- Previous issue: sentence-transformers used 300MB+ RAM
- Current solution: <50MB for embeddings, fits in 512MB free tier
"OPENAI_API_KEY not found" error:
- Ensure environment variables are set in platform dashboard
- Check the variable names for typos and remove any stray whitespace around the values
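Both failure modes (missing variable, stray whitespace) can be caught before deploying with a quick check; a sketch, using the variable names from the configuration above:

```python
import os

REQUIRED = ["OPENAI_API_KEY", "OPENROUTER_API_KEY"]

def check_env(env: dict) -> list:
    """Return a list of problems; an empty list means the keys look fine."""
    problems = []
    for name in REQUIRED:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is missing")
        elif value != value.strip():
            problems.append(f"{name} has leading/trailing whitespace")
    return problems

if __name__ == "__main__":
    for line in check_env(dict(os.environ)) or ["all required variables look OK"]:
        print(line)
```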
Slow first query:
- Normal on free tiers (cold start)
- Consider paid tier or keep-alive pings
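A keep-alive pinger can be as simple as the sketch below: request a cheap endpoint more often than the 15-minute sleep threshold. `BASE_URL` and the choice of `/docs` as the ping target are assumptions; adjust to your deployment.

```python
import time
import urllib.request

BASE_URL = "https://your-deployment-url.com"  # replace with your deployment URL
PING_INTERVAL = 600  # seconds; comfortably under the 15-minute sleep threshold

def ping_url(base_url: str = BASE_URL) -> str:
    """/docs is cheap to serve and confirms the app is awake."""
    return base_url.rstrip("/") + "/docs"

def keep_alive(iterations: int) -> None:
    """Ping the service `iterations` times, spaced PING_INTERVAL apart."""
    for _ in range(iterations):
        try:
            urllib.request.urlopen(ping_url(), timeout=30)
        except OSError:
            pass  # the service may be cold-starting; try again next round
        time.sleep(PING_INTERVAL)
```

Run it from any always-on machine, or schedule a single ping with cron every 10 minutes instead.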
- Add authentication (API keys, OAuth)
- Implement rate limiting
- Add monitoring (Sentry, LogTail)
- Set up CI/CD with GitHub Actions
- Add more documents to the `/data` folder
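For the rate-limiting item, the core mechanism is easy to sketch: an in-memory token bucket per client. This illustrates the idea only; it is not part of the project, and for FastAPI you would more likely reach for an existing middleware library such as slowapi.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling `rate` tokens/second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill based on elapsed time, capped at capacity, then spend a token.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a gateway you would keep one bucket per API key or client IP and reject requests with HTTP 429 when `allow()` returns False.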