Secret AI Caddy - Advanced API Gateway

A sophisticated Caddy middleware that provides secure API key authentication, intelligent token usage metering, x402 prepaid payment protocol, and comprehensive metrics collection for AI/ML API gateways. The middleware validates API keys against multiple sources while tracking detailed usage statistics and reporting to blockchain-based smart contracts.

⭐ What's New in v2.0

Accurate Token Counting - We've completely reimplemented token counting to fix the 2-2.5x inflation issue:

✅ Model-Specific Tokenizers - Uses HuggingFace and SentencePiece tokenizers for accurate counting
✅ 90-95% Accuracy - Matches actual model token usage for billing-grade precision
✅ Automatic Model Detection - Detects model from request JSON and applies correct tokenizer
✅ Multi-Model Support - Llama 2/3/3.3, Mistral, Mixtral, Falcon, BERT, and more
✅ Pure Go Implementation - No CGO or external dependencies required
✅ Smart Fallback - Gracefully handles unknown models with conservative estimation

Before: (chars/4 + words×1.33)/2 inflated counts by 2-2.5x
After: Real tokenization matching actual AI model usage
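For reference, the v1.x heuristic quoted above can be written out as a small Go function. This is only a sketch of the formula as stated: counting characters as Unicode code points and words as whitespace-separated fields are assumptions, and `legacyEstimate` is a hypothetical name rather than a function from this repository.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// legacyEstimate evaluates the v1.x heuristic (chars/4 + words*1.33) / 2.
// How the original code counted characters and words is not documented here;
// runes and whitespace-separated fields are assumptions of this sketch.
func legacyEstimate(text string) float64 {
	chars := float64(utf8.RuneCountInString(text))
	words := float64(len(strings.Fields(text)))
	return (chars/4 + words*1.33) / 2
}

func main() {
	// "Hello, world!" has 13 characters and 2 words.
	fmt.Printf("%.3f\n", legacyEstimate("Hello, world!"))
}
```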

See detailed changes in METERING.md

🎯 Project Purpose

This middleware implements a comprehensive API gateway solution designed for high-security AI/ML environments requiring:

Multi-tiered Authentication - Master keys, file-based keys, and Secret Network smart contracts
x402 Payment Protocol - Portal-based prepaid billing for AI agents with DevPortal balance checks, 402 payment challenges, and async usage reporting
Accurate Token Metering - Model-specific token counting for precise billing and usage tracking
Comprehensive Metrics - Performance monitoring, usage analytics, and operational insights
Blockchain Integration - Decentralized usage reporting via Secret Network smart contracts
Production-Ready Security - Encrypted communication, secure caching, and audit logging

📚 Documentation

📐 Architecture - Complete system architecture and component design
💳 x402 Payment Protocol - Portal-based balance checking, 402 challenges, and usage reporting
⚖️ Metering & Metrics - Token counting, usage tracking, and metrics collection

🏗️ Architecture Overview

graph TB
    subgraph "Client Layer"
        C[AI/ML Clients<br/>with API Keys]
        A[AI Agents<br/>with Bearer Tokens]
    end

    subgraph "Caddy Gateway"
        subgraph "Middleware Pipeline"
            ROUTE{Master Key?}
            AUTH[API Key Authentication]
            X402[x402 Portal Path]
            METER[Token Metering]
            METRICS[Metrics Collection]
            PROXY[Reverse Proxy]
        end
    end

    subgraph "Authentication Sources"
        MK[Master Keys]
        MKF[Master Keys File]
        CACHE[Cached Results]
        SC[Secret Network<br/>Smart Contract]
    end

    subgraph "x402 Components"
        PORTAL[Portal Client]
        CHALLENGE[402 Challenge Builder]
        REPORT[Async Usage Reporter]
    end

    subgraph "DevPortal"
        BAL[Balance API]
        USAGE[Usage Reporting API]
        PAY[Payment Processing]
    end

    subgraph "AI/ML Services"
        AI1[OpenAI API]
        AI2[Ollama]
        AI3[TorchServe]
        AI4[Custom ML APIs]
    end

    subgraph "Reporting & Analytics"
        BLOCKCHAIN[Secret Network<br/>Usage Reporting]
        METRICS_API[Metrics Endpoint<br/>/metrics]
    end

    C -->|HTTP + API Key| ROUTE
    A -->|HTTP + Bearer Token| ROUTE
    ROUTE -->|Yes| AUTH
    ROUTE -->|No, x402 enabled| X402
    AUTH --> MK
    AUTH --> MKF
    AUTH --> CACHE
    AUTH -->|Cache Miss| SC
    AUTH -->|Authorized| METER
    X402 --> PORTAL
    PORTAL -->|Check Balance| BAL
    PORTAL -->|Insufficient| CHALLENGE
    PORTAL -->|Sufficient| METER
    METER -->|Count Tokens| METRICS
    METER --> PROXY
    PROXY --> AI1
    PROXY --> AI2
    PROXY --> AI3
    PROXY --> AI4
    REPORT -->|Token Counts| USAGE
    A -->|Top Up| PAY

    METRICS -->|Usage Data| BLOCKCHAIN
    METRICS --> METRICS_API

🤖 Supported AI Models (v2.0)

The gateway provides accurate token counting for these models using industry-standard tokenizers:

Fully Supported Models (90-95% Accuracy)

Model Family	Variants	Tokenizer
Llama	Llama 2 (7B, 13B, 70B); Llama 3 (8B, 70B); Llama 3.3 (70B)	HuggingFace
Mistral	Mistral 7B v0.1/v0.2; Mixtral 8x7B	HuggingFace
Falcon	Falcon 7B, 40B, 180B	HuggingFace
BERT	BERT base, large	HuggingFace

Model Detection

The system automatically detects the model from the model field in your JSON request:

{
  "model": "llama3.3:70b",
  "prompt": "Your prompt here"
}

Here the model value llama3.3:70b is detected and the Llama-3.3 tokenizer is applied.
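Reading the model field from a request body might look like the following in Go. It is a minimal sketch using only the standard library; `detectModel` is a hypothetical helper and the middleware's actual parsing code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detectModel returns the "model" field of a JSON request body.
// The second result is false when the body is not valid JSON or the
// field is missing or empty, in which case a caller would fall back.
func detectModel(body []byte) (string, bool) {
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil || req.Model == "" {
		return "", false
	}
	return req.Model, true
}

func main() {
	body := []byte(`{"model": "llama3.3:70b", "prompt": "Your prompt here"}`)
	model, ok := detectModel(body)
	fmt.Println(model, ok)
}
```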

Model name variations handled:

llama3.3:70b → llama3.3
mistral-7b-v0.1 → mistral-7b
Case-insensitive matching
Automatic normalization
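The two mappings above are consistent with a normalizer that lowercases the name, drops a ":tag" suffix, and strips a trailing "-vX.Y" version. The Go sketch below implements exactly those three rules; they are inferred from the examples, and `normalizeModel` is a hypothetical name, so treat it as an illustration rather than the gateway's real rule set.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// versionSuffix matches a trailing "-v<digits>[.<digits>...]" such as "-v0.1".
var versionSuffix = regexp.MustCompile(`-v[0-9]+(\.[0-9]+)*$`)

// normalizeModel applies three assumed rules: case folding, removal of a
// ":tag" suffix, and removal of a trailing version suffix.
func normalizeModel(name string) string {
	name = strings.ToLower(strings.TrimSpace(name))
	// Drop a ":tag" suffix, e.g. "llama3.3:70b" becomes "llama3.3".
	if base, _, found := strings.Cut(name, ":"); found {
		name = base
	}
	return versionSuffix.ReplaceAllString(name, "")
}

func main() {
	for _, name := range []string{"llama3.3:70b", "mistral-7b-v0.1", "Mistral-7B"} {
		fmt.Printf("%s -> %s\n", name, normalizeModel(name))
	}
}
```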

Unknown Models

For models not in the supported list, the system uses a conservative chars/4 fallback estimation (60-70% accuracy) - no configuration needed.
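The fallback path can be pictured as a lookup that uses a model-specific counter when one is registered and the chars/4 estimate otherwise. In this Go sketch the registered counter is a whitespace splitter standing in for a real tokenizer, and rounding the estimate up is an assumption; all names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

type tokenCounter func(text string) int

// tokenizers maps a normalized model name to its counter. The whitespace
// counter below is only a stand-in for a real model tokenizer.
var tokenizers = map[string]tokenCounter{
	"llama3.3": func(text string) int { return len(strings.Fields(text)) },
}

// fallbackCount estimates tokens as chars/4, rounded up (rounding is assumed).
func fallbackCount(text string) int {
	return (utf8.RuneCountInString(text) + 3) / 4
}

// countTokens returns the count and whether a model-specific counter was used.
func countTokens(model, text string) (int, bool) {
	if count, ok := tokenizers[model]; ok {
		return count(text), true
	}
	return fallbackCount(text), false
}

func main() {
	n, accurate := countTokens("custom-gpt-x", "Test prompt")
	fmt.Println(n, accurate) // 11 characters -> 3 false
}
```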

Adding Custom Models

To add support for custom HuggingFace models, list them in the preload_models directive:

preload_models llama-2,mistral,your-custom-model

Any model available on HuggingFace with a tokenizer.json file can be used.

✨ Key Features

🔐 Advanced Authentication

Multi-tier validation with configurable precedence and fallback
Secure caching with SHA256 hashing and configurable TTL
Secret Network integration with encrypted blockchain communication
Dynamic key rotation via file-based keys without service restart
Thread-safe operations with optimized read-write mutex usage
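As an illustration of the caching bullets above, the Go sketch below stores validation results under the SHA-256 hash of the key, expires them after a TTL, and guards the map with a read-write mutex. Type and function names are hypothetical; the module's real cache may be organised differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

type cacheEntry struct {
	valid   bool
	expires time.Time
}

// keyCache stores results under the SHA-256 hash of the API key,
// so raw keys are never held in the map.
type keyCache struct {
	mu      sync.RWMutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

func newKeyCache(ttl time.Duration) *keyCache {
	return &keyCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

func hashKey(apiKey string) string {
	sum := sha256.Sum256([]byte(apiKey))
	return hex.EncodeToString(sum[:])
}

// Set records a validation result; writers take the exclusive lock.
func (c *keyCache) Set(apiKey string, valid bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[hashKey(apiKey)] = cacheEntry{valid: valid, expires: time.Now().Add(c.ttl)}
}

// Get returns (valid, found); readers share the lock, and expired
// entries are reported as not found.
func (c *keyCache) Get(apiKey string) (bool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.entries[hashKey(apiKey)]
	if !ok || !time.Now().Before(e.expires) {
		return false, false
	}
	return e.valid, true
}

func main() {
	cache := newKeyCache(5 * time.Minute)
	cache.Set("example-key", true)
	fmt.Println(cache.Get("example-key"))
}
```

A cache miss here would correspond to the "Cache Miss" edge in the architecture diagram, where the smart contract is queried and the result is stored with Set.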

⚖️ Accurate Token Metering (v2.0)

Model-specific tokenization using HuggingFace and SentencePiece libraries
90-95% accuracy matching actual AI model token usage for billing-grade precision
Automatic model detection from JSON request body
Supported models: Llama 2/3/3.3, Mistral, Mixtral, Falcon, BERT, and custom HuggingFace models
Lazy-loading with caching - tokenizers load once and are reused for performance
Request/response tracking with comprehensive body analysis
Usage accumulation per API key and per model with thread-safe operations
Resilient reporting with retry logic and failed report persistence
Smart fallback to conservative estimation for unknown models
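Per-key, per-model accumulation could be sketched in Go as a mutex-guarded map that is flushed at each reporting interval. The type names and the flush-and-reset behaviour are assumptions made for illustration, not a description of the module's internals.

```go
package main

import (
	"fmt"
	"sync"
)

type usage struct {
	InputTokens  int64
	OutputTokens int64
}

type usageKey struct {
	APIKeyHash string
	Model      string
}

type accumulator struct {
	mu     sync.Mutex
	totals map[usageKey]usage
}

func newAccumulator() *accumulator {
	return &accumulator{totals: make(map[usageKey]usage)}
}

// Add records token counts for one request.
func (a *accumulator) Add(keyHash, model string, in, out int64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	k := usageKey{APIKeyHash: keyHash, Model: model}
	u := a.totals[k]
	u.InputTokens += in
	u.OutputTokens += out
	a.totals[k] = u
}

// Flush returns the accumulated totals and resets the accumulator,
// as a periodic reporter might do once per metering interval.
func (a *accumulator) Flush() map[usageKey]usage {
	a.mu.Lock()
	defer a.mu.Unlock()
	out := a.totals
	a.totals = make(map[usageKey]usage)
	return out
}

func main() {
	acc := newAccumulator()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			acc.Add("hash-a", "llama3.3", 10, 5)
		}()
	}
	wg.Wait()
	fmt.Println(acc.Flush()[usageKey{APIKeyHash: "hash-a", Model: "llama3.3"}])
}
```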

📊 Comprehensive Metrics

Real-time monitoring of requests, tokens, performance, and errors
HTTP metrics endpoint at /metrics with detailed JSON output
Cache performance tracking including hit rates and operation times
Token usage analytics with input/output token breakdowns
System health indicators for operational monitoring

💳 x402 Payment Protocol (Portal-Based)

Stateless proxy — Caddy delegates all balance management to DevPortal
Per-request balance check via DevPortal API with service-key authentication
402 Payment Required responses with USDC-denominated challenge payloads and topup URLs
Async usage reporting — token counts sent to DevPortal without blocking the response
Fail-closed — returns 503 when DevPortal is unreachable (no unpaid usage)
Master key bypass — admin/operator keys skip portal balance checks
Configurable threshold — minimum balance required to serve requests (x402_min_balance_usdc)
Portal owns pricing — Caddy reports raw token counts, DevPortal computes cost
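The bullets above describe a small decision table: master keys bypass the portal, an unreachable portal yields 503, and a balance below the threshold yields 402. The Go sketch below encodes that table using exact decimal arithmetic from math/big. The function name, the 6-decimal deficit format, and treating unparsable amounts as a 503 are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"net/http"
)

// errPortalUnreachable stands in for any DevPortal transport error.
var errPortalUnreachable = errors.New("devportal unreachable")

// decide returns the HTTP status the gateway would act on and, for 402,
// the USDC deficit. http.StatusOK means "forward the request upstream".
func decide(isMasterKey bool, balanceUSDC string, portalErr error, minBalanceUSDC string) (int, string) {
	if isMasterKey {
		return http.StatusOK, "" // master keys skip the balance check
	}
	if portalErr != nil {
		return http.StatusServiceUnavailable, "" // fail closed
	}
	balance, okBalance := new(big.Rat).SetString(balanceUSDC)
	minimum, okMinimum := new(big.Rat).SetString(minBalanceUSDC)
	if !okBalance || !okMinimum {
		// Unparsable amounts are also treated as fail-closed (assumption).
		return http.StatusServiceUnavailable, ""
	}
	if balance.Cmp(minimum) < 0 {
		deficit := new(big.Rat).Sub(minimum, balance)
		return http.StatusPaymentRequired, deficit.FloatString(6)
	}
	return http.StatusOK, ""
}

func main() {
	fmt.Println(decide(false, "0.004", nil, "0.01"))
	fmt.Println(decide(false, "", errPortalUnreachable, "0.01"))
	fmt.Println(decide(true, "", errPortalUnreachable, "0.01"))
}
```

Using big.Rat rather than float64 avoids rounding surprises when comparing small decimal amounts such as "0.01".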

See x402_Caddy_Implementation.md for full documentation and x402-portal.md for the end-to-end payment flow design.

🚫 URL Filtering & Security

Pattern-based blocking with configurable URL patterns via environment variables
Early request filtering for performance optimization (before API key validation)
Comprehensive logging of blocked requests with pattern matching details
HTTP 403 Forbidden responses for blocked requests with clear error messages
Metrics integration for tracking blocked request statistics
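A filter of this kind can be sketched in Go as a parser for the comma-separated BLOCK_URLS value plus a match function. The matching semantics are not specified in this README, so prefix matching is assumed here, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBlockPatterns splits a BLOCK_URLS-style value into trimmed,
// non-empty patterns.
func parseBlockPatterns(value string) []string {
	var patterns []string
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			patterns = append(patterns, p)
		}
	}
	return patterns
}

// isBlocked reports whether path starts with any pattern and which one
// matched. Prefix matching is an assumption of this sketch.
func isBlocked(path string, patterns []string) (bool, string) {
	for _, p := range patterns {
		if strings.HasPrefix(path, p) {
			return true, p
		}
	}
	return false, ""
}

func main() {
	patterns := parseBlockPatterns("/admin,/config,/internal")
	fmt.Println(isBlocked("/admin/users", patterns))
	fmt.Println(isBlocked("/api/test", patterns))
}
```

A handler would run this check before API key validation and answer 403 Forbidden on a match, which is the ordering the list above describes.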

🚀 Production Features

Environment variable support for secure configuration management
URL filtering with configurable blocked URL patterns via BLOCK_URLS environment variable
Graceful error handling with detailed logging and audit trails
Resource management with configurable limits and cleanup procedures
Docker-ready deployment with multi-stage builds and health checks

🛠️ Building and Testing

Prerequisites

Go 1.26+
Docker & Docker Compose
Git

Build Custom Caddy

The project uses a multi-stage Dockerfile to build Caddy with the custom module:

# Build the custom Caddy image
docker build -t secret-reverse-proxy:latest .

The Dockerfile:

Builder Stage: Uses Go 1.26+ to install xcaddy and build Caddy with the secret-reverse-proxy module
Runtime Stage: Creates lightweight Alpine-based runtime with security hardening
Security Features: Non-root user, minimal dependencies, health checks

Test Environment Setup

1. Start Test Environment

# Start all services including echo server for testing
docker-compose up --build

The docker-compose setup includes:

caddy: Custom Caddy with secret-reverse-proxy module
echo-server: Simple HTTP echo service for testing backend responses
networking: Isolated testnet for secure communication

2. Test Different Scenarios

Valid API Key Test:

curl -H "Authorization: Bearer bWFzdGVyQHNjcnRsYWJzLmNvbTpTZWNyZXROZXR3b3JrTWFzdGVyS2V5X18yMDI1" \
     -H "Content-Type: application/json" \
     -d '{"model": "llama3.3:70b", "prompt": "Hello, world!", "max_tokens": 100}' \
     http://localhost:8085/
# Expected: 200 OK with accurate token count in logs

Invalid API Key Test:

curl -H "Authorization: Bearer invalid-key-123" \
     http://localhost:8085/
# Expected: 401 Unauthorized

Missing Authorization Test:

curl http://localhost:8085/
# Expected: 401 Unauthorized

Accurate Token Counting Test (Llama):

curl -H "Authorization: Bearer bWFzdGVyQHNjcnRsYWJzLmNvbTpTZWNyZXROZXR3b3JrTWFzdGVyS2V5X18yMDI1" \
     -H "Content-Type: application/json" \
     -d '{"model": "llama3.3:70b", "prompt": "Write a haiku about programming", "max_tokens": 100}' \
     http://localhost:8085/
# Uses Llama-3.3 tokenizer for accurate counting

Accurate Token Counting Test (Mistral):

curl -H "Authorization: Bearer bWFzdGVyQHNjcnRsYWJzLmNvbTpTZWNyZXROZXR3b3JrTWFzdGVyS2V5X18yMDI1" \
     -H "Content-Type: application/json" \
     -d '{"model": "mistral-7b", "messages": [{"role": "user", "content": "Explain quantum computing"}]}' \
     http://localhost:8085/chat/completions
# Uses Mistral tokenizer for accurate counting

Unknown Model Test (Fallback):

curl -H "Authorization: Bearer bWFzdGVyQHNjcnRsYWJzLmNvbTpTZWNyZXROZXR3b3JrTWFzdGVyS2V5X18yMDI1" \
     -H "Content-Type: application/json" \
     -d '{"model": "custom-gpt-x", "prompt": "Test prompt"}' \
     http://localhost:8085/
# Uses fallback chars/4 estimation for unknown models

Metrics Check:

curl http://localhost:8085/metrics

URL Filtering Test (Blocked):

# Set environment variable for URL blocking
export BLOCK_URLS="/admin,/config,/internal"

# This request will be blocked with HTTP 403
curl -H "Authorization: Bearer bWFzdGVyQHNjcnRsYWJzLmNvbTpTZWNyZXROZXR3b3JrTWFzdGVyS2V5X18yMDI1" \
     http://localhost:8085/admin/users

Configuration Details

The Caddyfile-test file demonstrates a comprehensive configuration:

:80 {
    secret_reverse_proxy {
        # Authentication configuration
        API_MASTER_KEY {env.SECRET_API_MASTER_KEY}
        master_keys_file /etc/caddy/master_keys.txt
        secret_node {env.SECRET_NODE}
        contract_address {env.SECRET_CONTRACT}
        secret_chain_id {env.SECRET_CHAIN_ID}
        # permit_file is optional if SECRETAI_PERMIT_TYPE, SECRETAI_PERMIT_PUBKEY,
        # and SECRETAI_PERMIT_SIG env vars are set instead
        permit_file /etc/caddy/permit.json

        # Metering configuration
        metering {env.METERING}
        metering_interval {env.METERING_INTERVAL}
        metering_url {env.METERING_URL}

        # Token counting settings (v2.0)
        max_body_size 2097152          # 2MB max body size
        token_counting_mode accurate   # Uses model-specific tokenizers
        tokenizer_cache_dir /tmp/tokenizers  # Cache directory for tokenizers
        preload_models llama-2,mistral # Pre-cache common models for fast startup

        # Reporting settings
        max_retries 5                  # retry attempts for failed reports
        retry_backoff 300s             # backoff between retries

        # Metrics configuration
        enable_metrics true            # enable /metrics endpoint
        metrics_path /metrics          # metrics endpoint path
    }

    reverse_proxy echo-server:80 {
        health_uri /health
        health_interval 30s
        health_timeout 10s
    }
}
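The max_retries and retry_backoff settings in the configuration above suggest a retry loop with a fixed pause between attempts. The Go sketch below shows one way such a loop could look; whether the module uses a fixed or growing backoff is not stated here, so a fixed delay is assumed, and `sendWithRetry` and `noSleep` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errReportFailed stands in for any failure to deliver a usage report.
var errReportFailed = errors.New("report failed")

// noSleep is a stand-in for time.Sleep so the example finishes instantly.
func noSleep(time.Duration) {}

// sendWithRetry calls send once and then retries up to maxRetries more
// times, pausing for backoff between attempts. It returns the number of
// attempts made and the last error, if any.
func sendWithRetry(send func() error, maxRetries int, backoff time.Duration, sleep func(time.Duration)) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		if err = send(); err == nil {
			return attempts, nil
		}
		if attempts > maxRetries {
			return attempts, err
		}
		sleep(backoff)
	}
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errReportFailed
		}
		return nil
	}
	// In production the last argument would be time.Sleep.
	fmt.Println(sendWithRetry(flaky, 5, 300*time.Second, noSleep))
}
```

A report that still fails after the final attempt would be the point at which the failed-report persistence mentioned under Key Features takes over.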

x402 Payment Protocol Testing

The x402 test suite includes unit tests and integration tests with a mock DevPortal:

cd secret-reverse-proxy

# Run x402 unit tests (portal client, challenge builder, USDC conversion)
go test -v ./x402/

# Run x402 integration tests (full middleware flow with mock portal)
go test -v -run TestX402

The integration tests verify: insufficient balance (402), sufficient balance (200 + usage report), master key bypass, portal unreachable (503), and partial balance deficit calculation. See x402_Caddy_Implementation.md for the full test plan including Docker-based end-to-end testing.

Development Testing

Unit Tests

cd secret-reverse-proxy
go test -v ./...

Integration Tests

go test -v -tags=integration ./...

Test Coverage

go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

Specific Component Tests

# Test API key validation
go test -v ./validators/

# Test x402 payment protocol
go test -v ./x402/...

# Test token counting
go test -v -run TestTokenCounter

# Test metering functionality
go test -v -run TestMetering

📋 Configuration Reference

Environment Variables

Variable	Description	Example
`SECRET_API_MASTER_KEY`	Primary API key for authentication	`your-secure-master-key`
`SECRET_NODE`	Secret Network LCD endpoint	`lcd.secret.tactus.starshell.net`
`SECRET_CHAIN_ID`	Secret Network chain identifier	`secret-4`
`SECRET_CONTRACT`	Smart contract address for validation	`secret18xpp2kmkk7g8xzx24wm5zjstw9tjv6g3xle2vjm`
`METERING`	Enable/disable usage metering	`1` or `true`
`METERING_INTERVAL`	Reporting interval	`5m`, `1h`
`METERING_URL`	Endpoint for usage reports	`https://api.example.com`
`BLOCK_URLS`	Comma-separated list of URL patterns to block	`/admin,/config,/internal`
`SECRETAI_MASTER_KEYS`	Comma-separated list of master API keys. Used as an alternative (or in addition) to `master_keys_file`.	`key1,key2,key3`
`SECRETAI_PERMIT_TYPE`	Permit public key type. Required when no `permit_file` is configured — used to construct a permit on the fly for retrieving API keys from KMS.	`tendermint/PubKeySecp256k1`
`SECRETAI_PERMIT_PUBKEY`	Permit public key value. Required when no `permit_file` is configured.	`Aur9D8RLq...`
`SECRETAI_PERMIT_SIG`	Permit signature. Required when no `permit_file` is configured.	`TeNtblPmo...`
`DEVPORTAL_URL`	DevPortal base URL for balance checks and usage reporting	`https://devportal.example.com`
`DEVPORTAL_SERVICE_KEY`	Shared secret for Caddy-to-DevPortal service authentication	`caddy-secret-key-123`

Caddyfile Directives

Directive	Type	Description	Default
`API_MASTER_KEY`	string	Primary master key	None
`master_keys_file`	path	Path to a file containing additional master keys (one per line). Optional — if not configured, `SECRETAI_MASTER_KEYS` env var can be used instead.	`""`
`permit_file`	path	Path to a JSON file containing the Secret Network permit configuration used to retrieve Secret AI API Keys from KMS. Optional — if not configured, the system constructs a permit on the fly from `SECRETAI_PERMIT_TYPE`, `SECRETAI_PERMIT_PUBKEY`, and `SECRETAI_PERMIT_SIG` env vars.	None
`contract_address`	string	Smart contract address	Required
`secret_node`	string	Secret Network node	Required
`secret_chain_id`	string	Chain ID	Required
`metering`	boolean	Enable usage metering	`false`
`metering_interval`	duration	Reporting frequency	`10m`
`metering_url`	string	Usage reporting endpoint	`""`
`max_body_size`	bytes	Max request body size	`10MB`
`token_counting_mode`	string	Token counting mode (always uses accurate v2.0)	`accurate`
`tokenizer_cache_dir`	path	Directory for caching tokenizer files	`/tmp/tokenizers`
`preload_models`	string	Comma-separated models to pre-cache	`llama-2,mistral`
`max_retries`	int	Failed report retry attempts	`3`
`retry_backoff`	duration	Retry delay	`5m`
`enable_metrics`	boolean	Enable metrics collection	`false`
`metrics_path`	string	Metrics HTTP endpoint	`/metrics`
`x402_enabled`	boolean	Enable portal-based x402 payment protocol	`false`
`devportal_url`	string	DevPortal base URL	Required if x402 enabled
`devportal_service_key`	string	Shared secret for service-to-service auth	Required if x402 enabled
`x402_min_balance_usdc`	string	Minimum agent balance in USDC (e.g., `"0.01"`)	Required if x402 enabled
`x402_topup_url`	string	Override topup URL in 402 responses	`{devportal_url}/api/agent/add-funds`

🚀 Production Deployment

Docker Deployment

Basic Deployment

docker run -d \
  --name secret-ai-caddy \
  -p 80:80 -p 443:443 \
  -e SECRET_API_MASTER_KEY="your-production-key" \
  -e SECRET_NODE="lcd.secret.tactus.starshell.net" \
  -e SECRET_CONTRACT="secret18xpp2kmkk7g8xzx24wm5zjstw9tjv6g3xle2vjm" \
  -e SECRET_CHAIN_ID="secret-4" \
  -e METERING=true \
  -e METERING_INTERVAL="5m" \
  -e METERING_URL="https://your-metrics-api.com" \
  -e BLOCK_URLS="/admin,/config,/internal" \
  secret-reverse-proxy:latest

Production with Volumes

docker run -d \
  --name secret-ai-caddy \
  --restart unless-stopped \
  -p 80:80 -p 443:443 \
  -v ./Caddyfile:/etc/caddy/Caddyfile \
  -v ./master_keys.txt:/etc/caddy/master_keys.txt \
  -v ./permit.json:/etc/caddy/permit.json \
  -v caddy_data:/data \
  -v caddy_config:/config \
  -e SECRET_API_MASTER_KEY="your-production-key" \
  secret-reverse-proxy:latest

Docker Compose Production

version: '3.8'
services:
  secret-ai-caddy:
    image: secret-reverse-proxy:latest
    ports:
      - "80:80"
      - "443:443"
    environment:
      - SECRET_API_MASTER_KEY=${SECRET_API_MASTER_KEY}
      - SECRET_NODE=${SECRET_NODE}
      - SECRET_CONTRACT=${SECRET_CONTRACT}
      - SECRET_CHAIN_ID=${SECRET_CHAIN_ID}
      - METERING=true
      - METERING_INTERVAL=5m
      - METERING_URL=${METERING_URL}
      - BLOCK_URLS=${BLOCK_URLS}
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - ./master_keys.txt:/etc/caddy/master_keys.txt
      - ./permit.json:/etc/caddy/permit.json
      - caddy_data:/data
      - caddy_config:/config
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/health"]
      interval: 30s
      timeout: 10s
      retries: 3

volumes:
  caddy_data:
  caddy_config:

Security Best Practices

API Key Security
- Use environment variables for sensitive keys
- Rotate master keys regularly
- Implement key versioning
- Monitor key usage patterns

File Security
- Secure master keys file with 600 permissions
- Use separate permit files per environment
- Regular backup of configuration files

Network Security
- Always use HTTPS in production
- Implement proper firewall rules
- Use private networks for backend communication
- Enable rate limiting per API key

Monitoring & Alerting
- Monitor authentication failure rates
- Set up alerts for contract query failures
- Track unusual usage patterns
- Monitor system resource usage

Operational Security
- Regular security updates
- Log analysis and monitoring
- Incident response procedures
- Backup and recovery plans

🔍 Monitoring & Troubleshooting

Health Checks

System Health:

curl http://localhost:8085/health

Metrics Overview:

curl http://localhost:8085/metrics | jq

Common Issues

Authentication Failures

# Check logs for details
docker logs caddy-reverse-proxy

# Verify environment variables
docker exec caddy-reverse-proxy env | grep SECRET

Contract Query Issues

# Test network connectivity
curl https://lcd.secret.tactus.starshell.net/status

# Verify contract address
curl "https://lcd.secret.tactus.starshell.net/compute/v1beta1/code_hash/by_contract_address/YOUR_CONTRACT"

Token Counting Problems

# Check metering logs for tokenizer loading
docker logs caddy-reverse-proxy 2>&1 | grep -i tokenizer

# Check for accurate token counting
docker logs caddy-reverse-proxy 2>&1 | grep -i "Used accurate tokenizer"

# Test with model-specific request
curl -H "Authorization: Bearer YOUR_KEY" \
     -H "Content-Type: application/json" \
     -d '{"model": "llama3.3:70b", "prompt": "test"}' \
     http://localhost:8085/

# Verify tokenizer cache
docker exec caddy-reverse-proxy ls -la /tmp/tokenizers

Common Token Counting Issues:

If you see "Failed to load tokenizer" warnings, check network connectivity for HuggingFace downloads
Tokenizers are cached after first use - subsequent requests should be fast
Unknown models automatically fall back to conservative chars/4 estimation
Check preload_models configuration to pre-cache commonly used models

URL Filtering Issues

# Check if BLOCK_URLS is set
docker exec caddy-reverse-proxy env | grep BLOCK_URLS

# View filtering logs
docker logs caddy-reverse-proxy 2>&1 | grep -i "blocked"

# Test blocked URL
curl -H "Authorization: Bearer YOUR_KEY" \
     http://localhost:8085/admin/test
# Should return HTTP 403 Forbidden

# Test allowed URL  
curl -H "Authorization: Bearer YOUR_KEY" \
     http://localhost:8085/api/test
# Should proceed to API key validation

Debug Configuration

{
    debug
    log {
        output stdout
        format console
        level DEBUG
    }
}

📊 Performance Characteristics

v2.0 Performance Metrics

Authentication Latency: <1ms for cache hits, <500ms for contract queries

Token Counting (Accurate Mode):
- First request with model: 50-200ms (downloads and caches tokenizer from HuggingFace)
- Cached tokenizer: 1-5ms per request (accurate tokenization)
- Unknown model fallback: <1ms (simple chars/4 estimation)
- Preloaded models: 1-5ms from first request

Memory Usage:
- ~1KB per 1000 cached API keys
- ~5-15MB per cached tokenizer (depends on model)
- Typical deployment: 20-50MB for 2-3 common models

Throughput: Supports 10k+ RPS with proper caching

Cache Efficiency:
- API keys: 95%+ hit rate for stable key sets
- Tokenizers: 100% hit rate after initial load (cached permanently)
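The "load once, then always hit" behaviour described for tokenizers is what a per-model sync.Once gives you. The Go sketch below shows that pattern with a placeholder tokenizer type and an injected loader; the names are hypothetical, and note that in this version a failed load is cached as well, which a real implementation might prefer to retry.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenizer is a placeholder for a loaded model tokenizer.
type tokenizer struct{ model string }

type lazyEntry struct {
	once sync.Once
	tok  *tokenizer
	err  error
}

type tokenizerCache struct {
	mu      sync.Mutex
	entries map[string]*lazyEntry
	load    func(model string) (*tokenizer, error) // hypothetical loader, e.g. a download
}

func newTokenizerCache(load func(string) (*tokenizer, error)) *tokenizerCache {
	return &tokenizerCache{entries: make(map[string]*lazyEntry), load: load}
}

// Get loads the tokenizer for model on first use and returns the cached
// instance on every later call.
func (c *tokenizerCache) Get(model string) (*tokenizer, error) {
	c.mu.Lock()
	e, ok := c.entries[model]
	if !ok {
		e = &lazyEntry{}
		c.entries[model] = e
	}
	c.mu.Unlock()

	// The slow load runs outside the map lock, at most once per model.
	e.once.Do(func() { e.tok, e.err = c.load(model) })
	return e.tok, e.err
}

func main() {
	loads := 0
	cache := newTokenizerCache(func(model string) (*tokenizer, error) {
		loads++
		return &tokenizer{model: model}, nil
	})
	cache.Get("llama3.3")
	cache.Get("llama3.3")
	fmt.Println("loads:", loads)
}
```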

Token Counting Accuracy

Model Type	Accuracy vs Actual	Method
Llama 2/3/3.3	90-95%	HuggingFace tokenizer
Mistral/Mixtral	90-95%	HuggingFace tokenizer
Falcon	90-95%	HuggingFace tokenizer
BERT	90-95%	HuggingFace tokenizer
Unknown models	60-70%	Chars/4 fallback

Before v2.0: Heuristic method inflated counts by 2-2.5x
After v2.0: Within 5-10% of actual usage for supported models

🔄 Migrating from v1.x to v2.0

What Changed

Token Counting System:

Old heuristic (chars/4 + words×1.33)/2 replaced with model-specific tokenizers
Token counts will be 40-60% lower for most requests (more accurate)
Per-model usage tracking now available

Migration Steps

1. Update Docker Image

docker pull secret-reverse-proxy:latest
# or rebuild: docker build -t secret-reverse-proxy:latest .

2. Update Configuration (Optional)

secret_reverse_proxy {
    # ... existing config ...

    # New optional settings (v2.0)
    tokenizer_cache_dir /tmp/tokenizers    # Default location
    preload_models llama-2,mistral          # Common models
}

3. Monitor First Deployment

# Watch for tokenizer downloads (first time only)
docker logs -f caddy-reverse-proxy | grep tokenizer

# Verify accurate counting
docker logs -f caddy-reverse-proxy | grep "Used accurate tokenizer"

4. Expect Lower Token Counts
- Old counts were inflated 2-2.5x
- New counts are 90-95% accurate for supported models
- Update billing expectations accordingly

Rollback Plan

If you need to roll back:

# Use previous image version
docker pull secret-reverse-proxy:v1.x
docker-compose up -d

🤝 Contributing

Fork the repository
Create a feature branch
Implement changes with tests
Update documentation
Submit a pull request

📄 License

[Add appropriate license information]

🆘 Support

For issues and questions:

Documentation: See Architecture and Metering guides
Issues: GitHub Issues tracker
Security: Contact alexh@scrtlabs.com for security-related issues
