cascadeflow API Reference

Complete API documentation for the cascadeflow Python and TypeScript SDKs.

📚 Core Documentation

Python API

TypeScript API

For complete TypeScript API documentation, see the TypeScript Package README.

Quick Links:

🚀 Quick Reference

Python

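import asyncio

from cascadeflow import CascadeAgent, ModelConfig

# Create agent (models listed from cheapest to most expensive)
agent = CascadeAgent(models=[
    ModelConfig(name="gpt-4o-mini", provider="openai", cost=0.00015),
    ModelConfig(name="gpt-4o", provider="openai", cost=0.00625)
])

async def main():
    # Run query; agent.run is awaited, so it must be called from a coroutine
    result = await agent.run("What is Python?")
    print(f"Cost: ${result.total_cost:.6f}")
    # savings_percentage matches the name listed under Result Fields
    print(f"Savings: {result.savings_percentage:.1f}%")

asyncio.run(main())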

TypeScript

import { CascadeAgent } from '@cascadeflow/core';

// Create agent
const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 }
  ]
});

// Run query
const result = await agent.run('What is TypeScript?');
console.log(`Cost: $${result.totalCost.toFixed(6)}`);
console.log(`Savings: ${result.savingsPercentage}%`);

📖 API Structure

Core Classes

| Class | Python | TypeScript | Description |
|-------|--------|------------|-------------|
| CascadeAgent | cascadeflow.CascadeAgent | @cascadeflow/core.CascadeAgent | Main agent for cascading |
| ModelConfig | cascadeflow.ModelConfig | ModelConfig interface | Model configuration |
| CascadeResult | cascadeflow.CascadeResult | CascadeResult interface | Query result with metrics |
| QualityConfig | cascadeflow.QualityConfig | QualityConfig interface | Quality validation settings |

Configuration

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| models | List[ModelConfig] | Required | Models to cascade through (sorted by cost) |
| quality.threshold | float | 0.7 | Minimum quality score to accept (0-1) |
| cascade.maxBudget | float | None | Maximum cost per query in USD |
| cascade.maxRetries | int | 2 | Max retries per model on failure |
| cascade.timeout | int | 30 | Timeout per model in seconds |
| cascade.trackCosts | bool | True | Enable detailed cost tracking |
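
To make the interaction between `models`, `quality.threshold`, and `cascade.maxBudget` concrete, here is a minimal, self-contained sketch of a cascade loop. It is not cascadeflow's implementation: the `Model` class, the `cascade` function, and the idea that each model returns its own quality score are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Model:
    """Hypothetical stand-in for a configured model (not cascadeflow.ModelConfig)."""
    name: str
    cost: float  # assumed flat cost per call, in USD
    generate: Callable[[str], Tuple[str, float]]  # returns (text, quality score in [0, 1])


def cascade(prompt: str, models: List[Model], threshold: float = 0.7,
            max_budget: Optional[float] = None) -> dict:
    """Try models from cheapest to most expensive; stop at the first acceptable answer."""
    total_cost = 0.0
    last = None
    for model in sorted(models, key=lambda m: m.cost):
        # Skip the call (and stop escalating) if it would exceed the budget.
        if max_budget is not None and total_cost + model.cost > max_budget:
            break
        text, score = model.generate(prompt)
        total_cost += model.cost
        last = {"content": text, "model_used": model.name, "quality_score": score}
        if score >= threshold:
            break  # good enough, no need to escalate
    if last is None:
        raise RuntimeError("budget too small to call any model")
    return {**last, "total_cost": total_cost}


# Stub models with fixed answers and scores, for demonstration only.
cheap = Model("small", 0.00015, lambda p: ("draft answer", 0.55))
strong = Model("large", 0.00625, lambda p: ("careful answer", 0.92))

result = cascade("What is Python?", [strong, cheap])
print(result["model_used"], f"{result['total_cost']:.5f}")
```

In this sketch the draft scores below the 0.7 threshold, so the query escalates and the total cost includes both calls. With a tight `max_budget`, the loop returns the best answer it could afford instead of escalating.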

Result Fields

| Field | Type | Description |
|-------|------|-------------|
| content | str | Generated response text |
| model_used | str | Model that produced the final response |
| total_cost | float | Total cost in USD |
| latency_ms | int | Total latency in milliseconds |
| cascaded | bool | Whether cascade was used |
| draft_accepted | bool | If cascaded, whether the draft was accepted |
| quality_score | float | Quality score (0-1) |
| cost_saved | float | Cost saved vs always using the best model |
| savings_percentage | float | Savings as a percentage |
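
As a worked example of how `cost_saved` and `savings_percentage` relate, the sketch below assumes the baseline is the cost of sending the query straight to the most expensive model, and that the percentage is taken relative to that baseline. The `savings` helper is hypothetical; the library may compute these fields differently.

```python
from typing import Tuple


def savings(total_cost: float, baseline_cost: float) -> Tuple[float, float]:
    """Return (cost_saved, savings_percentage) relative to an assumed baseline cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    cost_saved = baseline_cost - total_cost
    return cost_saved, 100.0 * cost_saved / baseline_cost


# Draft accepted: only the cheap model (0.00015) ran instead of the best one (0.00625).
cost_saved, pct = savings(total_cost=0.00015, baseline_cost=0.00625)
print(f"Saved ${cost_saved:.5f} ({pct:.1f}%)")
```

With the per-call costs from the Quick Reference, an accepted draft saves 0.0061 USD, or about 97.6% of the baseline. If the draft is rejected and both models run, the value under this definition turns negative, since the query costs more than the baseline.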

🔧 Provider Support

cascadeflow supports 7+ AI providers out of the box:

| Provider | Status | Models Supported |
|----------|--------|------------------|
| OpenAI | ✅ Stable | GPT-4o, GPT-4o-mini, GPT-3.5-turbo, GPT-5 |
| Anthropic | ✅ Stable | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku |
| Groq | ✅ Stable | Llama 3.1, Mixtral, Gemma |
| Ollama | ✅ Stable | All Ollama models (local deployment) |
| vLLM | ✅ Stable | Custom models (local/cloud deployment) |
| Together | ✅ Stable | 100+ open-source models |
| Hugging Face | ✅ Stable | Inference API models |

📊 Advanced Features

Streaming

# Python
async for event in agent.stream("Tell me a story"):
    if event.type == "content_delta":
        print(event.content, end="", flush=True)

// TypeScript
for await (const event of agent.stream('Tell me a story')) {
  if (event.type === 'content_delta') {
    process.stdout.write(event.content);
  }
}
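
The streaming loops above can be tried without an API key by substituting a stand-in event source. The sketch below is illustrative only: `StreamEvent`, `fake_stream`, and `collect` are hypothetical names, and the `"start"` and `"done"` event types are assumptions. Only `"content_delta"` and `event.content` come from the snippets above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class StreamEvent:
    """Hypothetical event shape with the two attributes used above."""
    type: str
    content: str = ""


async def fake_stream(text: str) -> AsyncIterator[StreamEvent]:
    """Stand-in for agent.stream(): yields one content_delta event per word."""
    yield StreamEvent("start")  # assumed non-content event
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield StreamEvent("content_delta", word + " ")
    yield StreamEvent("done")  # assumed non-content event


async def collect(stream: AsyncIterator[StreamEvent]) -> str:
    """Concatenate the text of all content_delta events, ignoring other event types."""
    parts = []
    async for event in stream:
        if event.type == "content_delta":
            parts.append(event.content)
    return "".join(parts)


story = asyncio.run(collect(fake_stream("Once upon a time")))
print(story)
```

Filtering on `event.type` keeps the consumer robust if the stream carries other event types alongside text deltas.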

Tool Calling

# Python
tools = [{"name": "get_weather", "description": "Get weather", "parameters": {...}}]
result = await agent.run("What's the weather?", tools=tools)

// TypeScript
const tools = [{ name: 'get_weather', description: 'Get weather', parameters: {...} }];
const result = await agent.run('What\'s the weather?', { tools });
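
The `parameters` value is elided in the snippets above. As a rough illustration of what a complete tool definition and its local handler could look like, the sketch below assumes a JSON Schema style `parameters` object (the convention used by several provider APIs) and a tool-call payload with `name` and `arguments` keys. The `city` parameter, the `HANDLERS` table, and `dispatch` are all hypothetical; check the cascadeflow documentation for the shapes it actually expects and returns.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Stub handler; a real one would call a weather service."""
    return f"Sunny in {city}"


# Assumed tool definition with JSON Schema parameters (hypothetical "city" argument).
weather_tool: Dict[str, Any] = {
    "name": "get_weather",
    "description": "Get weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}

# Map tool names to the local functions that implement them.
HANDLERS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def dispatch(tool_call: Dict[str, Any]) -> str:
    """Run the handler named in an assumed {"name": ..., "arguments": ...} payload."""
    handler = HANDLERS[tool_call["name"]]
    args = tool_call["arguments"]
    if isinstance(args, str):  # some providers send arguments as a JSON string
        args = json.loads(args)
    return handler(**args)


print(dispatch({"name": "get_weather", "arguments": '{"city": "Paris"}'}))
```

Keeping handlers in a name-to-function table means adding a tool only requires one new definition and one new entry.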

Presets

# Python
from cascadeflow import CascadeAgent, PRESET_BEST_OVERALL, PRESET_ULTRA_FAST

agent = CascadeAgent.from_preset(PRESET_BEST_OVERALL)

// TypeScript
import { CascadeAgent, PRESET_BEST_OVERALL } from '@cascadeflow/core';

const agent = new CascadeAgent(PRESET_BEST_OVERALL);

🔍 See Also

User Guides

Examples

💡 Need Help?