---
title: Tracing
description: Auto-capture all LLM calls with zero code changes
---
Auto-capture all LLM calls with OpenTelemetry instrumentation.
<Warning>
  **Auto-tracing only works with supported LLM SDKs** (OpenAI, Anthropic, etc.) - not raw HTTP requests. If you're using an OpenAI-compatible API like OpenRouter, LiteLLM, or a self-hosted model, use the OpenAI SDK with a custom `base_url`:
</Warning>

```python
from openai import OpenAI

# OpenRouter, LiteLLM, vLLM, etc.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # or your provider's URL
    api_key="your-provider-key"
)

# Now this call will be auto-traced!
response = client.chat.completions.create(model="gpt-4o", messages=[...])
```

## Automatic Tracing

```python
# Step 1: Import and init Fallom FIRST
import fallom
fallom.init()

# Step 2: Import OpenAI AFTER init
from openai import OpenAI
client = OpenAI()

# Step 3: Set session context
fallom.trace.set_session("my-agent", session_id)

# Step 4: All LLM calls are automatically traced with:
# - Model, tokens, latency
# - Prompts and completions
# - Your config_key and session_id
response = client.chat.completions.create(model="gpt-4o", messages=[...])
```

## Custom Metrics

Record business metrics that OTEL can't capture automatically:

```python
from fallom import trace

# Record custom metrics for this session
trace.span({
    "outlier_score": 0.8,
    "user_satisfaction": 4,
    "conversion": True
})

# Or explicitly specify session (for batch jobs)
trace.span(
    {"outlier_score": 0.8},
    config_key="my-agent",
    session_id="user123-convo456"
)
```

## Multiple A/B Tests in One Workflow

If you have multiple LLM calls and only want to A/B test some of them:

```python
import fallom
from fallom import models

fallom.init(api_key="your-api-key")

# Set default context for all tracing
fallom.trace.set_session("my-agent", session_id)

# ... regular LLM calls (traced as "my-agent") ...

# A/B test a specific call - models.get() auto-updates context
model = models.get("summarizer-test", session_id, fallback="gpt-4o")
result = summarizer.run(model=model)  # your own summarization step

# Reset context back to default
fallom.trace.set_session("my-agent", session_id)

# ... more regular LLM calls (traced as "my-agent") ...
```
In the TypeScript SDK, wrap your LLM client once and all calls are automatically traced.

## OpenAI (+ OpenRouter, Azure, LiteLLM, etc.)

```typescript
import OpenAI from "openai";
import fallom from "@fallom/trace";

await fallom.init({ apiKey: "your-api-key" });

// Works with any OpenAI-compatible API
const openai = fallom.trace.wrapOpenAI(
  new OpenAI({
    baseURL: "https://openrouter.ai/api/v1", // or Azure, LiteLLM, etc.
    apiKey: "your-provider-key",
  })
);

fallom.trace.setSession("my-config", sessionId);

// Automatically traced!
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```

## Anthropic (Claude)

```typescript
import Anthropic from "@anthropic-ai/sdk";
import fallom from "@fallom/trace";

await fallom.init({ apiKey: "your-api-key" });

const anthropic = fallom.trace.wrapAnthropic(new Anthropic());

fallom.trace.setSession("my-config", sessionId);

// Automatically traced!
const response = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  messages: [{ role: "user", content: "Hello!" }],
});
```

## Google AI (Gemini)

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";
import fallom from "@fallom/trace";

await fallom.init({ apiKey: "your-api-key" });

const genAI = new GoogleGenerativeAI("your-google-api-key");
const model = fallom.trace.wrapGoogleAI(
  genAI.getGenerativeModel({ model: "gemini-pro" })
);

fallom.trace.setSession("my-config", sessionId);

// Automatically traced!
const response = await model.generateContent("Hello!");
```

## What Gets Traced

For each LLM call, Fallom automatically captures:

- ✅ Model name
- ✅ Duration (latency)
- ✅ Token counts (prompt, completion, total)
- ✅ Input/output content (can be disabled)
- ✅ Errors
- ✅ Config key + session ID (for A/B analysis)
- ✅ Prompt key + version (when using prompt management)
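As a rough mental model of the data above, here is an illustrative shape for one traced call. The field names are assumptions for illustration only, not Fallom's actual wire schema:

```typescript
// Illustrative only: field names below are hypothetical, sketching the
// kind of data Fallom captures for each LLM call.
interface TracedLLMCall {
  model: string;             // model name, e.g. "gpt-4o"
  durationMs: number;        // latency
  promptTokens: number;      // token counts
  completionTokens: number;
  totalTokens: number;
  input?: string;            // content capture can be disabled
  output?: string;
  error?: string;            // present when the call failed
  configKey: string;         // for A/B analysis
  sessionId: string;
  promptKey?: string;        // when using prompt management
  promptVersion?: number;
}

const example: TracedLLMCall = {
  model: "gpt-4o",
  durationMs: 812,
  promptTokens: 42,
  completionTokens: 128,
  totalTokens: 170,
  configKey: "my-config",
  sessionId: "user123-convo456",
};

// Total tokens are the sum of prompt and completion tokens.
console.log(example.totalTokens === example.promptTokens + example.completionTokens); // true
```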

## Custom Metrics

Record business metrics for your A/B tests:

```typescript
import { trace } from "@fallom/trace";

trace.span({
  outlier_score: 0.8,
  user_satisfaction: 4,
  conversion: true,
});
```

## Next Steps

Test different models with your traced calls. Manage and A/B test your prompts.