
Cache creation cost underestimated: 1-hour cache writes priced at 5m rate (1.25x) instead of 2x #899

@lumian2015

## Summary

When using calculate mode, ccusage prices all `cache_creation_input_tokens` at a single `cache_creation_input_token_cost` rate from LiteLLM, which corresponds to the 5-minute cache write multiplier (1.25× base input). However, Claude Code predominantly uses 1-hour caching, which Anthropic prices at 2× base input, a 60% higher rate.

## Anthropic official pricing

From Anthropic's pricing page:

| Cache operation | Multiplier | Duration |
| --- | --- | --- |
| 5-minute cache write | 1.25× base input price | Cache valid for 5 minutes |
| 1-hour cache write | 2× base input price | Cache valid for 1 hour |
| Cache read (hit) | 0.1× base input price | Same duration as the preceding write |

Full model pricing table (relevant columns):

| Model | Base Input | 5m Cache Writes | 1h Cache Writes | Cache Hits |
| --- | --- | --- | --- | --- |
| Claude Opus 4.6 | $5/MTok | $6.25/MTok | $10/MTok | $0.50/MTok |
| Claude Sonnet 4.6 | $3/MTok | $3.75/MTok | $6/MTok | $0.30/MTok |
| Claude Sonnet 4.5 | $3/MTok | $3.75/MTok | $6/MTok | $0.30/MTok |
| Claude Haiku 4.5 | $1/MTok | $1.25/MTok | $2/MTok | $0.10/MTok |
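
To make the gap concrete, the multiplier arithmetic is simple (a sketch; `cacheWriteCost` is an illustrative helper, not ccusage code, with prices taken from the table above):

```typescript
// Cost of a cache write: tokens x base input price x duration multiplier.
// basePerMTok is the model's base input price in $ per million tokens.
function cacheWriteCost(tokens: number, basePerMTok: number, multiplier: number): number {
	return (tokens / 1_000_000) * basePerMTok * multiplier;
}

// 10M cache-write tokens on Claude Sonnet 4.5 ($3/MTok base):
const at5m = cacheWriteCost(10_000_000, 3, 1.25); // $37.50
const at1h = cacheWriteCost(10_000_000, 3, 2); // $60.00 -- 60% more
```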

## Data available in JSONL

Claude Code's JSONL files already include a `cache_creation` breakdown inside the `usage` object that distinguishes the two durations:

```json
{
  "usage": {
    "input_tokens": 3,
    "output_tokens": 10,
    "cache_creation_input_tokens": 23566,
    "cache_read_input_tokens": 19357,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 0,
      "ephemeral_1h_input_tokens": 23566
    }
  }
}
```
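
In the example above, the two `ephemeral_*` buckets sum to `cache_creation_input_tokens` (0 + 23566 = 23566). A parser could read the breakdown roughly like this (a sketch; the `Usage` type and `splitCacheCreation` helper are illustrative, not part of ccusage):

```typescript
// Minimal shape of the usage object shown above; field names match the JSONL.
type Usage = {
	cache_creation_input_tokens: number;
	cache_creation?: {
		ephemeral_5m_input_tokens: number;
		ephemeral_1h_input_tokens: number;
	};
};

// Returns the 5m/1h split, or null when the breakdown is absent
// (older records), signaling the caller to fall back to the single rate.
function splitCacheCreation(usage: Usage): { fiveMin: number; oneHour: number } | null {
	const b = usage.cache_creation;
	if (b == null) {
		return null;
	}
	return { fiveMin: b.ephemeral_5m_input_tokens, oneHour: b.ephemeral_1h_input_tokens };
}
```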

In my dataset (~40k usage records), the vast majority of cache creation tokens are 1-hour:

| Model | 5m tokens | 1h tokens | % 1h |
| --- | --- | --- | --- |
| `claude-opus-4-6` | 9.6M | 116.5M | 92% |
| `claude-sonnet-4-5-20250929` | 0 | 7.8M | 100% |
| `claude-sonnet-4-6` | 2.1M | 0 | 0% |
| `claude-haiku-4-5-20251001` | 15.8M | 30.9M | 66% |

## Current behavior in ccusage

In `packages/internal/src/pricing.ts`, `calculateCostFromPricing` uses a single `cache_creation_input_token_cost` (sourced from LiteLLM) for all cache creation tokens:

```ts
const cacheCreationCost = calculateTieredCost(
    tokens.cache_creation_input_tokens,
    pricing.cache_creation_input_token_cost,            // 1.25x rate
    pricing.cache_creation_input_token_cost_above_200k_tokens,
);
```

There is no reference to `ephemeral_5m_input_tokens` or `ephemeral_1h_input_tokens` anywhere in the codebase (`gh search code "ephemeral" --repo ryoppippi/ccusage` returns 0 results).

## Impact

Cost comparison on my usage data:

| Model | Cost (all at 1.25×) | Cost (5m/1h split) | Under-reported |
| --- | --- | --- | --- |
| `claude-opus-4-6` | $2,388 | $2,826 | $438 (18%) |
| `claude-haiku-4-5-20251001` | $112 | $135 | $23 (21%) |
| `claude-sonnet-4-5-20250929` | $46 | $63 | $17 (38%) |
| **Total** | $2,568 | $3,047 | $479 (19%) |

## Suggested fix

When parsing JSONL records, check whether `usage.cache_creation` is an object containing `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens`. If so, price each bucket at its own rate:

- `ephemeral_5m_input_tokens` at 1.25× the base input price
- `ephemeral_1h_input_tokens` at 2× the base input price

Fall back to the current single-rate behavior when the breakdown is not available.
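
Putting the two rates and the fallback together, the pricing step could look roughly like this (a sketch only; the function shape and `baseInputCostPerToken` parameter are illustrative, and the above-200k tiering from the existing code is omitted for brevity):

```typescript
type CacheBreakdown = {
	ephemeral_5m_input_tokens: number;
	ephemeral_1h_input_tokens: number;
};

// baseInputCostPerToken: the model's base input price per token
// (e.g. 3 / 1_000_000 for a $3/MTok model like Sonnet 4.5).
function cacheCreationCost(
	totalTokens: number,
	baseInputCostPerToken: number,
	breakdown?: CacheBreakdown,
): number {
	if (breakdown != null) {
		// Price each duration bucket at Anthropic's published multiplier.
		return (
			breakdown.ephemeral_5m_input_tokens * baseInputCostPerToken * 1.25
			+ breakdown.ephemeral_1h_input_tokens * baseInputCostPerToken * 2
		);
	}
	// Fallback: current single-rate behavior (LiteLLM's 5m rate).
	return totalTokens * baseInputCostPerToken * 1.25;
}
```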

This is partly an upstream issue in LiteLLM's pricing data (`cache_creation_input_token_cost` only has one rate), but ccusage can handle the split independently since the token breakdown is already in the JSONL data.
