feat(usage): add cached tokens and cache hit % metrics #1900

Open
Zireael wants to merge 1 commit into decolua:master from Zireael:feat/usage-cache-metrics

Zireael commented Jun 18, 2026

Summary

Surface cache token data that was already captured from provider responses but not displayed in the Usage dashboard.

Changes

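Backend (usageRepo.js)

  • Add cacheReadTokens to addToCounter, aggregateEntryToDay, and all aggregation loops (byProvider, byModel, byAccount, byApiKey, byEndpoint)
  • Add totalCacheReadTokens and cacheHitRatio to getUsageStats output
  • Add cachedTokens, promptTokens, and cacheHitRatio to getChartData buckets (today, 24h, 7d/30d/60d)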
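As a rough illustration of the backend change, the sketch below accumulates cacheReadTokens alongside the existing token counts and derives a cache hit ratio per group. It is not the code in usageRepo.js: the counter shape, the entry fields other than cacheReadTokens, and the ratio definition (cached prompt tokens divided by all prompt tokens) are assumptions made for the example.

```javascript
// Hypothetical sketch: the real addToCounter in usageRepo.js may differ.
function addToCounter(counter, entry) {
  counter.requests = (counter.requests || 0) + 1;
  counter.promptTokens = (counter.promptTokens || 0) + (entry.promptTokens ?? 0);
  counter.completionTokens =
    (counter.completionTokens || 0) + (entry.completionTokens ?? 0);
  // New field: entries without cache data contribute 0.
  counter.cacheReadTokens =
    (counter.cacheReadTokens || 0) + (entry.cacheReadTokens ?? 0);
  return counter;
}

// Assumed definition: cached prompt tokens / all prompt tokens.
function cacheHitRatio(counter) {
  const prompt = counter.promptTokens || 0;
  if (prompt === 0) return 0; // avoid NaN for empty groups
  return (counter.cacheReadTokens || 0) / prompt;
}

// Example: a byModel-style aggregation loop over two usage entries.
const entries = [
  { model: "model-a", promptTokens: 1000, completionTokens: 200, cacheReadTokens: 750 },
  { model: "model-a", promptTokens: 1000, completionTokens: 100 }, // no cache data reported
];

const byModel = {};
for (const entry of entries) {
  byModel[entry.model] = addToCounter(byModel[entry.model] || {}, entry);
}

console.log(byModel["model-a"].cacheReadTokens, cacheHitRatio(byModel["model-a"]));
```

Treating a missing cacheReadTokens as 0 keeps older entries, or providers that report no cache data, from producing NaN in the totals.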

Frontend

OverviewCards — Add two new summary cards: Cached Tokens and Cache Hit %

UsageChart: Expand from 2 view modes (Tokens/Cost) to 4 by adding Cached and Cache %:

  • Tokens: Total tokens (unchanged)
  • Cached: Total tokens with cached tokens overlaid for comparison
  • Cache %: Cache hit ratio over time

Also add a filter dropdown: All, By Model, By Account, By API Key, By Endpoint

UsageTable — In Tokens mode, add two new columns: Cached Tokens and Cache Hit %

RequestDetailsTab — Add cache columns to both the table and the detail drawer

UsageStats orchestrator: Update sortData and groupDataByKey to propagate cache data to all child components
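To show what propagating cache data through the orchestrator might involve, here is a minimal sketch of grouping rows by a key and sorting the result so that child components receive the cache fields. The function names match those in this PR, but the signatures, row fields, and the "unknown" fallback key are assumptions for illustration only.

```javascript
// Hypothetical sketch: signatures and row shape are assumed, not taken from the PR.
function groupDataByKey(rows, keyField) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[keyField] ?? "unknown";
    const group =
      groups.get(key) ?? { key, totalTokens: 0, promptTokens: 0, cacheReadTokens: 0 };
    group.totalTokens += row.totalTokens ?? 0;
    group.promptTokens += row.promptTokens ?? 0;
    group.cacheReadTokens += row.cacheReadTokens ?? 0; // carried through to children
    groups.set(key, group);
  }
  // Derive the ratio once per group so tables and charts can share it.
  return [...groups.values()].map((group) => ({
    ...group,
    cacheHitRatio: group.promptTokens > 0 ? group.cacheReadTokens / group.promptTokens : 0,
  }));
}

// Sort a copy of the rows by a numeric field, descending by default.
function sortData(rows, field, direction = "desc") {
  const sign = direction === "asc" ? 1 : -1;
  return [...rows].sort((a, b) => sign * ((a[field] ?? 0) - (b[field] ?? 0)));
}

const rows = [
  { account: "a1", totalTokens: 1200, promptTokens: 1000, cacheReadTokens: 500 },
  { account: "a2", totalTokens: 600, promptTokens: 400, cacheReadTokens: 300 },
  { account: "a1", totalTokens: 300, promptTokens: 200, cacheReadTokens: 100 },
];

const grouped = sortData(groupDataByKey(rows, "account"), "cacheHitRatio");
console.log(grouped.map((g) => [g.key, g.cacheHitRatio]));
```

Computing the ratio from summed tokens, rather than averaging per-row ratios, weights each group by its actual prompt volume.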

Files Changed

  • usageRepo.js: Backend aggregation
  • UsageStats: Orchestrator data flow
  • OverviewCards: Summary cards
  • UsageChart: Chart view modes
  • UsageTable: Table columns
  • RequestDetailsTab: Details tab

Surface cache token data that was already captured but not displayed:

Backend (usageRepo.js):
- Add cacheReadTokens to addToCounter, aggregateEntryToDay, all
  byProvider/byModel/byAccount/byApiKey/byEndpoint aggregations
- Add totalCacheReadTokens and cacheHitRatio to getUsageStats output
- Add cachedTokens, promptTokens, cacheHitRatio to getChartData buckets

Frontend:
- OverviewCards: add Cached Tokens and Cache Hit % summary cards
- UsageChart: add Cached and Cache % view modes with filter dropdown
- UsageTable: add Cached Tokens and Cache Hit % columns in tokens mode
- RequestDetailsTab: add cache columns to table and detail drawer
- UsageStats orchestrator: propagate cache data through sortData and
  groupDataByKey to all child components
bloodf pushed a commit to bloodf/9router that referenced this pull request Jun 19, 2026
Per-model fallback (new feature):
- open-sse/services/modelFallback.js: runWithModelFallback, getModelFallback,
  isDeterministicPayloadError (moved out of chat.js). One hop only, no chaining.
  Self-fallback no-op. Skips 2xx (streaming-safe) and deterministic payload
  errors (context-length / too-many-tokens); every other non-2xx is eligible.
- Wired into all 7 single-model handlers (chat, fetch, search, imageGeneration,
  tts, stt, embeddings) at the direct single-model dispatch ONLY. Combos keep
  their existing fallback semantics (they call handleSingleModel* directly,
  bypassing per-model fallback by design).
- stt.js + embeddings.js: extracted inline credential loops into module-scope
  handleSingleModel* helpers so the wrapper can re-enter them cleanly.
- settings.modelFallbacks default added to settingsRepo.js.
- src/app/api/model-fallbacks/route.js: GET + PATCH (whole-map replace).
- src/app/(dashboard)/dashboard/model-fallbacks/page.js: dashboard page with
  ModelSelectModal-backed primary/fallback pickers, enable toggle, remove.
- Sidebar.js + Header.js: nav entry + page-info branch.
- tests/unit/modelFallback.test.js: 16 cases, all green.

Upstream PR decolua#1900 — Usage: cached-token tracking:
- OverviewCards: Cached Tokens + Cache Hit % cards.
- UsageChart: Cached + Cache % view modes, filter dropdown
  (All/Model/Account/ApiKey/Endpoint).
- UsageTable + RequestDetailsTab: Cached Tokens + Cache Hit % columns.
- usageRepo.js: cacheReadTokens aggregation across all buckets
  (3 merge conflicts resolved: kept fork's normalized apiKey key + the
  ?? input/output token fallbacks, added theirs' cacheReadTokens field).

Upstream PR decolua#1902 — CommandCode: init body.params in transformRequest
  (ensure body.params exists before setting params.stream).

Docs: CHANGELOG (new v0.5.5 fork entry), README feature bullet,
open-sse/AGENTS.md (modelFallback service + combo-isolation pitfall).

Verified: tests/unit/modelFallback.test.js 16/16 pass. Full unit suite
failures are 100% pre-existing (all 32 match tests/__baseline__/known-fails.txt).
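
The per-model fallback rules described in the referenced commit (one hop only, self-fallback is a no-op, 2xx responses and deterministic payload errors are never retried) could be sketched roughly as follows. This is an illustration only: the argument order, the `{ status, error }` result shape, and the string matching inside isDeterministicPayloadError are assumptions, not the code in open-sse/services/modelFallback.js.

```javascript
// Hypothetical sketch of the one-hop fallback rules; not the fork's implementation.
// Assumed shape: run(model) resolves to { status, error? }.
function isDeterministicPayloadError(result) {
  const message = String(result.error ?? "").toLowerCase();
  // Retrying on another model is skipped for these, per the commit message.
  return message.includes("context length") || message.includes("too many tokens");
}

async function runWithModelFallback(model, fallbacks, run) {
  const first = await run(model);
  const ok = first.status >= 200 && first.status < 300;
  if (ok) return first; // 2xx is never retried (streaming-safe)
  if (isDeterministicPayloadError(first)) return first;

  const fallback = fallbacks[model];
  if (!fallback || fallback === model) return first; // none configured, or self-fallback no-op

  // One hop only: the fallback's result is returned as-is, with no chaining.
  return run(fallback);
}

// Usage with a stubbed handler that fails on the primary model.
const fallbackMap = { primary: "backup" };
const stubRun = async (model) =>
  model === "primary" ? { status: 503, error: "upstream unavailable" } : { status: 200 };

runWithModelFallback("primary", fallbackMap, stubRun).then((result) =>
  console.log(result.status)
);
```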