Remove misleading totalTokensPerSecond metric #11

Merged
YishenTu merged 1 commit into main from remove-total-tps on Feb 24, 2026

@YishenTu
Owner

What

Remove totalTokensPerSecond from stats calculation, type definitions, UI, and tests. Keep only outputTokensPerSecond.

Why

totalTokensPerSecond = totalTokens / requestDuration where totalTokens includes input tokens. Input tokens aren't "generated" during the request — they're prompt/prefill. This inflates the number and makes it semantically meaningless.

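For example: 10k input + 200 output tokens over 5 s gives Total TPS = (10,000 + 200) / 5 = 2,040 tok/s, but the actual generation speed is only about 200 / 5 = 40 tok/s (slightly higher when measured from the first streamed token).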

outputTokensPerSecond = outputTokens / (completedAt - firstStreamTokenAt) is the correct and useful throughput metric.
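The two formulas can be compared with a minimal sketch. This is not the code from stats.ts: the `StreamTimings` interface, the function signatures, and the zero-window guard are assumptions made for illustration. Only the field names `outputTokens`, `firstStreamTokenAt`, `completedAt` and the two metric names come from the description above.

```typescript
// Hypothetical input shape; timestamps are assumed to be in milliseconds.
interface StreamTimings {
  outputTokens: number;
  firstStreamTokenAt: number;
  completedAt: number;
}

// Output TPS: generated tokens divided by the output window
// (first streamed token to completion).
function outputTokensPerSecond(t: StreamTimings): number | undefined {
  const windowMs = t.completedAt - t.firstStreamTokenAt;
  // Assumed guard: no rate is reported for an empty or invalid window.
  if (!Number.isFinite(windowMs) || windowMs <= 0) return undefined;
  return t.outputTokens / (windowMs / 1000);
}

// The removed metric, shown only for comparison: input (prefill) tokens
// are counted in the numerator, which inflates the result.
function totalTokensPerSecond(
  inputTokens: number,
  outputTokens: number,
  requestDurationMs: number,
): number | undefined {
  if (!Number.isFinite(requestDurationMs) || requestDurationMs <= 0) return undefined;
  return (inputTokens + outputTokens) / (requestDurationMs / 1000);
}

// The example from the description: 10k input + 200 output over 5 s.
const totalTps = totalTokensPerSecond(10_000, 200, 5_000);
const outputTps = outputTokensPerSecond({
  outputTokens: 200,
  firstStreamTokenAt: 0,
  completedAt: 5_000,
});
console.log(totalTps, outputTps);
```

For simplicity the sketch treats the output window as the full 5 s. In a real request the first token arrives after prefill, so the window is shorter and the output rate somewhat higher than 40 tok/s.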

Changes

  • stats.ts: Remove totalTokensPerSecond calculation and requestDurationForRateMs
  • messages.ts: Remove field from AssistantResponseStats type
  • content-normalize.ts: Remove deserialization of the field
  • message-renderer.ts: Remove "Total TPS" row from stats panel
  • Tests: Remove all totalTokensPerSecond assertions

Output TPS (outputTokens / output window duration) is the meaningful
throughput metric. Total TPS included input tokens in the numerator,
producing misleading numbers — removed from stats calculation, types,
UI rendering, and all tests.
@YishenTu YishenTu merged commit 2833091 into main Feb 24, 2026
1 check passed
@YishenTu YishenTu deleted the remove-total-tps branch February 24, 2026 13:28