
Commit

Explaining billed tokens and why they're different.
Trent Fowler authored and Trent Fowler committed Feb 5, 2025
1 parent bb86604 commit efdac31
Showing 1 changed file with 19 additions and 0 deletions.
19 changes: 19 additions & 0 deletions fern/pages/going-to-production/how-does-cohere-pricing-work.mdx
@@ -19,6 +19,25 @@ Our Rerank models are priced based on the quantity of searches, and our Embeddin

You can find up-to-date prices on our [dedicated pricing page](https://cohere.com/pricing).

### What's the Difference Between "billed" Tokens and Generic Tokens?

In certain workflows you'll see an output like this:

```json JSON
{
  "billed_units": {
    "input_tokens": 6772,
    "output_tokens": 248
  },
  "tokens": {
    "input_tokens": 7596,
    "output_tokens": 645
  }
}
```

It may not be obvious why the values under `billed_units` differ from those under `tokens`. As the name suggests, the _billed_ input and output tokens are the tokens that you're actually _billed_ for. These values can differ from the overall `"tokens"` values because there are situations in which Cohere adds tokens under the hood, and others in which a particular model has been trained to do so (i.e. when outputting special tokens). Since you don't have control over these tokens, _you are not charged for them_.
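As a minimal sketch of how the two sets of counts relate, the snippet below subtracts the billed counts from the overall counts in the sample output above. It only parses that JSON; the helper name `unbilled_tokens` is hypothetical and not part of any Cohere SDK, and it assumes the payload has exactly the shape shown.

```python
import json

# Sample usage payload copied from the example output above.
usage_json = """
{
  "billed_units": {"input_tokens": 6772, "output_tokens": 248},
  "tokens": {"input_tokens": 7596, "output_tokens": 645}
}
"""


def unbilled_tokens(usage: dict) -> dict:
    """Hypothetical helper: per direction, counted tokens minus billed tokens."""
    billed = usage["billed_units"]
    total = usage["tokens"]
    return {
        key: total[key] - billed[key]
        for key in ("input_tokens", "output_tokens")
    }


usage = json.loads(usage_json)
diff = unbilled_tokens(usage)
print(diff)  # {'input_tokens': 824, 'output_tokens': 397}
```

In this sample, 824 input tokens and 397 output tokens were counted but not billed.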

## Trial Usage and Production Usage

Cohere makes a distinction between "trial" and "production" usage of an API key.
