Capture number of tokens in a request and response when possible #373

jwmatthews · 2024-09-17T19:34:59Z

We've run into a few situations where it would benefit us if we had a better view of the number of tokens consumed in a request and response.

Let's augment the data we are capturing for tracing and add in any extra info we may get back from the LLM via 'response_metadata'.

Current understanding is that for some models, the response includes metadata that breaks out the number of tokens used in the request and the response.
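
To make the idea concrete, here is a minimal sketch of pulling token counts out of a response metadata dictionary when they are present. It assumes the metadata is a plain dict and that the counts live under a 'token_usage' key. Key names vary by provider, and the function name `extract_token_usage` is hypothetical, not something that exists in the codebase.

```python
from typing import Any, Optional


def extract_token_usage(
    response_metadata: Optional[dict[str, Any]],
) -> Optional[dict[str, int]]:
    """Return the integer token counters found in the metadata, or None.

    Assumption: counters sit under a 'token_usage' key. Providers that
    use a different key (or no key at all) will simply yield None here.
    """
    if not response_metadata:
        return None
    usage = response_metadata.get("token_usage")
    if not isinstance(usage, dict):
        return None
    # Keep only integer-valued entries so the result is safe to log or sum.
    counters = {key: value for key, value in usage.items() if isinstance(value, int)}
    return counters or None
```

Returning None rather than raising keeps tracing optional: models that send no usage data are logged without token counts instead of failing the call.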

jwmatthews · 2024-09-17T19:39:21Z

@devjpt23 has begun to work on this issue. I wasn't yet able to formally assign him to this issue.

It looks like I can only assign issues to people in the Konveyor org, so I formed a new 'Collaborators' team and invited @devjpt23 to it so that he can be assigned future issues.

jwmatthews · 2024-09-20T19:23:26Z

#375 adds the ability to log token request/response usage on successful calls for some models that send back a 'token_usage' in response metadata.

We would like to extend this capability beyond what #375 offers:

- Pre-compute an estimate of the tokens consumed by a prompt before sending it, and log that estimate.
- On a failure response, check whether it carries any response metadata on token usage.
- Explore other providers that do not report token usage with #375 (Add metadata to tracing). Amazon Bedrock is one provider that did not log token data with #375.
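
As a rough illustration of the pre-compute step, the sketch below estimates prompt tokens from character count and logs the result before the request is sent. It assumes roughly 4 characters per token, which is only a coarse rule of thumb for English text; a model-specific tokenizer would give better numbers. The logger name and function names are hypothetical.

```python
import logging
import math

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("token_trace")

# Assumption: about 4 characters per token on average. This is a coarse
# heuristic, not a real tokenizer, and will drift for code or non-English text.
CHARS_PER_TOKEN = 4


def estimate_prompt_tokens(prompt: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Return a rough upper-rounded token estimate for the prompt."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(prompt) / chars_per_token)


def log_estimated_tokens(prompt: str) -> int:
    """Log the estimate prior to sending the prompt, then return it."""
    estimate = estimate_prompt_tokens(prompt)
    logger.info("estimated prompt tokens (heuristic): %d", estimate)
    return estimate
```

Logging the estimate before the call means there is still a token figure in the trace when the request fails and no usage metadata comes back.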

JonahSussman assigned devjpt23 Sep 19, 2024
