Describe the solution you'd like
Many API-based AI services offer a built-in feature, a plugin, or a third-party tool for tracking API usage.
Since Ailoy supports calling AI APIs, it should also expose information about that usage:
- tokens sent / received (input / output)
- estimated API costs
- (anything else that is provided by API)
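As a rough illustration of the cost estimate, here is a minimal sketch that derives a cost from token counts. `Usage`, `Pricing`, and `estimate_cost` are hypothetical names, not part of Ailoy, and the prices are made-up placeholders; real prices vary per provider and model.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one API call (hypothetical structure, not Ailoy's API)."""
    input_tokens: int   # tokens sent to the API
    output_tokens: int  # tokens received from the API


@dataclass
class Pricing:
    """Prices in USD per one million tokens (placeholder values below)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(usage: Usage, pricing: Pricing) -> float:
    """Estimate the cost of one call from its token counts."""
    total = (
        usage.input_tokens * pricing.input_per_million
        + usage.output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


usage = Usage(input_tokens=1_200, output_tokens=300)
pricing = Pricing(input_per_million=2.0, output_per_million=8.0)  # made-up prices
print(f"estimated cost: ${estimate_cost(usage, pricing):.6f}")
```

Keeping the pricing table separate from the usage record would let users update prices without touching the tracking code.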
Other statistics can help people understand how they are using the model, whether it is a local model or an API model:
- latency
  - first token
  - whole response (last token)
- tok/s
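The timing metrics above could be collected by wrapping a streamed response, as in this minimal sketch. `StreamStats` and `measure_stream` are hypothetical names, and `fake_stream` only stands in for a real model stream; none of this is Ailoy's actual API. Here tok/s is computed over the whole response time, which is one possible definition.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class StreamStats:
    first_token_latency: Optional[float]  # seconds to first token; None if no tokens
    total_latency: float                  # seconds until the last token arrived
    token_count: int

    @property
    def tokens_per_second(self) -> float:
        """Tokens divided by total response time (0.0 if no time elapsed)."""
        if self.total_latency <= 0:
            return 0.0
        return self.token_count / self.total_latency


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a token stream and record latency and throughput."""
    start = clock()
    first_token_latency = None
    count = 0
    for _ in tokens:
        if first_token_latency is None:
            first_token_latency = clock() - start
        count += 1
    total_latency = clock() - start
    return StreamStats(first_token_latency, total_latency, count)


def fake_stream() -> Iterator[str]:
    """Stand-in for a streamed model response (assumption for the demo)."""
    for token in ["Hello", ",", " world", "!"]:
        time.sleep(0.01)
        yield token


stats = measure_stream(fake_stream())
print(f"first token: {stats.first_token_latency:.3f}s")
print(f"whole response: {stats.total_latency:.3f}s")
print(f"throughput: {stats.tokens_per_second:.1f} tok/s")
```

Because the wrapper only needs an iterable of tokens, the same measurement would apply to local and API models alike.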
(This is a little out of scope.)
For local models, reporting GPU compute/memory utilization, or the amount of memory being consumed, could also be helpful.
Additional context
Here are some code repositories of examples that provide this kind of information; they may help with the implementation.