It would be great if we could use batch mode in LLM providers such as Vertex AI, OpenAI, and Claude.
Working with LLMs at a production level often means processing large volumes of data continuously, which makes cost a major factor.
Both Vertex AI and OpenAI advertise roughly a 50% cost reduction for batch mode. Right now, we would have to switch our services over to each provider's native SDK to use it (see the sketch below).
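For context, this is roughly what the workaround looks like today with OpenAI's native Python SDK. This is a minimal sketch: the model name, prompts, and file path are illustrative, and Vertex AI's batch prediction API follows a similar upload-then-poll pattern.

```python
# Sketch of the current workaround: calling OpenAI's Batch API directly
# through the native SDK instead of going through this library.
import json
from openai import OpenAI

client = OpenAI()

# 1. Write the requests as JSONL, one chat-completion request per line.
#    The prompts and model name here are illustrative.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": prompt}],
        },
    }
    for i, prompt in enumerate(["Summarize doc A", "Summarize doc B"])
]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

# 2. Upload the file and create the batch job (completes within 24h,
#    billed at roughly half the synchronous price).
batch_file = client.files.create(
    file=open("batch_input.jsonl", "rb"), purpose="batch"
)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll for completion, then download the results file.
batch = client.batches.retrieve(batch.id)
if batch.status == "completed":
    results_jsonl = client.files.content(batch.output_file_id).text
```

Having this file-upload/poll/download lifecycle wrapped behind the library's existing provider abstraction would let us keep one integration and still get the batch pricing.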