[Inference Providers] provider="auto" #1390

Conversation
```diff
@@ -94,3 +112,27 @@ export async function getInferenceProviderMapping(
 	}
 	return null;
 }
+
+export async function resolveProvider(
```
This could be done in `getProviderHelper`, as we did for the Python client, but that would make the function async and we would have to update the snippets generation as well.
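For context, here is a minimal sketch of what a resolution helper could look like. The PR does add a `resolveProvider` function (see the diff above), but the signature and body below are guesses at its shape, not the PR's exact code:

```ts
// Hypothetical sketch, not the PR's exact implementation: resolve
// provider="auto" to a concrete provider by taking the first entry of the
// model's inference provider mapping, which the Hub returns sorted by the
// user's provider order.
export async function resolveProvider(
	provider: string | undefined,
	modelId: string,
	accessToken?: string
): Promise<string> {
	if (provider && provider !== "auto") {
		return provider; // an explicit provider needs no resolution
	}
	const resp = await fetch(
		`https://huggingface.co/api/models/${modelId}?expand=inferenceProviderMapping`,
		{ headers: accessToken ? { Authorization: `Bearer ${accessToken}` } : {} }
	);
	const mapping: Record<string, unknown> | null = await resp
		.json()
		.then((json) => json.inferenceProviderMapping)
		.catch(() => null);
	const first = mapping && Object.keys(mapping)[0];
	if (!first) {
		throw new Error(`No Inference Provider available for model ${modelId}.`);
	}
	return first;
}
```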
```ts
	}
	inferenceProviderMapping = await resp
		.json()
		.then((json) => json.inferenceProviderMapping)
		.catch(() => null);

	if (inferenceProviderMapping) {
		inferenceProviderMappingCache.set(modelId, inferenceProviderMapping);
```
If `provider="auto"`, we call `fetchInferenceProviderMappingForModel` twice: once to resolve the provider and a second time in `makeRequestOptions`. This cache avoids the extra HTTP call.
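As an illustration of the pattern being described, a minimal memoization of the mapping fetch could look like the following. The cache shape and helper name are assumptions, not the PR's exact implementation:

```ts
// Illustrative sketch of the caching pattern: memoize the provider mapping
// per model so the second lookup is served from memory instead of HTTP.
const inferenceProviderMappingCache = new Map<string, Record<string, unknown>>();

async function fetchMappingWithCache(
	modelId: string
): Promise<Record<string, unknown> | null> {
	const cached = inferenceProviderMappingCache.get(modelId);
	if (cached) {
		// Second lookup (e.g. from makeRequestOptions) hits the cache.
		return cached;
	}
	const resp = await fetch(
		`https://huggingface.co/api/models/${modelId}?expand=inferenceProviderMapping`
	);
	const mapping: Record<string, unknown> | null = await resp
		.json()
		.then((json) => json.inferenceProviderMapping)
		.catch(() => null);
	if (mapping) {
		inferenceProviderMappingCache.set(modelId, mapping);
	}
	return mapping;
}
```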
Apart from a question about the type name, lgtm!
Clean!
Very nice!

Are we sure about not keeping the existing name? It would keep the diff smaller, as @SBrandeis mentions too.
Discussed internally, let's keep it.
Same as huggingface/huggingface_hub#3011.

This PR adds support for auto-selection of the provider. Previously the default value was `hf-inference` (the HF Inference API provider); now we default to `"auto"`, meaning we select the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers.

You can test with:

```ts
import { chatCompletion } from "../src";

const res = await chatCompletion({
	// provider: "auto" is the default, so it can be omitted
	model: "deepseek-ai/DeepSeek-V3-0324",
	messages: [
		{
			role: "user",
			content: "What is the capital of France?",
		},
	],
	accessToken: process.env.HF_TOKEN,
});
console.log(res.choices[0].message.content);
```

```
Defaulting to 'auto' which will select the first provider available for the model, sorted by the user's order in https://hf.co/settings/inference-providers.
Auto-selected provider: sambanova
The capital of France is **Paris**. It is known for its iconic landmarks such as the Eiffel Tower...blabla
```

The selected provider should be the first one in the `inferenceProviderMapping` here: https://huggingface.co/api/models/deepseek-ai/DeepSeek-V3-0324?expand=inferenceProviderMapping
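To check which provider "auto" would pick for a model before making a request, you can query that mapping yourself. A plain `fetch` sketch, assuming the mapping is an object keyed by provider name as the description above suggests:

```ts
// Sketch: inspect a model's provider mapping to see which provider "auto"
// would select (the first key, sorted server-side by the user's order).
const modelId = "deepseek-ai/DeepSeek-V3-0324";
const resp = await fetch(
	`https://huggingface.co/api/models/${modelId}?expand=inferenceProviderMapping`,
	{ headers: { Authorization: `Bearer ${process.env.HF_TOKEN}` } }
);
const { inferenceProviderMapping } = await resp.json();
console.log("Available providers:", Object.keys(inferenceProviderMapping));
console.log("Auto-selected:", Object.keys(inferenceProviderMapping)[0]);
```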