
feat(providers/google): Add reasoning token output support #6261


Merged · 5 commits merged into vercel:main on May 11, 2025

Conversation

@Und3rf10w (Contributor) commented May 10, 2025

Background

Vertex now supports extraction of thinking tokens in certain Gemini models.

When the configuration was passed via providerOptions, the SDK:

  1. Did not extract reasoning tokens
  2. Did not pass include_thoughts to the provider

Summary

Added extraction logic to the google-generative-ai package to parse reasoning tokens.

Added an includeThoughts switch to the thinkingConfig for Vertex models.
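
For illustration, a minimal sketch of what this extraction might look like, assuming (as the docs change below notes) that the API marks reasoning parts with `thought: true`; the part shape and helper name here are hypothetical, not the actual package internals:

```ts
// Hypothetical sketch, not the actual @ai-sdk/google internals.
// Assumption: the Vertex response marks reasoning parts with `thought: true`.
interface ContentPart {
  text?: string;
  thought?: boolean;
}

// Split candidate content parts into reasoning text and final answer text.
function extractReasoning(parts: ContentPart[]): {
  reasoning: string;
  text: string;
} {
  const reasoning = parts
    .filter(part => part.thought === true)
    .map(part => part.text ?? '')
    .join('');

  const text = parts
    .filter(part => part.thought !== true)
    .map(part => part.text ?? '')
    .join('');

  return { reasoning, text };
}
```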

Verification

I verified it manually. It can be tested via examples/ai-core/src/stream-text/google-vertex-reasoning.ts, which is easily adaptable to the Google provider.

Tests have been added.
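
For reference, a minimal sketch of what that example roughly does (the exact file contents may differ; it assumes the Vertex provider accepts the same `google` provider-options key and `thinkingConfig` shape shown in the docs change below):

```ts
import { vertex } from '@ai-sdk/google-vertex';
import { streamText } from 'ai';

// Sketch of examples/ai-core/src/stream-text/google-vertex-reasoning.ts;
// the actual example may differ.
const result = streamText({
  model: vertex('gemini-2.5-flash-preview-04-17'),
  providerOptions: {
    google: {
      // Assumption: the Vertex provider reuses the `google` options key.
      thinkingConfig: {
        includeThoughts: true,
      },
    },
  },
  prompt: 'Explain quantum computing in simple terms.',
});

for await (const part of result.fullStream) {
  if (part.type === 'reasoning') {
    process.stdout.write(`THOUGHT: ${part.textDelta}\n`);
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.textDelta);
  }
}
```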

Tasks

  • Tests have been added / updated (for bug fixes / features) - c1264af
  • Examples have been updated
  • Documentation has been added / updated (for bug fixes / features)
  • A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • Formatting issues have been fixed (run pnpm prettier-fix in the project root)

Related Issues

Fixes #6259

chore(docs/providers): Update google provider documentation to add logic around reasoning token support
@Und3rf10w (Contributor, Author)

Added an example in 80eb2c3 to demonstrate:

[screenshot]

@Und3rf10w Und3rf10w marked this pull request as ready for review May 10, 2025 04:34
Comment on lines 246 to 308
### Reasoning (Thinking Tokens)

Certain Google Gemini models support emitting "thinking" tokens, which represent the model's reasoning process before generating the final response. The AI SDK exposes these as reasoning information.

To enable thinking tokens, set `includeThoughts: true` in the `thinkingConfig` provider option:

```ts
import { google } from '@ai-sdk/google';
import { GoogleGenerativeAIProviderOptions } from '@ai-sdk/google';
import { generateText, streamText } from 'ai';

// For generateText:
const { text, reasoning, reasoningDetails } = await generateText({
  model: google('gemini-2.5-flash-preview-04-17'), // Or other supported model
  providerOptions: {
    google: {
      thinkingConfig: {
        includeThoughts: true,
        // thinkingBudget: 2048, // Optional
      },
    } satisfies GoogleGenerativeAIProviderOptions,
  },
  prompt: 'Explain quantum computing in simple terms.',
});

console.log('Reasoning:', reasoning);
console.log('Reasoning Details:', reasoningDetails);
console.log('Final Text:', text);

// For streamText:
const result = streamText({
  model: google('gemini-2.5-flash-preview-04-17'), // Or other supported model
  providerOptions: {
    google: {
      thinkingConfig: {
        includeThoughts: true,
        // thinkingBudget: 2048, // Optional
      },
    } satisfies GoogleGenerativeAIProviderOptions,
  },
  prompt: 'Explain quantum computing in simple terms.',
});

for await (const part of result.fullStream) {
  if (part.type === 'reasoning') {
    process.stdout.write(`THOUGHT: ${part.textDelta}\n`);
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.textDelta);
  }
}
```

When `includeThoughts` is true, parts of the API response marked with `thought: true` will be processed as reasoning.

- In `generateText`, these contribute to the `reasoning` (string) and `reasoningDetails` (array) fields.
- In `streamText`, these are emitted as `reasoning` stream parts.

<Note>
Refer to the [Google Generative AI
documentation](https://ai.google.dev/gemini-api/docs/thinking) for a list of
models that support thinking tokens and for more details on `thinkingBudget`.
</Note>

@lgrammel (Collaborator)

@Und3rf10w have you tested this with the Gemini API? I couldn't find it in their docs.

@Und3rf10w (Contributor, Author)

@lgrammel, I have not directly, but I'm assuming it works for the Gemini API based on this Google-provided notebook: https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_started_thinking.ipynb

I HAVE successfully tested this with the Vertex API.

@Und3rf10w (Contributor, Author) commented May 10, 2025

@lgrammel,

I tried a variation of https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_started_thinking.ipynb with a valid API key, and it turns out the Gemini API doesn't yet OFFICIALLY support (read: document) includeThoughts.

Looking at the proto definitions for the Python client, include_thoughts isn't yet documented as supported: https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.GenerationConfig.ThinkingConfig

Compare this to the vertex api documentation, where includeThoughts IS supported: https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1/GenerationConfig#ThinkingConfig

However, trying this in the notebook, we can see that when we set a thinkingBudget and includeThoughts, the request IS valid, but it doesn't return thought candidates, despite using thinking tokens:

[screenshots]

So I suppose the Google Generative AI API WON'T provide the thoughts, but the request still works; it will likely be supported eventually. We should probably just remove the 15-google-generative-ai.mdx edits for now?


Here are more screenshots from playing around with the ThinkingConfig:

[screenshots]

TL;DR: While the include_thoughts parameter is accepted on the Google Generative AI platform, it doesn't currently return the thought tokens in the response. It does work as expected in Vertex AI. The likely way forward is to remove the edits to 15-google-generative-ai.mdx from this PR.

Maybe also:

  • Update changeset to be @ai-sdk/vertex instead of @ai-sdk/google
  • Throw a warning when includeThoughts is specified with the @ai-sdk/google provider for now, to be removed if/when it's officially supported?

@Und3rf10w (Contributor, Author)

Updated with 6586ef2.

  • Now adds a warning when includeThoughts is used with the Google provider (see the sketch below)
  • Added a test for the above
  • Removed the includeThoughts addition from the Google provider docs
  • Added a generateText example
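
A rough sketch of what that warning could look like; the names and warning shape here are assumptions, and the actual change in 6586ef2 may differ:

```ts
// Sketch only: mirrors the AI SDK pattern of collecting call warnings
// instead of silently dropping an unsupported setting.
type CallWarning =
  | { type: 'unsupported-setting'; setting: string; details?: string }
  | { type: 'other'; message: string };

function warnIfThoughtsUnsupported(options: {
  isVertex: boolean;
  thinkingConfig?: { includeThoughts?: boolean; thinkingBudget?: number };
}): CallWarning[] {
  const warnings: CallWarning[] = [];

  if (!options.isVertex && options.thinkingConfig?.includeThoughts === true) {
    warnings.push({
      type: 'unsupported-setting',
      setting: 'includeThoughts',
      details:
        'The Google Generative AI API does not currently return thoughts; ' +
        'includeThoughts only has an effect with Vertex AI.',
    });
  }

  return warnings;
}
```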

… thinking config

chore(providers/google): Remove `includeThoughts` from google provider docs

chore(providers/google): Add unit test for reasoning warning when google provider is used with `includeThoughts`
@Und3rf10w Und3rf10w requested a review from lgrammel May 10, 2025 20:04
@lgrammel lgrammel merged commit fe24216 into vercel:main May 11, 2025
7 of 8 checks passed
@nileshtrivedi

Thanks @Und3rf10w for implementing this. I'm surprised that the Gemini APIs do not return thought tokens, because https://aistudio.google.com/ does support this in the UI as well as in generated code, for all displayed programming languages:

[screenshot]

@Und3rf10w (Contributor, Author)

@nileshtrivedi, I agree. I'm sure it's supported but currently undocumented in the Gemini / AI Studio stack, and that it will likely end up working the same way it does on the Vertex API.

thinkingBudget already worked in both Vertex and AI Studio before this PR, but it's passing includeThoughts in the request that actually triggers Vertex to send thinking tokens, and I couldn't get that to work in the AI Studio/Gemini API.

@nileshtrivedi

I am seeing thinking tokens working only with 2.5-flash, not 2.5-pro (via Vertex). I get this error message:

[Error [AI_APICallError]: Unable to submit request because thinking is not configurable in this model; please remove the thinking_config setting and try it again. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini]

@Und3rf10w (Contributor, Author) commented May 18, 2025

I can confirm: I'm experiencing this now, but it did work previously.

The Vertex documentation still states support for the 2.5 Pro models, so maybe it's a new model version they deployed?

ap-inflection added a commit to inflectionxyz/ai that referenced this pull request May 22, 2025
Ports the changes from this PR: vercel#6261 to the v5 branch.

Adds support for Gemini's thinking messages and the includeThoughts flag in the google and vertex configs.
@vlrevolution

How do I disable thinking on the Vertex provider altogether for 'gemini-2.5-pro-preview'? I'm getting this error:

[Nest] 298  - 05/29/2025, 10:53:54 AM   ERROR [GenerateTextService] [GenerateText] Non-fatal error reported by AI SDK streamText.onError:
[Nest] 298  - 05/29/2025, 10:53:54 AM   ERROR [GenerateTextService] AI_APICallError: Unable to submit request because thinking is a default and constant feature of this model; To proceed, please remove the thinking_config.thinking_budget setting from your configuration and retry. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini

Oddly, for 'gemini-2.5-flash-preview' it seems to work to disable it by setting thinkingBudget to 0 (includeThoughts isn't related to fully disabling it, right? It only controls whether the model sends thinking tokens when thinking IS enabled, if my understanding is correct); see the sketch after this comment.

Did they make thinking mandatory on 'gemini-2.5-pro-preview'? Is anybody else having difficulties? Such a weird move if it is required, given they don't let us see the thinking. This is really holding me back from using 2.5-pro for anything on the API, as the thinking is a black box and is messing up a lot of prompt outputs because it gets confused by its own thinking lol...
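
For reference, a minimal sketch of the Flash workaround mentioned above; the providerOptions key and shape mirror this PR's example and are an assumption for the Vertex provider, and the 2.5 Pro preview behavior is exactly what's under discussion here:

```ts
import { vertex } from '@ai-sdk/google-vertex';
import { generateText } from 'ai';

// Sketch: disabling thinking on a 2.5 Flash model by setting thinkingBudget
// to 0. Assumption: the Vertex provider reads thinkingConfig from the
// `google` provider-options key. The 2.5 Pro preview reportedly rejects any
// thinking_config, which is the error discussed above.
const { text } = await generateText({
  model: vertex('gemini-2.5-flash-preview-04-17'),
  providerOptions: {
    google: {
      thinkingConfig: {
        thinkingBudget: 0, // disables thinking on flash models
      },
    },
  },
  prompt: 'Summarize the plot of Hamlet in two sentences.',
});

console.log(text);
```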

@vlrevolution

I've created an issue in Google's issue tracker; please comment if you are having issues too:
https://issuetracker.google.com/issues/420952680

@Und3rf10w (Contributor, Author) commented May 29, 2025

Did you try includeThoughts: false, though? Or not including a thinkingConfig at all for that model?

At one point with the 2.5 Pro experimental model you were able to configure thinking, but now that's limited to 2.5 Flash.

@vlrevolution

Both (or either) result in the same error that the thinking config is not allowed :(

@Adebesin-Cell

I'm using gemini-2.5-flash-preview-05-20, but the response includes reasoning/thoughts every time, even though I've clearly asked in the prompt not to include them. I also set thinkingBudget: 0 and includeThoughts: false, but it didn't help.

Successfully merging this pull request may close these issues.

Google-Vertex: Support include_thinking in reasoning configuration and extraction of model thoughts.
5 participants