feat(llm): use Gemini native generateContent/streamGenerateContent endpoints on Vertex by htimur · Pull Request #2023 · agentgateway/agentgateway

htimur · 2026-06-01T14:42:10Z

Fixes #1929.

Summary

This change adds native Gemini API support to the Vertex provider. Gemini models are now routed directly to :generateContent and :streamGenerateContent, with requests and responses translated between the OpenAI chat-completions format and Gemini's native API.

Nothing changes for Anthropic models on Vertex, and usage of the OpenAI-compatible endpoint remains as it was before.
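As a rough illustration of the routing described above, the sketch below picks the native operation from the model name and the streaming flag, then builds a request path. It covers only the completions route. The function names are hypothetical and not taken from the PR, and the path layout follows the public Vertex AI REST convention rather than anything confirmed by this change.

```rust
/// Native Gemini operation for a completions request, or `None` when the
/// model is not a Gemini model and the existing behaviour applies.
/// (Hypothetical helper; the prefix check is an assumption.)
fn native_operation(model: &str, stream: bool) -> Option<&'static str> {
    if !model.starts_with("gemini-") {
        return None;
    }
    Some(if stream {
        "streamGenerateContent"
    } else {
        "generateContent"
    })
}

/// Builds the upstream path for a native Gemini call.
/// The `/v1/projects/...` layout is assumed from the public Vertex AI REST API.
fn native_path(project: &str, location: &str, model: &str, stream: bool) -> Option<String> {
    let op = native_operation(model, stream)?;
    Some(format!(
        "/v1/projects/{}/locations/{}/publishers/google/models/{}:{}",
        project, location, model, op
    ))
}
```

With these helpers, `native_path("my-project", "us-central1", "gemini-2.5-flash", true)` yields a path ending in `:streamGenerateContent`, while a non-Gemini model returns `None` and would fall through to the existing handling.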

Worth mentioning / notes

Added a new CEL variable and access-log field, llm.upstreamFinishReason, which exposes the original Gemini finish reason before it is mapped to an OpenAI finish reason. It is implemented only for Vertex Gemini models; I believe it is important for observability and monitoring.
Image URLs: data: URLs are converted to inline data, and gs:// URLs to fileData with a resolved MIME type. http(s) URLs are rejected because Vertex cannot fetch remote images directly.
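The image URL handling above can be sketched as a small classifier. This is a minimal sketch, not the PR's implementation: the type and function names are hypothetical, only base64 data: URLs are handled, and resolving the MIME type of a gs:// object from its file extension is an assumption (the PR does not say how the MIME type is resolved).

```rust
/// Gemini-side image part (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum GeminiImagePart {
    /// Bytes carried inline, still base64-encoded.
    InlineData { mime_type: String, data: String },
    /// Reference to an object in Cloud Storage.
    FileData { mime_type: String, file_uri: String },
}

/// Assumption: the MIME type is guessed from the file extension.
fn mime_from_extension(uri: &str) -> Option<&'static str> {
    let ext = uri.rsplit('.').next()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "webp" => Some("image/webp"),
        "gif" => Some("image/gif"),
        _ => None,
    }
}

/// Converts an OpenAI-style image URL into a Gemini image part,
/// rejecting http(s) URLs as described in the notes.
fn convert_image_url(url: &str) -> Result<GeminiImagePart, String> {
    if let Some(rest) = url.strip_prefix("data:") {
        // Expected shape: data:<mime>;base64,<payload>
        let (meta, payload) = rest
            .split_once(',')
            .ok_or_else(|| "malformed data: URL".to_string())?;
        let mime = meta
            .strip_suffix(";base64")
            .ok_or_else(|| "only base64 data: URLs are handled in this sketch".to_string())?;
        if mime.is_empty() {
            return Err("data: URL is missing a MIME type".to_string());
        }
        Ok(GeminiImagePart::InlineData {
            mime_type: mime.to_string(),
            data: payload.to_string(),
        })
    } else if url.starts_with("gs://") {
        let mime = mime_from_extension(url)
            .ok_or_else(|| format!("cannot resolve MIME type for {}", url))?;
        Ok(GeminiImagePart::FileData {
            mime_type: mime.to_string(),
            file_uri: url.to_string(),
        })
    } else if url.starts_with("http://") || url.starts_with("https://") {
        Err("http(s) image URLs are rejected: Vertex cannot fetch remote images".to_string())
    } else {
        Err(format!("unsupported image URL scheme: {}", url))
    }
}
```

Returning an error for http(s) URLs, rather than silently dropping the image, keeps the failure visible to the caller.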

What changed

Routing: Vertex gemini-* models on the completions route now use :generateContent, or :streamGenerateContent when streaming. Gemini embedding models (for example, gemini-embedding-001) continue to use the embeddings endpoint, and Anthropic-on-Vertex behavior is unchanged.
Request translation: OpenAI chat-completions requests are translated to Gemini's native format. Messages are mapped to contents (including system instructions, tool calls, and tool responses), tools to functionDeclarations, tool_choice to functionCallingConfig, and sampling, structured output, and reasoning settings to generationConfig (including thinkingConfig). cachedContent, labels, and safetySettings are passed through unchanged.
Response translation: Gemini responses and streaming events are translated back into the OpenAI chat-completions format, including finish reasons, tool calls, reasoning content, and token usage.
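To show why keeping the original finish reason matters for response translation, here is a sketch of a finish-reason mapping. The mapping table is an assumption for illustration, not the table used in the PR, and the struct and function names are hypothetical. The Gemini values are taken from the public FinishReason enum.

```rust
/// Both views of how a generation ended (hypothetical type).
#[derive(Debug, PartialEq)]
struct FinishReasons {
    /// OpenAI-style finish_reason returned to the client.
    openai: &'static str,
    /// Original Gemini value, the kind of data llm.upstreamFinishReason exposes.
    upstream: String,
}

/// Maps a Gemini finish reason to an OpenAI one while keeping the original.
/// Assumption: this particular mapping is illustrative only.
fn map_finish_reason(gemini: &str, has_tool_calls: bool) -> FinishReasons {
    let openai = match gemini {
        // Gemini reports STOP even when the model emitted function calls.
        "STOP" if has_tool_calls => "tool_calls",
        "STOP" => "stop",
        "MAX_TOKENS" => "length",
        // Several distinct upstream reasons collapse into one OpenAI value.
        "SAFETY" | "RECITATION" | "BLOCKLIST" | "PROHIBITED_CONTENT" | "SPII" => "content_filter",
        // Fallback for anything this sketch does not model.
        _ => "stop",
    };
    FinishReasons {
        openai,
        upstream: gemini.to_string(),
    }
}
```

In this sketch, SAFETY and RECITATION both surface to the client as `content_filter`, so only the preserved upstream value distinguishes them in logs or CEL expressions.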

What was tested

Manual tests and evals were run against Vertex Gemini models on a locally built instance of the gateway, alongside the test suite and Google ADK-based tests. No behaviour changes were detected.

Use of AI assistance

The native Gemini wire types and parts of the test suite were developed with the help of an LLM. I have reviewed the generated code, verified my understanding of its behaviour and taken ownership of the implementation. I reviewed the tests for correctness and coverage, and validated the change to confirm that it behaves as expected.

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

htimur requested a review from a team as a code owner June 1, 2026 14:42

htimur added 7 commits June 1, 2026 16:48

feat(llm): route gemini-* on Vertex to native generateContent

f1f342a

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

feat(llm): stream gemini-* on Vertex via native streamGenerateContent

cb46ebb

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

refactor(llm): resolve embedding title once, drop redundant mut

c02883c

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

refactor(llm): build native Gemini completions response from typed st…

563c172

…ructs Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

feat(llm): add llm.upstreamFinishReason CEL and log field

eafd3a9

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

test(llm): assert native Gemini CEL usage fields and multi-turn tool-…

5bf3038

…loop roles Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

fix(llm): harden native Gemini streaming and translation

55e4337

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

htimur force-pushed the feat/vertex-native-gemini branch from cbf76a7 to 55e4337 Compare June 1, 2026 14:48

htimur marked this pull request as draft June 1, 2026 15:20

fix(llm): normalize JSON Schema to Gemini's responseSchema subset

2eb0bd5

Signed-off-by: Timur Khamrakulov <timur.khamrakulov@gmail.com>

htimur marked this pull request as ready for review June 2, 2026 13:46

htimur wants to merge 8 commits into agentgateway:main from htimur:feat/vertex-native-gemini
