
fix(llm/messages): normalize Claude Code in-conversation system messages#2015

Open
yalindogusahin wants to merge 1 commit into agentgateway:main from yalindogusahin:fix/anthropic-system-in-messages


@yalindogusahin

@yalindogusahin yalindogusahin commented May 30, 2026

Summary

Claude Code 2.1.157+ injects in-conversation system reminders inside the Anthropic messages array (messages[*].role == "system"), which is not part of the Anthropic Messages API spec — system is a top-level field, and messages[].role is limited to user/assistant. Strict spec implementations like vLLM's Anthropic endpoint reject the payload before request conversion (see vllm-project/vllm#44048), and the same payload also fails agentgateway's typed Role enum during messages → completions / bedrock translation.

This PR normalizes the request at the proxy entry point so every downstream conversion path sees a canonical Anthropic shape:

  • drain messages[*].role == "system" entries from messages
  • append their text content to the top-level system field, preserving any existing top-level system content first
  • keep unknown roles rejected by the existing strict typed conversion
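The three steps above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Message` and `Request` here are hypothetical stand-ins with plain `String` content, and joining the extracted text with a blank line is an assumption made for the sketch.

```rust
/// Hypothetical, simplified stand-in for an Anthropic-style message.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Hypothetical, simplified stand-in for the request body.
#[derive(Debug, Default, PartialEq)]
pub struct Request {
    pub system: Option<String>,
    pub messages: Vec<Message>,
}

impl Request {
    /// Move every `role == "system"` entry out of `messages` and append its
    /// text to the top-level `system` field, after any existing content.
    pub fn normalize_system_messages(&mut self) {
        let mut extracted: Vec<String> = Vec::new();

        // `retain` visits messages in order, so the extracted text keeps the
        // relative order in which the system entries appeared.
        self.messages.retain(|m| {
            if m.role == "system" {
                extracted.push(m.content.clone());
                false // drop from `messages`
            } else {
                true // user, assistant, and unknown roles stay untouched
            }
        });

        if extracted.is_empty() {
            return; // no-op path: nothing to merge
        }

        // Existing top-level system content comes first.
        let mut parts: Vec<String> = self.system.take().into_iter().collect();
        parts.extend(extracted);
        // Assumption for this sketch: join with a blank line.
        self.system = Some(parts.join("\n\n"));
    }
}
```

Unknown roles are deliberately left in `messages`, so a later strict conversion can still reject them.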

Changes

  • crates/agentgateway/src/llm/types/messages.rs: add Request::normalize_system_messages() method handling both string and array ContentBlock forms; merges extracted content into a TextBlock::Array after existing top-level system. 6 new unit tests cover extraction, preservation order, array content blocks, no-op paths, and a regression test that exercises the actual completions::from_messages::translate call with a Claude Code 2.1.157+ shaped payload.
  • crates/agentgateway/src/llm/mod.rs: call normalize_system_messages() inside LLM::process_messages_request after the request body is parsed and before process_request dispatch. Single chokepoint covers OpenAI, Bedrock, Vertex, and Anthropic-native passthrough.
  • crates/agentgateway/src/llm/conversion/completions.rs, crates/agentgateway/src/llm/conversion/bedrock.rs, and RequestType::to_anthropic in messages.rs: defensive normalize_system_messages() calls in the direct library entry points so callers that bypass the proxy entry point (tests, embedders) get the same compatibility.
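To illustrate the string and array handling described in the first bullet, here is a minimal sketch of the merge step. The `Content`, `Block`, and `System` enums are hypothetical simplifications, not the actual types in messages.rs, and dropping non-text blocks is an assumption made only for this example.

```rust
/// Hypothetical: message content is either a bare string or a list of blocks.
#[derive(Debug, Clone, PartialEq)]
pub enum Content {
    Text(String),
    Blocks(Vec<Block>),
}

/// Hypothetical content block; `Other` stands in for non-text blocks.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    Other,
}

/// Hypothetical top-level system field: a bare string or an array of texts.
#[derive(Debug, Clone, PartialEq)]
pub enum System {
    Text(String),
    Array(Vec<String>),
}

/// Collect the text carried by one message's content, in order.
fn text_parts(content: &Content) -> Vec<String> {
    match content {
        Content::Text(s) => vec![s.clone()],
        Content::Blocks(blocks) => blocks
            .iter()
            .filter_map(|b| match b {
                Block::Text(s) => Some(s.clone()),
                // Assumption for this sketch: non-text blocks are skipped.
                Block::Other => None,
            })
            .collect(),
    }
}

/// Append extracted system text after any existing top-level system content.
/// The result is always the array form when something was merged.
pub fn merge_system(existing: Option<System>, extracted: &[Content]) -> Option<System> {
    let new_parts: Vec<String> = extracted.iter().flat_map(text_parts).collect();
    if new_parts.is_empty() {
        return existing; // no-op path: leave the field exactly as it was
    }
    let mut parts = match existing {
        Some(System::Text(s)) => vec![s],
        Some(System::Array(v)) => v,
        None => Vec::new(),
    };
    parts.extend(new_parts);
    Some(System::Array(parts))
}
```

Returning `existing` unchanged on the no-op path means a request without in-conversation system entries passes through unmodified.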

Trade-off (out of scope here)

The merge loses positional information: every in-conversation system block is appended after existing top-level system content. Preserving position would require provider-aware translation — OpenAI chat completions natively supports multiple system messages mid-conversation, Bedrock could use system blocks, Anthropic native cannot. A future opt-in flag (e.g. preserve_in_conversation_system) could expose this for backends that benefit from it. For now, normalizing into the standard Anthropic system field matches the same approach vLLM took in vllm-project/vllm#44048 and gives the least surprising baseline behavior.
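For discussion only, a provider-aware opt-in might look something like the sketch below. Nothing here exists in this PR: `Backend`, `Placement`, and `placement` are hypothetical names, the `preserve` parameter stands in for the possible `preserve_in_conversation_system` flag, and the per-backend choices simply restate the description above.

```rust
/// Hypothetical backend kinds, named after the providers discussed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Backend {
    OpenAi,
    Bedrock,
    AnthropicNative,
}

/// Hypothetical outcome: where in-conversation system content ends up.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Placement {
    /// Keep the entry at its original position in the conversation.
    InPlace,
    /// Merge into the top-level system field (the behavior of this PR).
    TopLevelSystem,
}

/// Decide placement for a backend. `preserve` models a possible opt-in flag;
/// with it off, every backend gets the merged baseline.
pub fn placement(backend: Backend, preserve: bool) -> Placement {
    if !preserve {
        return Placement::TopLevelSystem;
    }
    match backend {
        // Per the description: chat completions accepts mid-conversation
        // system messages natively.
        Backend::OpenAi => Placement::InPlace,
        // Per the description: Bedrock could use system blocks, and
        // Anthropic native is treated as unable to preserve position.
        Backend::Bedrock | Backend::AnthropicNative => Placement::TopLevelSystem,
    }
}
```

With the flag off, the decision is the same for every backend, which is the baseline this PR implements.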

Test Plan

  • cargo build -p agentgateway (passes locally)
  • cargo test -p agentgateway llm::types::messages::normalize_system_messages_tests (6/6 passed locally)
  • CI on this PR

@yalindogusahin yalindogusahin requested a review from a team as a code owner May 30, 2026 23:21
…ges into top-level system

Signed-off-by: Yalın Şahin <yalinsahin1@gmail.com>
@yalindogusahin yalindogusahin force-pushed the fix/anthropic-system-in-messages branch from 2a8c517 to 53e2538 Compare May 30, 2026 23:33
@Syraxius

Yes, actually I tried two things before deciding to merge into system (sacrificing the positional information):

  1. Inject the system messages wherever they were
  2. Convert the system messages into user messages

Both approaches completely messed up the model's behavior (I was testing with the Qwen series), so I decided to go with the merging.

I've inspected Claude's midway system messages - they're mainly to inject skills information, which is completely fine to be in system for now.

@yalindogusahin
Author

Thanks @Syraxius. Anything you'd change in this PR? Naming, normalization placement?

@howardjohn
Collaborator

I am not 100% convinced this is the right path. The Messages API is whatever api.anthropic.com/v1/messages accepts. If the Anthropic docs are wrong, that's a bummer, but the reality is Anthropic owns the Messages API and can do whatever they want.

This seems more like something for vLLM to fix, as they are no longer implementing the Anthropic Messages API but "something almost like the Anthropic Messages API".

I could be way off on this; I didn't get a chance to research it much.

@yalindogusahin
Author

I ran Claude Code 2.1.158 through a logging proxy in front of api.anthropic.com to see what the real API does. The live /v1/messages requests send anthropic-beta: mid-conversation-system-2026-04-07, and the API accepts role: "system" entries inside the messages array under that beta. So the shape isn't Claude Code going off-spec, it's something Anthropic supports behind a beta flag.

That said, I'm honestly not 100% sure whether vLLM or agentgateway is the right place to handle it either. The case for vLLM accepting it is real, since api.anthropic.com does. The other side is that agentgateway's own typed conversion (messages -> openai) also trips on this before any backend is involved. Curious what you think with that context.
