
feat(messages): add native Anthropic Messages API (/v1/messages)#5386

Draft
cdoern wants to merge 6 commits into llamastack:main from cdoern:messages-api

Conversation


@cdoern cdoern commented Mar 30, 2026

Summary

  • Adds a native /v1/messages endpoint implementing the Anthropic Messages API, enabling llama-stack to serve as a drop-in backend for Claude Code, Codex CLI, and other Anthropic-protocol clients
  • Follows the same architecture as the Responses API: a single inline::builtin provider (BuiltinMessagesImpl) that depends on Api.inference and works with all inference backends automatically
  • For providers that natively support /v1/messages (e.g. Ollama), requests are forwarded directly without translation, preserving full fidelity (thinking blocks, native streaming, etc.)
  • For all other providers, translates Anthropic Messages format to/from OpenAI Chat Completions format transparently
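The dual-path design described above can be sketched in miniature. This is an illustrative toy, not the PR's actual code: the `supports_native_messages` flag and handler names are assumptions chosen to show the control flow.

```python
# Toy sketch of the dual-path design: backends that natively speak
# /v1/messages get the request forwarded verbatim; everything else goes
# through Anthropic <-> OpenAI translation. Names here are illustrative.

def handle_messages_request(request: dict, provider) -> dict:
    if getattr(provider, "supports_native_messages", False):
        # e.g. Ollama: forward the Anthropic-format body untouched so
        # thinking blocks and native streaming survive intact.
        return provider.forward_native(request)
    # Fallback path: translate to OpenAI Chat Completions, call the
    # inference API, then translate the response back.
    openai_req = {"model": request["model"], "messages": request["messages"]}
    openai_resp = provider.chat_completion(openai_req)
    return {
        "type": "message",
        "role": "assistant",
        "content": [
            {"type": "text",
             "text": openai_resp["choices"][0]["message"]["content"]}
        ],
        "stop_reason": "end_turn",
    }


class FakeTranslatedProvider:
    """Stand-in for a backend without native /v1/messages support."""
    supports_native_messages = False

    def chat_completion(self, req):
        return {"choices": [{"message": {"content": "hi"}}]}
```

The point of the real passthrough branch is fidelity: translation necessarily drops Anthropic-only fields (such as thinking blocks) that OpenAI Chat Completions has no equivalent for.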

What's included

  1. API layer (src/llama_stack_api/messages/): Protocol, Pydantic models for all Anthropic types (content blocks, streaming events, tool use, thinking), FastAPI routes with Anthropic-specific named SSE events
  2. Provider implementation (src/llama_stack/providers/inline/messages/): Translation layer (request/response/streaming) + native passthrough for Ollama
  3. Distribution configs: Enabled in starter and ci-tests distributions
  4. Tests: 17 unit tests covering request translation, response translation, and streaming translation
  5. Generated artifacts: OpenAPI specs, provider docs, Stainless SDK config
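One detail worth illustrating from item 1 is the "Anthropic-specific named SSE events": Anthropic streams use an `event:` name line before each `data:` payload, unlike OpenAI's bare `data:` chunks. The event names below are the documented Anthropic stream events; the formatting helper itself is a sketch, not the PR's code.

```python
# Sketch of Anthropic-style named SSE framing. Each event carries an
# `event:` line naming the event type, followed by a JSON `data:` payload.
import json


def format_sse_event(event_type: str, payload: dict) -> str:
    body = dict(payload, type=event_type)
    return f"event: {event_type}\ndata: {json.dumps(body)}\n\n"


# A minimal event sequence for a single streamed text response:
events = [
    format_sse_event("message_start",
                     {"message": {"role": "assistant", "content": []}}),
    format_sse_event("content_block_start",
                     {"index": 0, "content_block": {"type": "text", "text": ""}}),
    format_sse_event("content_block_delta",
                     {"index": 0, "delta": {"type": "text_delta", "text": "Hello"}}),
    format_sse_event("content_block_stop", {"index": 0}),
    format_sse_event("message_delta", {"delta": {"stop_reason": "end_turn"}}),
    format_sse_event("message_stop", {}),
]
```

This start/delta/stop bracketing per content block is what the "Streaming content blocks → Streaming deltas" row of the translation map below has to synthesize from flat OpenAI deltas.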

Translation map

| Anthropic | OpenAI | Notes |
| --- | --- | --- |
| `system` (top-level) | `messages[0]` with `role=system` | Moved to first message |
| Content blocks (text, image) | String or content parts | Restructured |
| `tool_use` block | `tool_calls` on assistant message | Different structure |
| `tool_result` block | `role: "tool"` message | Different message type |
| `tool_choice: "any"` | `tool_choice: "required"` | Renamed |
| `stop_sequences` | `stop` | Renamed |
| `stop_reason: "end_turn"` | `finish_reason: "stop"` | Mapped |
| `stop_reason: "tool_use"` | `finish_reason: "tool_calls"` | Mapped |
| Streaming content blocks | Streaming deltas | Full event sequence |
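The request side of the table above can be sketched as a small translation function. Field names follow the public Anthropic and OpenAI wire formats, but this is an illustration of the mapping, not the PR's implementation (the `"max_tokens" -> "length"` stop-reason row is an assumption, not from the table).

```python
# Illustrative translation following the table above (sketch, not the PR's code).

# Response direction: Anthropic stop_reason <- OpenAI finish_reason is the
# inverse of this map.
STOP_REASON_TO_FINISH_REASON = {
    "end_turn": "stop",
    "tool_use": "tool_calls",
    "max_tokens": "length",  # assumption: OpenAI's name for the token limit
}


def anthropic_to_openai_request(req: dict) -> dict:
    """Translate an Anthropic Messages request body to OpenAI Chat Completions form."""
    messages = []
    # Top-level `system` moves to the first message with role=system.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    messages.extend(req["messages"])
    out = {
        "model": req["model"],
        "messages": messages,
        "max_tokens": req["max_tokens"],
    }
    # `stop_sequences` is renamed to `stop`.
    if "stop_sequences" in req:
        out["stop"] = req["stop_sequences"]
    # `tool_choice: {"type": "any"}` maps to OpenAI's "required".
    tool_choice = req.get("tool_choice")
    if tool_choice and tool_choice.get("type") == "any":
        out["tool_choice"] = "required"
    return out
```

The content-block and tool-call restructurings are where most of the real translation complexity lives, since they change message shape rather than just renaming fields.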

Test plan

  • uv run pytest tests/unit/providers/inline/messages/ -x --tb=short -v (17/17 passing)
  • uv run pre-commit run mypy --all-files (passes)
  • Manual end-to-end test: non-streaming via Ollama (translation path)
  • Manual end-to-end test: non-streaming via Ollama (native passthrough)
  • Manual end-to-end test: streaming via Ollama (native passthrough, including thinking blocks)
  • Integration tests with recording/replay

Generated with Claude Code

@meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) Mar 30, 2026

github-actions bot commented Mar 30, 2026

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

feat(messages): add native Anthropic Messages API (/v1/messages)

Edit this comment to update it. It will appear in the SDK's changelogs.

llama-stack-client-openapi studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️

New diagnostics (2 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages/count_tokens`
⚠️ llama-stack-client-go studio · conflict

Your SDK build had at least one new warning diagnostic, which is a regression from the base state.

New diagnostics (2 warning)
⚠️ Endpoint/NotConfigured: `post /v1/messages` exists in the OpenAPI spec, but isn't specified in the Stainless config, so code will not be generated for it.
⚠️ Endpoint/NotConfigured: `post /v1/messages/count_tokens` exists in the OpenAPI spec, but isn't specified in the Stainless config, so code will not be generated for it.
llama-stack-client-python studio · conflict

Your SDK build had at least one new note diagnostic, which is a regression from the base state.

New diagnostics (2 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages/count_tokens`
llama-stack-client-node studio · conflict

Your SDK build had at least one new note diagnostic, which is a regression from the base state.

New diagnostics (2 note)
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages`
💡 Endpoint/NotConfigured: Skipped endpoint because it's not in your Stainless config: `post /v1/messages/count_tokens`

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-04-01 15:46:43 UTC


cdoern commented Mar 31, 2026

going to add integration tests here too since ollama is compatible


mergify bot commented Apr 1, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @cdoern please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Apr 1, 2026
@cdoern cdoern added this to the 1.0.0 milestone Apr 1, 2026
@mergify mergify bot removed the needs-rebase label Apr 1, 2026
cdoern and others added 6 commits April 1, 2026 11:44
Add the API layer for the Anthropic Messages API (/v1/messages). This
includes the Messages protocol definition, Pydantic models for all
Anthropic request/response types (content blocks, streaming events,
tool use, thinking), and FastAPI routes with Anthropic-specific SSE
streaming format. Also registers the "messages" logging category and
adds Api.messages to the Api enum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
…ive passthrough

Add the single BuiltinMessagesImpl provider that translates Anthropic
Messages format to/from OpenAI Chat Completions, delegating to the
inference API. For providers that natively support /v1/messages (e.g.
Ollama), requests are forwarded directly without translation. Also
registers the provider in the registry, wires the router in the server,
and adds Messages to the protocol map in the resolver.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
…ions

Add the messages provider (inline::builtin) to the starter distribution
template and regenerate configs for starter and ci-tests distributions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Add 17 unit tests covering request translation, response translation,
and streaming translation. Regenerate OpenAPI specs, provider docs, and
Stainless SDK config to include the new /v1/messages endpoints.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Add a new messages integration test suite that exercises the Anthropic
Messages API (/v1/messages) end-to-end through the server. The suite
includes 13 tests covering non-streaming, streaming, system prompts,
multi-turn conversations, tool definitions, tool use round trips,
content block arrays, error handling, and response headers.

To enable replay mode (no live backend required), extend the api_recorder
to patch httpx.AsyncClient.post and httpx.AsyncClient.stream. This
captures the native Ollama passthrough requests the Messages provider
makes via raw httpx, following the same pattern used for aiohttp rerank
recording. Recordings are stored in tests/integration/messages/recordings/.

Also fix pre-commit violations: structured logging in impl.py, unused
loop variable, and remove redundant @pytest.mark.asyncio decorators
from unit tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>
…pruned

The cleanup_recordings.py script uses ci_matrix.json to determine which
test suites are active. Without the messages suite listed, the script
considers all messages recordings unused and deletes them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Doern <cdoern@redhat.com>