
feat: add forward_headers passthrough to remote::model-context-protocol #5257

Open
skamenan7 wants to merge 1 commit into llamastack:main from skamenan7:feat/5152-mcp-tool-passthrough

@skamenan7
Contributor

@skamenan7 skamenan7 commented Mar 23, 2026

Adds forward_headers and extra_blocked_headers to MCPProviderConfig, wiring per-request header forwarding into list_runtime_tools and invoke_tool. This lets deployers map keys from X-LlamaStack-Provider-Data to outbound HTTP headers so request-scoped auth tokens (MaaS API keys, tenant IDs, etc.) reach the downstream MCP server without the caller passing them via authorization= on every tool call.

Follows the same forward_headers pattern introduced for inference and safety passthrough in #5134. Authorization-mapped values are split out and passed via the authorization= param — prepare_mcp_headers() rejects Authorization in the headers dict directly, so it flows through the dedicated param instead.

What changed

  • MCPProviderConfig: added forward_headers: dict[str, str] | None and extra_blocked_headers: list[str] with config-time validation via validate_forward_headers_config() from providers/utils/forward_headers.py
  • MCPProviderDataValidator: added model_config = ConfigDict(extra="allow") so deployer-defined keys survive Pydantic parsing (key names are operator-configured at deploy time and can't be declared as typed fields)
  • ModelContextProtocolImpl: new _get_forwarded_headers_and_auth() reads the allowlist from provider data, splits Authorization for the authorization= param, returns non-auth headers separately. Both list_runtime_tools and invoke_tool merge forwarded headers with the legacy mcp_headers URI-keyed path (kept for backward compat). Explicit authorization= from the caller wins over forwarded auth.
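
The splitting behaviour described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names `split_forwarded_headers` and `resolve_authorization` are hypothetical stand-ins for `_get_forwarded_headers_and_auth()`, and provider data is assumed to arrive as a plain dict.

```python
from __future__ import annotations

from typing import Mapping


def split_forwarded_headers(
    provider_data: Mapping[str, object],
    forward_headers: Mapping[str, str] | None,
) -> tuple[dict[str, str], str | None]:
    """Map allowlisted provider-data keys to header names (hypothetical helper).

    Returns (non_auth_headers, authorization_value). Keys that are not in the
    allowlist are never read, so they cannot leak downstream (default-deny).
    """
    headers: dict[str, str] = {}
    authorization: str | None = None
    for key, header_name in (forward_headers or {}).items():
        value = provider_data.get(key)
        if value is None:
            continue  # missing key: skip quietly instead of failing the request
        if header_name.lower() == "authorization":
            # Authorization travels through the dedicated authorization= param,
            # not through the headers dict.
            authorization = str(value)
        else:
            headers[header_name] = str(value)
    return headers, authorization


def resolve_authorization(explicit: str | None, forwarded: str | None) -> str | None:
    """An explicit authorization= from the caller wins over the forwarded value."""
    return explicit if explicit is not None else forwarded
```

With the config example from this description, provider data of `{"maas_api_token": "tok", "tenant_id": "t1", "secret_internal": "x"}` would yield `{"X-Tenant-ID": "t1"}` as headers and `"tok"` as the authorization value; `secret_internal` is dropped because it is not in the allowlist.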

Config example:

providers:
  tool_runtime:
  - provider_type: remote::model-context-protocol
    config:
      forward_headers:
        maas_api_token: Authorization   # bare token, "Bearer " prepended by prepare_mcp_headers
        tenant_id: X-Tenant-ID
        team_id: X-Team-ID

Test plan

Unit/integration tests — tests covering config validation, header forwarding, auth splitting, default-deny enforcement, missing-key soft-skip, and wiring through list_runtime_tools and invoke_tool:

uv run pytest tests/integration/tool_runtime/test_passthrough_mcp.py -v

Local testing with mock MCP server — ran a mock MCP server that captures all inbound headers and exposes them at /debug/last-headers. Started llama-stack with the forward_headers config above, then verified via the agent API that:

  • maas_api_token from X-LlamaStack-Provider-Data arrived as Authorization: Bearer <token> on the downstream server
  • tenant_id arrived as X-Tenant-ID
  • Unlisted keys (secret_internal) did not appear in any downstream header (default-deny confirmed)
  • Requests with missing or no provider data worked without crashing
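
For reference, a header-capturing stub along the lines of the mock described above could look like the sketch below. It is an assumption-laden illustration, not the server used for this PR: it only records inbound headers and serves them at `/debug/last-headers`, it does not speak the MCP protocol, and the names `HeaderCaptureHandler` and `start_capture_server` are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HeaderCaptureHandler(BaseHTTPRequestHandler):
    """Records the headers of the most recent POST (hypothetical test stub)."""

    last_headers = {}  # shared across requests on purpose

    def _send_json(self, payload, status=200):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/debug/last-headers":
            self._send_json(type(self).last_headers)
        else:
            self._send_json({"error": "not found"}, status=404)

    def do_POST(self):
        # Drain the request body, then remember what headers came with it.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        type(self).last_headers = dict(self.headers.items())
        self._send_json({})

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_capture_server(port=0):
    """Start the stub on 127.0.0.1 in a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), HeaderCaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_capture_server()`, the bound port is available as `server.server_address[1]`. Header names should be compared case-insensitively when inspecting the captured dict, since clients may change their capitalization.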

Note: invoke_tool and list_runtime_tools are internal methods called by the agents layer and not exposed as HTTP endpoints (changed in upstream PRs #4997 and #5246), so full e2e requires a model server via the agent API.

Checklist

  • forward_headers / extra_blocked_headers optional with backward-compatible defaults — configs without them parse correctly
  • Default-deny: unlisted keys are silently dropped, nothing escapes the allowlist
  • Three-state config tested: None, empty dict, populated dict
  • MCPProviderDataValidator uses extra="allow" (intentional — deployer key names can't be pre-declared)
  • MCPProviderConfig uses extra="forbid"
  • Reuses shared build_forwarded_headers() and validate_forward_headers_config() from providers/utils/forward_headers.py — no duplication
  • Existing mcp_headers URI-keyed path preserved for backward compat
  • No breaking changes

Summary by Sourcery

Add configurable per-request header forwarding support to remote passthrough providers, including MCP tool runtime, inference, and safety, using a shared forward_headers utility with stricter validation and default-deny behavior.

New Features:

  • Allow MCP remote tool runtime to forward selected request-scoped headers and auth tokens from provider data to downstream MCP servers via forward_headers configuration.
  • Enable inference and safety passthrough providers to forward whitelisted headers from X-LlamaStack-Provider-Data using a common forward_headers policy and extra_blocked_headers for operator overrides.

Enhancements:

  • Refine passthrough inference auth handling so the OpenAI client relies solely on composed request headers, avoiding sentinel API keys and ensuring static credentials override forwarded Authorization values.
  • Harden header forwarding with centralized validation, case-insensitive blocking of security-sensitive and operator-defined headers, value sanitization, and size limits.
  • Relax provider-data validators for passthrough and MCP so deployer-defined keys are preserved while keeping provider configs strict via extra='forbid'.
  • Document forward_headers and extra_blocked_headers options for MCP tool runtime, inference passthrough, and safety passthrough providers.
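
As a rough illustration of the hardening bullet above (case-insensitive blocking, value sanitization, size limits), a sketch might look like this. The blocked set, the byte limit, and the function names are all assumptions made for the example; the actual values live in providers/utils/forward_headers.py and are not reproduced in this summary.

```python
from __future__ import annotations

# Assumed defaults for illustration only; not the PR's real list or limit.
DEFAULT_BLOCKED_HEADERS = frozenset({"host", "content-length", "transfer-encoding"})
MAX_HEADER_VALUE_BYTES = 8192


def is_blocked(header_name: str, extra_blocked_headers: list[str] | None = None) -> bool:
    """Case-insensitive check against built-in and operator-defined blocked names."""
    blocked = DEFAULT_BLOCKED_HEADERS | {h.lower() for h in (extra_blocked_headers or [])}
    return header_name.lower() in blocked


def sanitize_header_value(value: object) -> str | None:
    """Return a safe header value, or None if the value should be dropped."""
    text = str(value)
    if "\r" in text or "\n" in text:
        return None  # CR/LF could be used to inject extra headers
    if len(text.encode("utf-8")) > MAX_HEADER_VALUE_BYTES:
        return None  # oversized values are rejected rather than truncated
    return text.strip()
```

Rejecting a bad value instead of trying to repair it keeps the behaviour predictable: a header is either forwarded exactly as supplied (minus surrounding whitespace) or not forwarded at all.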

Tests:

  • Add comprehensive unit tests for the shared forward_headers utilities, passthrough inference adapter behavior, safety passthrough config and headers, and MCP provider config and wiring through list_runtime_tools and invoke_tool.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 23, 2026
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from bfbfeb5 to a07a34e Compare March 24, 2026 11:33
@skamenan7 skamenan7 changed the title add forward_headers passthrough to remote::model-context-protocol feat: add forward_headers passthrough to remote::model-context-protocol Mar 24, 2026
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch 2 times, most recently from 9410037 to c430e0d Compare March 24, 2026 19:14
@skamenan7 skamenan7 marked this pull request as ready for review March 24, 2026 19:55
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch 16 times, most recently from 8a88160 to f349f68 Compare March 26, 2026 16:16
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch 15 times, most recently from e28014e to 28184fe Compare April 1, 2026 09:46
@mattf
Collaborator

mattf commented Apr 1, 2026

This lets deployers map keys from X-LlamaStack-Provider-Data to outbound HTTP headers so request-scoped auth tokens (MaaS API keys, tenant IDs, etc.) reach the downstream MCP server without the caller passing them via authorization= on every tool call.

it's actually a security feature that users must always pass their token on each call.

plus, that isn't changed by this pr, only the number of ways the token can be passed is expanded.


stack has had a thorny problem for a long time: by design we don't hold user credentials, we (0) let admin configure credentials and we (1) let users pass through credentials (via headers); we recommend (1) over (0); and, the api spec dictates that we accept action requests that require credentials w/o a clear way for a user to provide them.

consider -

curl http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fancy-model",
    "tools": [{"type": "web_search"}],
    "input": "What is the latest news about fancy-model?"
  }'

web_search implementations require credentials to communicate with some external system.

today, the path for anyone using a stack w/o admin supplied credentials is to be more verbose -

curl http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fancy-model",
    "tools": [
      {
        "type": "mcp",
        "server_label": "tavily",
        "server_url": "https://mcp.tavily.com/mcp/?tavilyApiKey='$TAVILY_API_KEY'"
      }
    ],
    "input": "What is the latest news about fancy-model?"
  }'

or

curl http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fancy-model",
    "tools": [
      {
        "type": "mcp",
        "server_label": "my_mcp_server",
        "server_url": "https://my-mcp-server.example.com/mcp",
        "authorization": "'$MY_MCP_API_KEY'"
      }
    ],
    "input": "What is the latest news about fancy model?"
  }'

@skamenan7 is this the problem you're aiming to solve? if not, please expand on what your users are trying to do. please phrase it in the user's terms, e.g. "trying to use tool xyz on mcp server pqr via /v1/responses".

@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch 6 times, most recently from d835e6e to ccaf562 Compare April 1, 2026 17:01
@mattf
Collaborator

mattf commented Apr 1, 2026

this branch has been updated 24 times in the last 7 hours. i'm marking it as draft until it is ready for review.

@mattf mattf marked this pull request as draft April 1, 2026 17:28
@skamenan7
Contributor Author

Thanks @mattf, yeah, the closest user-facing shape is your first example.

The web_search example isn't literally the same tool path as this PR, but the UX I'm aiming for is the same: the user wants to make a normal /v1/responses call for a stack-managed tool, while still passing their credential on that same request, without dropping down to the verbose inline type: "mcp" form and inlining the MCP server details in the request body.

In the case this PR is targeting, the backing implementation is a registered remote::model-context-protocol provider. The deployer-side config I had in mind is roughly:

tool_runtime:
  - provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
    config:
      forward_headers:
        maas_token: Authorization
        tenant_id: X-Tenant-ID

In that setup, the user still passes maas_token and tenant_id on each request via X-LlamaStack-Provider-Data; the stack just maps those per-request values onto the downstream MCP header contract for the registered backend. So this doesn't change the security property that users pass creds on each call.
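
From the caller's side, that per-request shape could be sketched as below. The helper name `provider_data_headers` is hypothetical, and the keys `maas_token` and `tenant_id` simply mirror the config above; the sketch assumes the X-LlamaStack-Provider-Data header carries a JSON-encoded object.

```python
import json


def provider_data_headers(maas_token: str, tenant_id: str) -> dict[str, str]:
    """Build request headers carrying per-request credentials (hypothetical helper)."""
    provider_data = {"maas_token": maas_token, "tenant_id": tenant_id}
    return {
        "Content-Type": "application/json",
        # The stack reads this header and maps allowlisted keys to outbound headers.
        "X-LlamaStack-Provider-Data": json.dumps(provider_data),
    }
```

The caller attaches these headers to an ordinary /v1/responses request; nothing about the MCP server URL or its auth scheme has to appear in the request body.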

Today, if the downstream MCP server needs caller-scoped auth, the fallback is to switch to the verbose inline MCP form and pass server_url plus authorization directly. This PR is meant to preserve the per-request credential model for the registered MCP-provider path as well, without forcing the caller to inline the server definition and auth scheme in every /v1/responses request.

@skamenan7
Contributor Author

this branch has been updated 24 times in the last 7 hours. i'm marking it as draft until it is ready for review.

Fair call. I rebased it too many times and that made the branch noisy to review.

It's now a single commit on top of current upstream/main. I also see that CI is failing now for some reason. I'll flip it back out of draft once I've looked into it.

@skamenan7
Contributor Author

The failing CI appears to be infra, not this change. test-matrix (remote::chromadb, 3.12) timed out pulling
chromadb/chroma:latest from Docker Hub (TLS handshake timeout), and the later log step failed because the
container never started. The rest of the PR is green though.

@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from ccaf562 to 65a68a5 Compare April 2, 2026 13:44
@skamenan7
Contributor Author

Hi @mattf, CI is passing now after a rebase retriggered it. PTAL. I had been rebasing frequently to keep the PR up to date, hence the many updates. I have not changed the implementation since opening the PR. I hope I answered your questions above. If not, I can follow up. Thanks.

