
Conversation

@JAORMX (Collaborator) commented Oct 7, 2025

Summary

This PR introduces a proposal for the Virtual MCP Server, a new ToolHive feature that aggregates multiple MCP servers from a group into a single unified interface.

What does this PR do?

Adds a technical proposal document that outlines the design and implementation approach for the Virtual MCP Server (thv virtual command).

Key Design Decisions

  • Group-based architecture: Leverages existing ToolHive groups as the fundamental abstraction for managing backend MCP servers
  • Reuses existing constructs: Uses ToolsFilter for including specific tools and ToolOverride for renaming/redescribing tools
  • Per-backend authentication: Supports different authentication strategies per backend (pass-through, token exchange, service accounts, header injection)
  • Phased implementation: MVP focuses on basic aggregation, with composite tools and advanced features in later phases

Benefits

  • Simplifies client integration (single connection instead of multiple)
  • Centralizes authentication complexity handling
  • Enables cross-service workflows via composite tools
  • Maintains full backward compatibility with existing MCP clients and servers

Testing

This is a proposal document - implementation and testing will follow in subsequent PRs.

Related Issues

This proposal addresses the need for consolidating multiple MCP servers as discussed in the Virtual MCP design document.

@JAORMX force-pushed the proposal/virtual-mcp-server branch 2 times, most recently from a2a5279 to 9312f00 on October 7, 2025 13:04
codecov bot commented Oct 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.61%. Comparing base (5948453) to head (53f8c5a).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2106   +/-   ##
=======================================
  Coverage   53.61%   53.61%           
=======================================
  Files         239      239           
  Lines       30704    30704           
=======================================
  Hits        16463    16463           
  Misses      13071    13071           
  Partials     1170     1170           



@JAORMX requested a review from Copilot on October 7, 2025 13:25
@JAORMX (Collaborator, Author) commented Oct 7, 2025

@claude please review this

claude bot (Contributor) commented Oct 7, 2025

Claude finished @JAORMX's task.


Code Review: Virtual MCP Server Proposal

I've reviewed the Virtual MCP Server proposal. This is a well-thought-out design that appropriately leverages existing ToolHive constructs. Here's my detailed feedback:


Strengths

1. Excellent reuse of existing patterns

  • Leverages groups.Manager and workloads.Manager (docs/proposals/virtual-mcp-server.md:121-124)
  • Reuses ToolsFilter and ToolOverride middleware (docs/proposals/virtual-mcp-server.md:136-138, 218-219)
  • These are battle-tested components already in production

2. Clear phased implementation

  • MVP focuses on core aggregation (docs/proposals/virtual-mcp-server.md:186-196)
  • Composite tools deferred to Phase 2 is pragmatic
  • Allows incremental delivery and validation

3. Good alignment with MCP specification

  • Maintains protocol compatibility (docs/proposals/virtual-mcp-server.md:13)
  • Proper handling of tools, resources, and prompts (docs/proposals/virtual-mcp-server.md:128-132)

Issues & Recommendations

1. Authentication design needs clarification ⚠️

Issue: The proposal mentions token exchange (docs/proposals/virtual-mcp-server.md:66-68, 144) but there's already a separate token exchange proposal. The relationship between these two features isn't clear.

Questions:

Recommendation: Add a section explicitly stating:

# References token exchange from docs/proposals/token-exchange-middleware.md
backend_auth:
  github:
    type: "token_exchange"
    # Uses existing token exchange middleware implementation

2. Group reference validation ⚠️

Issue: The proposal doesn't specify what happens if the referenced group doesn't exist or is empty (docs/proposals/virtual-mcp-server.md:38).

Recommendation: Add validation behavior:

  • Return clear error if group doesn't exist
  • Handle case where group has no running workloads
  • Specify whether virtual MCP should start automatically when group workloads start

3. Tool name conflict resolution needs more detail ⚠️

Issue: The conflict_resolution field mentions "prefix | priority | manual" (docs/proposals/virtual-mcp-server.md:42) but doesn't explain:

  • What prefix format is used (workload name? custom prefix?)
  • How priority order is determined
  • What "manual" means (presumably the override section?)

Recommendation: Expand with examples:

aggregation:
  conflict_resolution: "prefix"  # Adds workload name as prefix: github_create_pr
  # OR
  conflict_resolution: "priority"  # First workload wins, others ignored
  priority_order: ["github", "jira", "slack"]

4. Missing error handling specifications ⚠️

Open questions from docs/proposals/virtual-mcp-server.md:240-244:

  1. ✓ Streaming responses - acknowledged as open question
  2. ✓ Backend capabilities caching - acknowledged as open question
  3. ✓ Rate limits - acknowledged as open question
  4. Missing: What happens when a backend workload fails/stops?
  5. Missing: Partial failure handling (some backends succeed, others fail)
  6. Missing: Circuit breaker patterns for unhealthy backends

Recommendation: Add section on failure modes and degradation strategies.

5. CLI command naming 💭

Minor consideration: The command thv virtual (docs/proposals/virtual-mcp-server.md:169) could be more descriptive. Consider:

  • thv virtual-server - clearer that it's running a server
  • thv aggregate - describes the function
  • thv virtual is fine if brevity is preferred

6. Composite tools implementation complexity ⚠️

Concern: The composite tools feature (docs/proposals/virtual-mcp-server.md:91-104) is Phase 2 but may be architecturally complex:

  • Requires DAG execution for depends_on
  • Needs variable substitution ({{.params.pr_number}})
  • Error handling across multiple tool calls
  • Transaction semantics (what if step 2 fails after step 1 succeeds?)

Recommendation: Consider whether composite tools belong in a separate proposal, as they're almost a workflow engine feature rather than an aggregation feature.
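For illustration, the variable-substitution piece alone maps cleanly onto Go's text/template if the {{.params.x}} / {{.steps.y.output}} syntax is kept; a minimal sketch (the renderArg helper and the state layout are assumptions for illustration, not existing ToolHive code):

package main

import (
    "bytes"
    "fmt"
    "text/template"
)

// renderArg expands placeholders such as {{.params.pr_number}} or
// {{.steps.merge.output}} against the accumulated workflow state.
func renderArg(raw string, state map[string]any) (string, error) {
    tmpl, err := template.New("arg").Option("missingkey=error").Parse(raw)
    if err != nil {
        return "", fmt.Errorf("parse %q: %w", raw, err)
    }
    var buf bytes.Buffer
    if err := tmpl.Execute(&buf, state); err != nil {
        return "", fmt.Errorf("render %q: %w", raw, err)
    }
    return buf.String(), nil
}

func main() {
    state := map[string]any{
        "params": map[string]any{"pr_number": 123},
        "steps":  map[string]any{"merge": map[string]any{"output": "sha-abc123"}},
    }
    msg, err := renderArg("Deployed PR {{.params.pr_number}} at {{.steps.merge.output}}", state)
    if err != nil {
        panic(err)
    }
    fmt.Println(msg) // Deployed PR 123 at sha-abc123
}

The DAG execution, rollback, and transaction questions are where the real complexity lives; substitution itself is cheap.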

7. Security considerations 🔒

Missing: Security implications of aggregation:

  • Can workloads in the same group belong to different security contexts?
  • Should there be authorization checks per backend in addition to virtual MCP auth?
  • How are secrets scoped across aggregated backends?

Recommendation: Add explicit security boundary documentation.

8. Performance metrics and observability 📊

Success criteria (docs/proposals/virtual-mcp-server.md:246-252) includes "< 10ms routing overhead" but:

  • No mention of how this will be measured
  • No observability/telemetry strategy for debugging multi-backend issues
  • No mention of distributed tracing across backends

Recommendation: Add observability requirements to implementation notes.


Minor Issues

9. Typo/Grammar

Line 238: "LLM-generated backend" - this alternative seems like a different feature entirely and may confuse readers. Consider removing or clarifying.

10. Implementation notes location

The "Reusing Existing Components" section (docs/proposals/virtual-mcp-server.md:214-223) is excellent but buried at the end. Consider moving it closer to "High-Level Design" since it explains how the pieces fit together.


Questions for Discussion

  1. Workload lifecycle: Should the virtual MCP automatically start/stop underlying workloads, or only work with already-running workloads?

  2. Dynamic membership: If workloads are added/removed from the group while virtual MCP is running, should it automatically update? (Mentioned as Phase 3, line 199)

  3. Client registration: Should the virtual MCP be registered with MCP clients separately from its backend workloads?

  4. Backwards compatibility: Can existing workloads be added to groups without modification, or do they need reconfiguration?


Overall Assessment

Verdict: ✅ Approve with minor revisions

This is a solid proposal that appropriately reuses existing ToolHive infrastructure. The phased approach is sensible, and the core aggregation feature is well-scoped. The main gaps are around error handling, security boundaries, and clarifying the relationship with the token exchange proposal.

Priority fixes before implementation:

  1. Clarify authentication strategy and token exchange integration (Issue #1)
  2. Specify failure handling and degradation behavior (Issue #4)
  3. Expand conflict resolution details (Issue #3)

Nice-to-haves:



Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces a technical proposal for a Virtual MCP Server feature that consolidates multiple MCP servers from a ToolHive group into a single unified interface. The proposal aims to simplify client integration and centralize authentication complexity while maintaining backward compatibility.

  • Adds comprehensive design documentation for the Virtual MCP Server (thv virtual command)
  • Defines group-based architecture leveraging existing ToolHive constructs
  • Outlines phased implementation approach from basic aggregation to enterprise features


@ChrisJBurns (Collaborator) commented:

Should there be authorization checks per backend in addition to virtual MCP auth?

@JAORMX This part of Claude's review is what I was getting at in the vMCP GDoc. I'll explain more here just so it's visible to others.

If the benefit of vMCP itself is that it can aggregate 10 backing MCP servers, that's great, but let's think of the security implications / risks of this. In this case, a single vMCP server will have access to 10 different MCP servers. In the event that vMCP is ever exploited, would it be able to make calls to the backing MCP servers directly? Or will they themselves be protected by authn/z? If so, then this isn't as much of a concern for me.

@JAORMX (Collaborator, Author) commented Oct 7, 2025

If the benefit of vMCP itself is that it can aggregate 10 backing MCP servers, that's great, but let's think of the security implications / risks of this. In this case, a single vMCP server will have access to 10 different MCP servers. In the event that vMCP is ever exploited, would it be able to make calls to the backing MCP servers directly? Or will they themselves be protected by authn/z? If so, then this isn't as much of a concern for me.

I see. So... we should clarify the architecture and deployment recommendations then. Backend MCP servers should still not be deployed as they are without appropriate protections. That is, they should still enforce authentication and perhaps even authorization. If the vMCP is ever exploited, the backend MCP servers will still authenticate and authorize the request. I'm more keen on a zero-trust policy in general.

@ChrisJBurns (Collaborator) commented:

That is, they should still enforce authentication and perhaps even authorization. If the vMCP is ever exploited, the backend MCP servers will still authenticate and authorize the request. I'm more keen on a zero-trust policy in general.

@JAORMX Yep, that would be my understanding (and hope) too! I had a feeling you thought the same way but I just wanted to triple check 😄

@JAORMX (Collaborator, Author) commented Oct 7, 2025

@ChrisJBurns picture me with a tinfoil hat on at any given moment 😄

@jhrozek (Contributor) commented Oct 7, 2025

Should there be authorization checks per backend in addition to virtual MCP auth?

@JAORMX This part of Claude's review is what I was getting at in the vMCP GDoc. I'll explain more here just so it's visible to others.

If the benefit of vMCP itself is that it can aggregate 10 backing MCP servers, that's great, but let's think of the security implications / risks of this. In this case, a single vMCP server will have access to 10 different MCP servers. In the event that vMCP is ever exploited, would it be able to make calls to the backing MCP servers directly? Or will they themselves be protected by authn/z? If so, then this isn't as much of a concern for me.

Hmm, but the vMCP server would be an OAuth app so the user would connect with a token that says aud=vmcp right? And then the vMCP server would have to authenticate to each of the back end servers which means sending a token with aud=server1, aud=server2 if we are to reuse the usual authnz for the back end MCP servers. We could do that provided that the token exchange is set up to allow that. However, we'd have to send the external API token along with the request -or- let the back end token servers exchange the tokens they receive for external API tokens.

That would work if we can use token exchange all the way through, but not if we need to drive a consent and do the full flow (we would want to drive the consent chain to the consent on the vMCP server correct)? So in that case we'd have to send the back end server through a back channel?

@JAORMX force-pushed the proposal/virtual-mcp-server branch 2 times, most recently from 13e4f9d to 6fc029f on October 14, 2025 12:15
JAORMX added a commit that referenced this pull request Oct 15, 2025
This proposal introduces MCPGroup support to the Kubernetes operator,
enabling the Virtual MCP Server feature and logical grouping of MCPServer
resources.

Key design decisions:
- Explicit group field on MCPServer spec (matches CLI behavior)
- Simple MCPGroup CRD with description field
- Status tracks server list and count
- Namespace-scoped groups only
- Enables field selector queries for group membership

Related: #2106
@JAORMX force-pushed the proposal/virtual-mcp-server branch 2 times, most recently from a983db5 to f1f918c on October 20, 2025 09:29
@JAORMX requested reviews from ChrisJBurns and jhrozek on October 20, 2025 09:29
@JAORMX (Collaborator, Author) commented Oct 20, 2025

@claude could you do another review?

claude bot (Contributor) commented Oct 20, 2025

Claude finished @JAORMX's task.


Code Review: Virtual MCP Server Proposal

I've completed a thorough review of the Virtual MCP Server proposal. The document has been significantly enhanced since the last review, with excellent additions around elicitation support and composite tools. Here's my detailed feedback:


Summary of Changes Since Last Review

The most recent commit (f1f918c) added:

  • Elicitation support for composite tools (lines 246-260)
  • State persistence strategy for workflows (lines 334-336)
  • Incident investigation example demonstrating tool reuse (lines 140-183)
  • Enhanced composite tool example with user confirmations (lines 91-138)

These additions substantially improve the proposal's completeness and practicality.


Strengths

1. Excellent architectural reuse

  • Leverages existing groups.Manager and workloads.Manager (lines 201-204)
  • Reuses ToolsFilter and ToolOverride middleware (lines 214-218)
  • Builds on proven ToolHive patterns rather than inventing new ones

2. Clear phased implementation

  • MVP focuses on core aggregation without overcommitting (lines 283-288)
  • Advanced features properly deferred to Phase 2 (lines 290-294)
  • Allows for incremental delivery and validation

3. Elicitation integration is well-designed

  • Follows MCP specification for elicitation (line 247)
  • Proper handling of accept/decline/cancel actions (lines 253-260)
  • Good balance between flexibility and simplicity

4. Practical examples

  • Incident investigation example (lines 140-183) effectively demonstrates tool reuse
  • Deployment workflow (lines 91-138) shows realistic use case with confirmations

Critical Issues

1. Token Exchange Integration Needs Clarification ⚠️

Issue: The proposal mentions token_exchange authentication (lines 66-68, 224) but doesn't reference the existing token exchange proposal (THV-2063-token-exchange-middleware.md).

Current gap:

  • THV-2063 defines token exchange for single backend scenarios
  • This proposal needs token exchange for multiple backends simultaneously
  • No explanation of how to configure different audiences/client IDs per backend

Questions:

  1. Will each backend need its own OAuth client registration?
  2. How does the config map token exchange credentials per backend?
  3. Can backends share the same IDP but different audiences?
  4. What happens when backends use different IDPs entirely?

Recommendation: Add a section explicitly linking to THV-2063:

backend_auth:
  github:
    type: "token_exchange"
    # Reuses token exchange implementation from THV-2063-token-exchange-middleware.md
    token_exchange:
      token_url: "https://github.com/oauth/token"
      audience: "github-api"
      client_id: "github-client"
      client_secret_ref:
        name: "github-oauth"
        key: "client_secret"
      scopes: "repo read:org"

Also address:

  • Reference THV-2063 in Implementation Notes section (around line 323)
  • Add note about per-backend token caching strategy
  • Clarify if Virtual MCP needs its own client registration separate from backends

2. Security Model Insufficiently Addressed 🔒

Issue: The security discussion from PR comments (ChrisJBurns, jhrozek) about zero-trust and per-backend authz is not reflected in the proposal.

Missing security considerations:

  1. Zero-trust architecture: Proposal should state that backend MCP servers MUST still enforce their own authn/authz even when accessed via Virtual MCP
  2. Security boundary documentation: Where are trust boundaries? Can Virtual MCP be exploited to bypass backend security?
  3. Credential scope isolation: How are secrets scoped to prevent cross-contamination?
  4. Authorization policies: Should Virtual MCP enforce Cedar policies in addition to backend policies?

From PR discussion (ChrisJBurns at 2025-10-07T13:46:40Z):

In the event that vMCP is ever exploited, would it be able to make calls to the backing MCP servers directly?

From PR discussion (JAORMX at 2025-10-07T13:49:36Z):

Backend MCP servers should still not be deployed as they are without appropriate protections. [...] I'm more keen on a zero-trust policy in general.

Recommendation: Add explicit Security section:

## Security Considerations

### Zero-Trust Architecture

Virtual MCP operates on a zero-trust security model:

1. **Backend servers remain protected**: All backend MCP servers MUST continue to enforce their own authentication and authorization, independent of Virtual MCP's auth layer
2. **Defense in depth**: Virtual MCP compromise does not grant automatic access to backends
3. **Credential isolation**: Each backend receives only credentials scoped for that specific service
4. **Audit trail preservation**: Backend servers maintain their own audit logs in addition to Virtual MCP's aggregated logging

### Security Boundaries

┌────────────────────────────────────────┐
│ Client (authenticated to Virtual MCP)  │
└──────────────┬─────────────────────────┘
               │ Auth boundary #1
┌──────────────▼─────────────────────────┐
│ Virtual MCP Server                     │
│  - Validates client token              │
│  - Exchanges/transforms per backend    │
└──┬───────────┬────────────┬────────────┘
   │           │            │  Auth boundary #2
┌──▼───┐   ┌───▼────┐   ┌───▼──────┐
│GitHub│   │  Jira  │   │  Slack   │
│(auth)│   │ (auth) │   │  (auth)  │
└──────┘   └────────┘   └──────────┘


### Deployment Recommendations

- Backend MCP servers SHOULD NOT be directly accessible to clients (network isolation)
- Virtual MCP SHOULD run with minimal privileges (least privilege principle)
- Secrets for backend authentication MUST be stored in secure secret backends (1Password, encrypted storage)
- Virtual MCP SHOULD enforce additional authorization policies using Cedar before routing requests

3. Conflict Resolution Underspecified ⚠️

Issue: Line 42 mentions conflict_resolution: "prefix" | priority | manual but doesn't explain the behavior.

Questions:

  1. prefix: What prefix format? Workload name? Custom prefix?
  2. priority: How is priority order determined? First-wins? Configuration-based?
  3. manual: Is this referring to the overrides section?

Recommendation: Expand with concrete examples:

aggregation:
  # Option 1: Automatic prefixing with workload name
  conflict_resolution: "prefix"
  # Results in: github_create_pr, jira_create_pr

  # Option 2: Priority-based (first backend wins)
  conflict_resolution: "priority"
  priority_order: ["github", "jira", "slack"]
  # If both github and jira expose "create_issue", use github's version

  # Option 3: Manual resolution via overrides
  conflict_resolution: "manual"  # Requires explicit overrides for conflicts
  # Throws error if tools conflict without override
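To make the intended semantics concrete, here is a minimal Go sketch of how a router could apply the three strategies; the backendTools type and the assumption that backends arrive pre-sorted by priority_order are illustrative, not existing ToolHive APIs:

package vmcp // hypothetical package name

import "fmt"

// backendTools pairs a backend workload with the tool names it exposes.
// Backends are assumed to be pre-sorted by priority_order.
type backendTools struct {
    Backend string
    Tools   []string
}

// resolveToolNames maps aggregated tool names to "backend/original" targets
// according to the conflict_resolution strategy sketched above.
func resolveToolNames(strategy string, backends []backendTools) (map[string]string, error) {
    result := make(map[string]string)
    seen := make(map[string]string) // original tool name -> backend that claimed it

    for _, b := range backends {
        for _, tool := range b.Tools {
            switch strategy {
            case "prefix":
                // Always qualify with the workload name, e.g. github_create_pr.
                result[b.Backend+"_"+tool] = b.Backend + "/" + tool
            case "priority":
                // First backend in priority order wins; later duplicates are dropped.
                if _, taken := seen[tool]; !taken {
                    seen[tool] = b.Backend
                    result[tool] = b.Backend + "/" + tool
                }
            case "manual":
                // Conflicts must be resolved via explicit overrides; fail loudly otherwise.
                if owner, taken := seen[tool]; taken {
                    return nil, fmt.Errorf("tool %q exposed by both %s and %s: override required", tool, owner, b.Backend)
                }
                seen[tool] = b.Backend
                result[tool] = b.Backend + "/" + tool
            default:
                return nil, fmt.Errorf("unknown conflict_resolution strategy %q", strategy)
            }
        }
    }
    return result, nil
}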

Major Issues

4. Error Handling and Failure Modes Inadequate ⚠️

Current state: Open questions section (lines 343-347) acknowledges some gaps but doesn't address critical failure scenarios.

Missing specifications:

  1. Backend workload failure: What if a backend stops/crashes during Virtual MCP operation?
  2. Partial failures: Some backends succeed, others timeout - how to handle?
  3. Circuit breaker patterns: Should Virtual MCP stop routing to unhealthy backends?
  4. Composite tool rollback: If step 3 fails, what happens to steps 1-2?
  5. Elicitation timeout: User doesn't respond - how long to wait?
    • ✅ Partially addressed: Line 336 mentions 30-minute workflow timeout
    • ❌ Missing: What happens to backend state if elicitation times out mid-workflow?

Recommendation: Add Failure Handling section:

### Failure Handling

#### Backend Unavailability
- **Detection**: Health checks every 30s against backend `/health` endpoint
- **Behavior**: Mark backend as unavailable, remove its tools from capability list
- **Recovery**: Automatic re-inclusion when health check succeeds
- **Client notification**: Return error with unavailable backend list

#### Partial Failures
- **Tool calls**: Return error immediately if target backend fails
- **Capability aggregation**: Succeed with partial capability list if some backends fail
- **Composite tools**: Abort workflow on first step failure, return error with completed steps

#### Circuit Breaker
- **Threshold**: 5 consecutive failures or 50% error rate over 1 minute
- **Open state**: Stop routing to backend for 60 seconds
- **Half-open**: Allow 1 request through to test recovery
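A minimal Go sketch of the per-backend circuit breaker described above (consecutive-failure variant only; the 50% error-rate window is omitted for brevity, and the type is hypothetical):

package vmcp // hypothetical package name

import (
    "sync"
    "time"
)

const (
    failureThreshold = 5                // consecutive failures before opening
    openDuration     = 60 * time.Second // how long to reject before probing
)

// circuitBreaker is a minimal per-backend breaker for the policy above.
type circuitBreaker struct {
    mu           sync.Mutex
    failures     int
    openedAt     time.Time
    halfOpenUsed bool
}

// Allow reports whether a request may be routed to the backend right now.
func (cb *circuitBreaker) Allow() bool {
    cb.mu.Lock()
    defer cb.mu.Unlock()

    if cb.failures < failureThreshold {
        return true // closed: route normally
    }
    if time.Since(cb.openedAt) < openDuration {
        return false // open: reject immediately
    }
    if cb.halfOpenUsed {
        return false // half-open: a probe is already in flight
    }
    cb.halfOpenUsed = true // half-open: allow exactly one probe
    return true
}

// Report records the outcome of a routed request.
func (cb *circuitBreaker) Report(err error) {
    cb.mu.Lock()
    defer cb.mu.Unlock()

    if err == nil {
        cb.failures = 0 // success closes the breaker again
        cb.halfOpenUsed = false
        return
    }
    cb.failures++
    if cb.failures >= failureThreshold {
        cb.openedAt = time.Now() // (re)open; a failed probe re-opens it as well
        cb.halfOpenUsed = false
    }
}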

5. Composite Tool Complexity May Need Separate Proposal 🤔

Concern: Composite tools (lines 91-138, 245-260) introduce significant complexity:

  • DAG execution with depends_on
  • Template variable substitution ({{.params.x}}, {{.steps.y.output}})
  • Conditional execution based on expressions
  • Error handling across multiple tool calls
  • State persistence for long-running workflows (lines 334-336)
  • Elicitation protocol integration

This is essentially a workflow engine, which is a different problem domain than request aggregation/routing.
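To illustrate the point: even the simplest piece, ordering steps by depends_on, already requires a topological sort. A Go sketch under the assumption that steps carry only an ID and a depends_on list:

package vmcp // hypothetical package name

import "fmt"

// step is a composite-tool step with explicit dependencies (depends_on).
type step struct {
    ID        string
    DependsOn []string
}

// executionOrder returns a valid ordering of steps, or an error if the
// depends_on graph contains a cycle or references an unknown step ID.
func executionOrder(steps []step) ([]string, error) {
    indegree := make(map[string]int, len(steps))
    dependents := make(map[string][]string)
    for _, s := range steps {
        indegree[s.ID] = 0
    }
    for _, s := range steps {
        for _, dep := range s.DependsOn {
            if _, ok := indegree[dep]; !ok {
                return nil, fmt.Errorf("step %q depends on unknown step %q", s.ID, dep)
            }
            dependents[dep] = append(dependents[dep], s.ID)
            indegree[s.ID]++
        }
    }

    var ready []string
    for _, s := range steps {
        if indegree[s.ID] == 0 {
            ready = append(ready, s.ID)
        }
    }

    var order []string
    for len(ready) > 0 {
        id := ready[0]
        ready = ready[1:]
        order = append(order, id)
        for _, next := range dependents[id] {
            indegree[next]--
            if indegree[next] == 0 {
                ready = append(ready, next)
            }
        }
    }
    if len(order) != len(steps) {
        return nil, fmt.Errorf("depends_on graph has a cycle")
    }
    return order, nil
}

And that is before variable substitution, conditionals, elicitation, and rollback enter the picture.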

Risk: Combining both in one implementation may:

  • Delay MVP delivery
  • Complicate testing
  • Mix concerns (routing vs orchestration)

Recommendation: Consider one of:

  1. Separate proposal: Move composite tools to THV-XXXX-workflow-engine.md
  2. Simplified Phase 2: Defer complex features (DAG execution, conditionals, state persistence) to Phase 3
  3. Keep as-is: If composite tools are core to Virtual MCP value prop (which they are, based on benefits section line 306)

Given the emphasis in benefits section, keeping composite tools makes sense, but:

  • Add note acknowledging this is a significant undertaking
  • Consider MVP composite tools without elicitation/conditionals/persistence
  • Add separate success criteria for composite tool features

6. State Persistence Design Insufficient ⚠️

Issue: Line 334-336 mentions "in-memory with clean storage interface" but doesn't address:

  1. Memory limits: What if 1000 workflows are running? Memory exhaustion?
  2. Server restart: All workflows lost?
  3. Distributed deployment: Can't share state across multiple Virtual MCP instances
  4. Debugging: How to inspect workflow state for troubleshooting?

Recommendation: Either:

  • Option A: Commit to persistent storage from Phase 2 (Redis, PostgreSQL, etcd)
  • Option B: Add clear limitations to documentation:
    ### Phase 2 State Persistence Limitations
    
    - **Single instance only**: No horizontal scaling
    - **Ephemeral workflows**: Server restart loses all in-flight workflows
    - **Memory bounded**: Maximum 1000 concurrent workflows (configurable)
    - **No debugging tools**: State inspection requires restart with debug logging
    
    Phase 3 will migrate to persistent storage for production deployments.

Minor Issues

7. Group Reference Validation ℹ️

Issue: Line 38 references group: "engineering-team" but doesn't specify validation behavior.

Add:

#### Group Validation
- Error if referenced group doesn't exist: `group 'engineering-team' not found`
- Error if group is empty: `group 'engineering-team' has no workloads`
- Option to auto-start workloads: `--auto-start-workloads` flag

8. CLI Command Naming 💭

Minor: thv virtual (line 266) could be clearer. Consider:

  • thv virtual-server - explicit that it's running a server
  • thv aggregate - describes the function
  • Current name is fine if brevity preferred

9. Observability Strategy Missing 📊

Issue: Success criteria (lines 349-355) mentions "< 10ms routing overhead" but no mention of how to measure this.

Add to Implementation Notes:

### Observability

- **Metrics**: OpenTelemetry metrics for routing latency, backend health, tool call distribution
- **Tracing**: Distributed traces spanning Virtual MCP → Backend calls
- **Logging**: Structured logs with correlation IDs for multi-backend requests
- **Dashboards**: Pre-built Grafana dashboard for Virtual MCP monitoring

10. Streaming Responses Open Question

Note: Line 345 asks "How to handle streaming responses across multiple backends?"

This is actually a critical decision for Phase 1, not Phase 3, because:

  • MCP clients may expect streaming responses
  • If unsupported in MVP, needs clear documentation
  • May affect transport selection (SSE vs HTTP)

Recommendation: Decide for Phase 1:

  • Option A: Only support non-streaming backends in MVP
  • Option B: Support streaming but only single-backend streaming (not merged streams)
  • Option C: Full streaming support with multiplexed channels (complex)

Add to MVP scope: "Phase 1 supports streaming from individual backends but does not merge streaming responses across multiple backends."


Additional Comments

11. Token Exchange vs Other Auth Types 🔐

The proposal lists 5 auth types (lines 223-227) but only token_exchange is well-defined via THV-2063:

  • pass_through: Clear enough
  • service_account: Where are credentials stored? How refreshed?
  • header_injection: Similar to service_account but different header?
  • mapped_claims: Requires JWT manipulation - complex and risky

Recommendation: For Phase 1, support only:

  • pass_through
  • token_exchange (via THV-2063)

Defer service_account, header_injection, mapped_claims to Phase 2.

12. Elicitation Example Could Be Clearer 💡

The elicitation example (lines 102-122) is good but could emphasize the user interaction flow:

**User experience**:
1. User calls `deploy_and_notify` with pr_number=123
2. Virtual MCP merges PR (step 1)
3. Virtual MCP prompts user: "PR 123 merged. Deploy to [Staging/Production]? [Notify team: yes/no]"
4. User selects "Production" and "yes"
5. Virtual MCP deploys (step 3) and notifies (step 4)

13. Relationship to MCPRemoteProxy CRD 🤔

I notice a recent commit mentions MCPRemoteProxy CRD (from git log). Is there overlap with Virtual MCP? Should the proposal mention this?

If MCPRemoteProxy is for Kubernetes operator and thv virtual is for CLI, clarify the relationship:

### Kubernetes Deployment

Virtual MCP can be deployed in Kubernetes via the `MCPRemoteProxy` CRD (see THV-XXXX for details). The CLI `thv virtual` command is intended for local development and single-machine deployments.

Overall Assessment

Verdict: ✅ Approve with significant revisions

This is a strong proposal with a well-thought-out design that appropriately leverages existing ToolHive infrastructure. The recent additions around elicitation and state persistence show good evolution based on feedback.

However, there are critical gaps that must be addressed before implementation:

Blocking Issues (must fix before Phase 1 implementation):

  1. Security model (Issue #2) - Add explicit zero-trust security section
  2. Token exchange integration (Issue #1) - Link to THV-2063 and show per-backend config
  3. Conflict resolution details (Issue #3) - Define prefix/priority/manual behaviors
  4. Streaming support decision (Issue #10) - Clarify MVP streaming support

Important Issues (should address before Phase 2):

  1. ⚠️ Error handling specification (Issue #4) - Add failure modes and degradation
  2. ⚠️ State persistence limitations (Issue #6) - Document in-memory constraints
  3. ⚠️ Composite tool scope (Issue #5) - Consider complexity vs MVP timeline

Nice-to-haves (can defer to implementation):

  1. ℹ️ Group validation (Issue #7)
  2. 📊 Observability strategy (Issue #9)
  3. 🔐 Simplified auth types for Phase 1 (Issue #11)

Recommendations for Next Steps

  1. Address blocking issues: Add security section, token exchange details, conflict resolution examples
  2. Decide on composite tools: Keep in proposal but simplify Phase 1 scope? Or separate proposal?
  3. Update implementation phases: Based on composite tool decision, revise MVP scope
  4. Link related proposals: Reference THV-2063 (token exchange), mention MCPRemoteProxy if relevant
  5. Consider creating follow-up proposals:
    • THV-XXXX-virtual-mcp-observability.md (if observability is complex)
    • THV-XXXX-virtual-mcp-security-model.md (if security warrants detailed design)

Additional Context from PR Discussion

From the PR comments, I see an important security discussion between @ChrisJBurns, @JAORMX, and @jhrozek about zero-trust and per-backend authorization.

The proposal should explicitly address the OAuth consent question from @jhrozek:

we'd have to send the external API token along with the request -or- let the back end token servers exchange the tokens they receive for external API tokens

This is a critical architectural decision about where token exchange happens (Virtual MCP vs backends).



@JAORMX force-pushed the proposal/virtual-mcp-server branch from f1f918c to ac1d31f on October 20, 2025 11:20
@JAORMX (Collaborator, Author) commented Oct 21, 2025

@claude please do another review

claude bot (Contributor) commented Oct 21, 2025

Claude finished @JAORMX's task.


Code Review: Virtual MCP Server Proposal

I've completed a comprehensive review of the Virtual MCP Server proposal. This is a substantial and well-structured document that has clearly evolved based on previous feedback. Here's my detailed assessment:


Overall Assessment

Verdict: ✅ Strong proposal with excellent improvements

This proposal demonstrates thoughtful design and has addressed many concerns from the previous review. The document is now significantly more comprehensive, particularly with:

  • Explicit security boundaries (lines 128-248)
  • Token exchange integration with THV-2063 (lines 950-963)
  • Detailed conflict resolution strategies (lines 700-778)
  • Comprehensive error handling (lines 781-853)
  • Elicitation support for interactive workflows (lines 109-125)

Key Strengths

1. Excellent Security Architecture ✅ (lines 128-248)

The proposal now includes a thorough security section addressing the zero-trust concerns raised by @ChrisJBurns and @JAORMX:

  • Two authentication boundaries clearly defined (lines 133-171)
  • Zero-trust security model explicitly documented (lines 173-214)
  • Defense in depth approach with independent backend validation
  • Network isolation recommendations

This directly addresses the PR discussion about whether backends should enforce their own auth even when accessed via Virtual MCP. The answer is clearly "yes" (line 179: "Each backend MCP server MUST enforce its own authentication").

2. Token Exchange Integration Well-Documented ✅ (lines 950-963)

The proposal now properly references THV-2063 and explains:

  • Reuse of pkg/auth/tokenexchange implementation
  • Per-backend exchange patterns
  • Discovery mode for Kubernetes (lines 643-698)
  • Detailed sequence diagram (lines 218-247)

This resolves the previous review's Issue #1 about token exchange clarity.
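For readers unfamiliar with RFC 8693, the per-backend exchange boils down to one form-encoded POST per backend. The sketch below shows only the wire-level call and is not the pkg/auth/tokenexchange API; a real implementation would reuse that package as the proposal states:

package vmcp // hypothetical package name

import (
    "context"
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"
    "strings"
)

// exchangeToken trades the caller's aud=vmcp token for a backend-scoped token
// (e.g. aud=github-api). Field names follow RFC 8693; tokenURL, clientID and
// audience come from per-backend configuration.
func exchangeToken(ctx context.Context, tokenURL, clientID, clientSecret, subjectToken, audience, scopes string) (string, error) {
    form := url.Values{
        "grant_type":         {"urn:ietf:params:oauth:grant-type:token-exchange"},
        "subject_token":      {subjectToken},
        "subject_token_type": {"urn:ietf:params:oauth:token-type:access_token"},
        "audience":           {audience},
    }
    if scopes != "" {
        form.Set("scope", scopes)
    }

    req, err := http.NewRequestWithContext(ctx, http.MethodPost, tokenURL, strings.NewReader(form.Encode()))
    if err != nil {
        return "", err
    }
    req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
    req.SetBasicAuth(clientID, clientSecret)

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return "", err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return "", fmt.Errorf("token exchange failed: %s", resp.Status)
    }

    var out struct {
        AccessToken string `json:"access_token"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        return "", err
    }
    return out.AccessToken, nil
}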

3. Conflict Resolution Fully Specified ✅ (lines 700-778)

All three strategies are now well-documented with examples:

  • Prefix strategy: Format options and behavior (lines 704-726)
  • Priority strategy: First-wins approach (lines 728-744)
  • Manual strategy: Explicit override requirements (lines 746-777)

This resolves the previous review's Issue #3.

4. Comprehensive Error Handling ✅ (lines 781-853)

The proposal now covers:

  • Backend unavailability detection and recovery (lines 783-797)
  • Partial failure modes with configuration (lines 799-816)
  • Circuit breaker patterns (lines 818-835)
  • Composite tool failure handling (lines 837-853)

This addresses the previous review's Issue #4 about failure modes.

5. Practical Examples

The incident investigation example (lines 427-463) effectively demonstrates tool reuse with multiple calls to the same backend tool with different URLs. This is a compelling use case.


Issues and Recommendations

Critical Issue #1: Token Exchange Flow Security Gap ⚠️

Location: Lines 218-247 (sequence diagram)

Issue: The token exchange flow shows the Virtual MCP exchanging tokens with the IDP, but @jhrozek raised an important question in the PR comments:

Hmm, but the vMCP server would be an OAuth app so the user would connect with a token that says aud=vmcp right? And then the vMCP server would have to authenticate to each of the back end servers which means sending a token with aud=server1, aud=server2...

This is correctly shown in the diagram, BUT:

The consent chain problem: When using token exchange, does the user need to consent separately for each backend service? Or does consenting to Virtual MCP implicitly grant consent for all backends?

OAuth security consideration: If Virtual MCP can exchange aud=vmcp tokens for aud=github-api tokens without additional consent, this effectively grants Virtual MCP carte blanche to act on behalf of the user for any backend. This could be a security concern.

Recommendation: Add a section explaining consent semantics:

### OAuth Consent and Scope Delegation

When using token exchange, Virtual MCP relies on the IDP's token exchange policy to determine what backends can be accessed:

**Option A: IDP-level authorization (Recommended)**
- IDP enforces allowed audiences during token exchange
- Virtual MCP token exchange client has limited delegation rights
- Token exchange fails if user hasn't consented to backend service
- User manages consent through IDP

**Option B: Implicit consent via Virtual MCP**
- User consents to Virtual MCP with broad scopes
- Virtual MCP can exchange for any configured backend
- Simpler UX but concentrates trust in Virtual MCP
- Suitable for internal/trusted deployments

The configuration should specify which model is used per deployment.

Critical Issue #2: Backend Authentication Discovery Ambiguity ⚠️

Location: Lines 525-531, 643-698

Issue: The "discovered" mode for Kubernetes is excellent, but there's ambiguity about what happens when:

  1. Backend has externalAuthConfigRef but Virtual MCP outgoingAuth.backends also defines that backend
  2. Backend has no externalAuthConfigRef and Virtual MCP has no explicit config
  3. Backend's auth config changes while Virtual MCP is running

Current text (line 537):

Optional: Override specific backends (works in all modes)

This suggests override behavior but doesn't specify precedence clearly.

Recommendation: Add explicit precedence rules:

### Backend Authentication Resolution (Kubernetes)

When `source: discovered` or `source: mixed`, Virtual MCP resolves backend authentication in this order:

1. **Explicit override** in `spec.outgoingAuth.backends[name]` (highest priority)
2. **Discovered config** from `MCPServer.spec.externalAuthConfigRef`
3. **Default behavior** from `spec.outgoingAuth.default`
4. **Error** if none of the above match

**Dynamic updates:**
- Changes to `MCPExternalAuthConfig` are not automatically reflected in running Virtual MCP
- Virtual MCP must be restarted or trigger a reconciliation to pick up changes
- This is a Phase 3 enhancement (line 935: "Dynamic configuration updates")
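The precedence rules above translate into a small resolution function; a Go sketch with hypothetical types standing in for the real config structs:

package vmcp // hypothetical package name

import "fmt"

// backendAuthConfig is a simplified stand-in for a per-backend auth block.
type backendAuthConfig struct {
    Type string // e.g. "token_exchange", "pass_through"
}

// resolveBackendAuth applies the precedence above: explicit override, then
// discovered config from the MCPServer resource, then the default, else error.
func resolveBackendAuth(
    backend string,
    overrides map[string]*backendAuthConfig, // spec.outgoingAuth.backends
    discovered map[string]*backendAuthConfig, // from externalAuthConfigRef
    def *backendAuthConfig, // spec.outgoingAuth.default
) (*backendAuthConfig, error) {
    if cfg, ok := overrides[backend]; ok && cfg != nil {
        return cfg, nil
    }
    if cfg, ok := discovered[backend]; ok && cfg != nil {
        return cfg, nil
    }
    if def != nil {
        return def, nil
    }
    return nil, fmt.Errorf("no authentication configuration resolved for backend %q", backend)
}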

Major Issue #3: Composite Tool State Persistence Limitations Understated ⚠️

Location: Lines 976-992

Issue: The proposal says "In-Memory Storage" is the Phase 2 approach (line 980) but doesn't adequately warn about the significant limitations this creates:

  1. No horizontal scaling: Can't run multiple Virtual MCP instances
  2. State loss on restart: All in-flight workflows lost
  3. No debugging: Can't inspect workflow state externally
  4. Memory exhaustion: Unbounded workflow accumulation

The workflow timeout of 30 minutes (line 988) helps but doesn't solve these problems.

Recommendation: Either:

Option A: Move persistent storage to Phase 2 (recommended for production readiness)

  • Composite tools are a killer feature (benefits section, lines 929-936)
  • In-memory state severely limits production viability
  • Redis/etcd integration is relatively straightforward

Option B: Add explicit "Phase 2 Limitations" warning:

### Phase 2 State Persistence Limitations

**IMPORTANT**: In-memory state persistence in Phase 2 has significant operational limitations:

- **No horizontal scaling**: Single instance only (cannot deploy multiple replicas)
- **No crash recovery**: Server restart loses all in-flight workflows
- **Limited debugging**: No external workflow state inspection
- **Memory bounded**: Risk of memory exhaustion with many concurrent workflows

**Production Deployments**: For production use of composite tools, we strongly recommend waiting for Phase 3 persistent storage (Redis/PostgreSQL) or accepting these limitations with appropriate monitoring.

**Workarounds**:
- Deploy with high memory limits and pod anti-affinity
- Monitor in-flight workflow count
- Set aggressive workflow timeouts
- Use Phase 1 (simple aggregation) for production, Phase 2 for dev/staging
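If Option B is chosen, the "memory bounded" constraint is easy to enforce behind the storage interface; a Go sketch of a capped, TTL-based in-memory store (names and limits are illustrative assumptions):

package vmcp // hypothetical package name

import (
    "errors"
    "sync"
    "time"
)

// workflowStore is a bounded, in-memory store for in-flight composite-tool
// workflows, matching the Phase 2 limitations described above.
type workflowStore struct {
    mu      sync.Mutex
    max     int
    ttl     time.Duration
    entries map[string]workflowEntry
}

type workflowEntry struct {
    state    any
    deadline time.Time
}

var errStoreFull = errors.New("too many concurrent workflows")

func newWorkflowStore(max int, ttl time.Duration) *workflowStore {
    return &workflowStore{max: max, ttl: ttl, entries: make(map[string]workflowEntry)}
}

// Put stores workflow state, rejecting new workflows once the cap is reached.
func (s *workflowStore) Put(id string, state any) error {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.evictExpiredLocked()
    if _, exists := s.entries[id]; !exists && len(s.entries) >= s.max {
        return errStoreFull
    }
    s.entries[id] = workflowEntry{state: state, deadline: time.Now().Add(s.ttl)}
    return nil
}

// Get returns workflow state if it exists and has not timed out.
func (s *workflowStore) Get(id string) (any, bool) {
    s.mu.Lock()
    defer s.mu.Unlock()
    e, ok := s.entries[id]
    if !ok || time.Now().After(e.deadline) {
        delete(s.entries, id)
        return nil, false
    }
    return e.state, true
}

func (s *workflowStore) evictExpiredLocked() {
    now := time.Now()
    for id, e := range s.entries {
        if now.After(e.deadline) {
            delete(s.entries, id)
        }
    }
}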

Major Issue #4: Streaming Response Strategy Missing 📊

Location: Line 1002 (open question)

Issue: The proposal lists "Streaming responses across multiple backends" as an open question, but this is actually critical for Phase 1 MVP because:

  1. MCP clients may expect streaming responses for tool calls
  2. Many modern MCP servers use SSE (Server-Sent Events) transport
  3. If Virtual MCP doesn't support streaming, it may break existing clients

This affects MVP scope and should not be deferred to Phase 3.

Recommendation: Make a decision for Phase 1:

### Streaming Response Support (Phase 1 Decision)

**Phase 1 Strategy**: Per-backend streaming support only

- **Single backend streaming**: Virtual MCP forwards streaming responses from individual backends
- **Multiple backend aggregation**: Capability discovery (list tools) returns buffered responses
- **Composite tools**: Phase 1 does not support streaming for composite tool steps

**Technical Approach**:
- Virtual MCP implements pass-through streaming for single tool calls
- Uses chunked transfer encoding or SSE depending on client transport
- Buffers responses only when aggregation across backends is required

**Phase 3 Enhancement**: True multi-backend streaming with multiplexed channels
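The single-backend pass-through case needs no buffering at all; a minimal Go sketch using only net/http, not tied to any existing ToolHive transport code:

package vmcp // hypothetical package name

import (
    "io"
    "net/http"
)

// streamFromBackend forwards a single backend's streaming response (e.g. SSE)
// to the client without buffering, flushing after every chunk. Aggregated
// multi-backend responses would be buffered separately.
func streamFromBackend(w http.ResponseWriter, backendResp *http.Response) error {
    defer backendResp.Body.Close()

    // Propagate the content type so SSE clients keep their event parser.
    w.Header().Set("Content-Type", backendResp.Header.Get("Content-Type"))
    w.WriteHeader(backendResp.StatusCode)

    flusher, canFlush := w.(http.Flusher)
    buf := make([]byte, 32*1024)
    for {
        n, err := backendResp.Body.Read(buf)
        if n > 0 {
            if _, werr := w.Write(buf[:n]); werr != nil {
                return werr
            }
            if canFlush {
                flusher.Flush()
            }
        }
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}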

Major Issue #5: Token Cache Security Not Addressed 🔒

Location: Lines 856-905 (Token Caching Architecture)

Issue: The token cache section is excellent for performance but doesn't address security implications:

  1. Cache key includes subject token hash (line 868) - what if hash collisions?
  2. Redis cache (Phase 2) - how is Redis secured? TLS? Authentication?
  3. Token in cache - are cached tokens encrypted at rest?
  4. Cache poisoning - what prevents malicious cache entries?
  5. Cross-user leakage - what prevents user A from accessing user B's cached token?

Recommendation: Add security section to token caching:

### Token Cache Security

**Cache Key Isolation**: SHA-256 hash prevents collisions (probability < 10^-60 for practical key spaces)

**Redis Security** (Phase 2):
- TLS required for Redis connections (`rediss://` protocol)
- Redis authentication via password
- Network policies restrict Redis access to Virtual MCP pods only

**Token Encryption**:
- Cached tokens stored in plaintext in cache (performance trade-off)
- Cache backend must be trusted and secured
- Redis ACLs limit Virtual MCP to its key namespace only

**Threat Model**:
- **Compromised Redis**: Attacker gains cached tokens (limited lifetime)
  - Mitigation: Short TTLs, Redis security, monitoring
- **Cache poisoning**: Not possible (Virtual MCP is sole writer)
- **Cross-user access**: Prevented by subject token hash in key

**Deployment Recommendations**:
- Use TLS for all Redis connections
- Enable Redis AUTH
- Set `maxmemory-policy` to `allkeys-lru` for automatic eviction
- Monitor cache hit rates and token expiry
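The cache-key isolation argument is easiest to see in code; a small Go sketch of the assumed key derivation (the vmcp:token: prefix is illustrative):

package vmcp // hypothetical package name

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// cacheKey derives a token-cache key from the backend name and a SHA-256 hash
// of the caller's subject token, so one user's cached backend token can never
// be looked up with another user's credentials.
func cacheKey(backend, subjectToken string) string {
    sum := sha256.Sum256([]byte(subjectToken))
    return fmt.Sprintf("vmcp:token:%s:%s", backend, hex.EncodeToString(sum[:]))
}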

Minor Issue #6: CLI vs Kubernetes Configuration Drift Risk ⚠️

Location: Lines 254-484 (CLI), 488-641 (Kubernetes)

Issue: The proposal maintains two parallel configuration formats (YAML for CLI, CRD for Kubernetes). While the document states they share a "unified configuration model" (lines 964-975), there's risk of drift:

  • CLI uses snake_case (e.g., incoming_auth)
  • K8s uses camelCase (e.g., incomingAuth)
  • CLI uses inline source only (line 286)
  • K8s supports discovered source (line 527)

Recommendation: Add a configuration compatibility matrix:

### Configuration Feature Matrix

| Feature | CLI Support | Kubernetes Support | Notes |
|---------|-------------|-------------------|-------|
| Incoming auth (OIDC) | ✅ | ✅ | Equivalent |
| Token exchange | ✅ | ✅ | CLI=inline only, K8s=discovered/inline/mixed |
| Service account auth | ✅ | ✅ | Equivalent |
| Token cache (memory) | ✅ | ✅ | Equivalent |
| Token cache (Redis) | ✅ | ✅ | Equivalent |
| Backend auth discovery | ❌ | ✅ | K8s-only feature |
| Composite tools | ✅ | ❌ | CLI-only in Phase 2, K8s in Phase 3 |

**Rationale**: Backend discovery requires Kubernetes API access to list MCPServer resources.
Composite tools deferred for K8s due to state persistence requirements.

Minor Issue #7: Observability Strategy Still Missing 📊

The previous review identified this (Issue #9), and it's still not addressed. The success criteria previously mentioned "< 10ms routing overhead" (though I don't see it in the current version), and there are no details on how it will be measured.

Recommendation: Add observability section:

### Observability and Telemetry

Virtual MCP emits structured logs, metrics, and traces using OpenTelemetry (THV-0597):

**Metrics** (Prometheus format):
- `vmcp_requests_total{backend, tool, status}` - Request counters
- `vmcp_request_duration_seconds{backend}` - Request latency histogram
- `vmcp_token_exchange_total{backend, status}` - Token exchange counters
- `vmcp_token_cache_hits_total` / `_misses_total` - Cache performance
- `vmcp_backend_health{backend}` - Backend health status (0=unhealthy, 1=healthy)
- `vmcp_composite_workflows_active` - In-flight composite tool workflows
- `vmcp_composite_workflows_completed{status}` - Workflow outcomes

**Traces** (OTLP):
- Distributed traces span Virtual MCP → Backend calls
- Trace IDs propagated to backends via `traceparent` header
- Each composite tool step creates a child span

**Logs** (structured JSON):
- Request/response logging with sanitized data
- Token exchange events (success/failure)
- Backend health check results
- Circuit breaker state transitions
- Composite tool workflow lifecycle events

**Dashboards**:
- Pre-built Grafana dashboard included in `deploy/grafana/`
- Metrics aligned with ToolHive observability proposal (THV-0597)
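A Go sketch of how the counters and histogram above could be emitted with the OpenTelemetry Go API; instrument creation would normally happen once at startup rather than per call, and the integration point with ToolHive's existing telemetry wiring is an assumption:

package vmcp // hypothetical package name

import (
    "context"
    "time"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric"
)

// recordRequest emits the request counter and latency histogram named above.
func recordRequest(ctx context.Context, backend, tool, status string, elapsed time.Duration) error {
    meter := otel.Meter("vmcp")

    requests, err := meter.Int64Counter("vmcp_requests_total",
        metric.WithDescription("Requests routed per backend, tool and status"))
    if err != nil {
        return err
    }
    latency, err := meter.Float64Histogram("vmcp_request_duration_seconds",
        metric.WithDescription("End-to-end routing latency per backend"),
        metric.WithUnit("s"))
    if err != nil {
        return err
    }

    requests.Add(ctx, 1, metric.WithAttributes(
        attribute.String("backend", backend),
        attribute.String("tool", tool),
        attribute.String("status", status),
    ))
    latency.Record(ctx, elapsed.Seconds(), metric.WithAttributes(
        attribute.String("backend", backend),
    ))
    return nil
}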

Minor Issue #8: Relationship to MCPRemoteProxy Not Clarified 🤔

Context: The git log and proposal THV-2151 mention MCPRemoteProxy CRD. There's potential confusion about the relationship:

  • Virtual MCP: Aggregates multiple MCP servers from a ToolHive group
  • Remote Proxy: Proxies a single remote MCP server with auth

Are these related? Complementary? Should Virtual MCP be able to aggregate Remote Proxies?

Recommendation: Add a "Related Work" section:

### Related Work and Comparisons

**THV-2151: MCPRemoteProxy**
- **Purpose**: Proxy a single remote MCP server (e.g., external SaaS MCP)
- **Difference**: Virtual MCP aggregates multiple servers; Remote Proxy connects to one external server
- **Relationship**: Virtual MCP can include backends that are MCPRemoteProxy instances
- **Use case**: Virtual MCP aggregating both local and remote MCP servers

**THV-2063: Token Exchange Middleware**
- **Purpose**: Exchange tokens for a single backend
- **Difference**: Virtual MCP performs token exchange for multiple backends
- **Relationship**: Virtual MCP reuses the token exchange implementation
- **Use case**: Virtual MCP's per-backend authentication

**Kubernetes MCPGroup CRD** (kubernetes-mcpgroup-crd.md)
- **Purpose**: Logical grouping of MCP servers
- **Difference**: Virtual MCP references existing groups
- **Relationship**: Virtual MCP leverages MCPGroup for backend discovery
- **Use case**: Virtual MCP's `groupRef` field

Minor Issue #9: Elicitation Timeout Handling Incomplete ⏱️

Location: Lines 109-125, 411-417

Issue: Elicitation timeout is mentioned (line 413: timeout: "5m") and overall workflow timeout (line 390: timeout: "30m"), but behavior isn't fully specified:

  1. What happens to backend state if elicitation times out mid-workflow?
  2. Does timeout trigger on_cancel or a different handler?
  3. Are partial results returned or is the entire workflow aborted?

Recommendation: Clarify elicitation timeout behavior:

### Elicitation Timeout Behavior (lines 411-417)

When an elicitation times out:
1. **Workflow status**: Set to `timeout` (similar to `cancel`)
2. **Backend state**: Previous completed steps remain (no rollback)
3. **Response to client**: Error response with partial results from completed steps
4. **Handler**: Triggers `on_cancel` handler (line 416)
5. **Cleanup**: Workflow state removed from memory after response sent

**Timeout Hierarchy**:
- Per-elicitation timeout (e.g., 5m): Specific to one user prompt
- Workflow timeout (e.g., 30m): Overall composite tool execution
- Backend timeout (e.g., 30s-45s): Individual tool call to backend

If per-elicitation timeout exceeds workflow timeout, workflow timeout takes precedence.
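The timeout hierarchy maps naturally onto nested contexts in Go; a sketch under the assumption that the workflow deadline is carried on the incoming context:

package vmcp // hypothetical package name

import (
    "context"
    "time"
)

// runStepWithElicitation nests the per-prompt timeout inside the workflow
// deadline, so an elicitation timeout can never outlive the workflow itself.
func runStepWithElicitation(
    ctx context.Context, // carries the 30m workflow deadline
    elicitTimeout time.Duration, // e.g. 5m per user prompt
    askUser func(context.Context) (string, error),
) (string, error) {
    elicitCtx, cancel := context.WithTimeout(ctx, elicitTimeout)
    defer cancel()

    answer, err := askUser(elicitCtx)
    if err != nil {
        // context.DeadlineExceeded here would trigger the on_cancel handler
        // and return partial results; completed steps are kept, per the rules above.
        return "", err
    }
    return answer, nil
}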

Security Discussion from PR Comments

The PR discussion between @ChrisJBurns, @JAORMX, and @jhrozek raises important points that the proposal now addresses:

✅ Zero-Trust Model (ChrisJBurns concern)

PR Comment: "In the event that vMCP is ever exploited, would it be able to make calls to the backing MCP servers directly?"

Proposal Response (lines 179-188):

  • Each backend MUST enforce its own auth (line 179)
  • Token exchange requires valid user token (line 184)
  • Backend servers independently validate tokens (line 185)
  • Credentials are scoped per backend (line 189)

Assessment: Well addressed

⚠️ Token Exchange Consent Flow (jhrozek concern)

PR Comment: "However, we'd have to send the external API token along with the request -or- let the back end token servers exchange the tokens they receive for external API tokens."

Proposal Response: Lines 218-247 show token exchange flow, but...

Gap: Doesn't address whether consent chain is required for each backend or if Virtual MCP consent implies all backends.

Assessment: Partially addressed - see Critical Issue #1 above


Suggested Additions

1. Success Criteria Metrics (lines 1014-1036)

Current success criteria are feature-based. Add measurable criteria:

### Measurable Success Criteria

**Performance** (Phase 1):
- Routing overhead < 10ms (p99)
- Request latency increase < 5% vs direct backend calls
- Capacity: 1000 requests/second per Virtual MCP instance

**Reliability** (Phase 2):
- Backend health check accuracy > 99%
- Circuit breaker prevents cascading failures
- Composite tool success rate > 95% (excluding user cancellations)

**Security** (All Phases):
- Zero unauthorized backend access (audit logs verification)
- Token exchange failures result in request denial (no fallback)
- All backend tokens have correct audience claim

**Scalability** (Phase 3):
- Horizontal scaling to 10+ instances with Redis cache
- Cache hit rate > 80% for repeated requests
- Workflow state survives instance restarts

2. Migration Path (for existing deployments)

### Migration from Direct Backend Access

For organizations currently accessing MCP servers directly:

**Phase 1: Add Virtual MCP alongside existing servers**
1. Deploy Virtual MCP referencing existing group
2. Test with subset of users/clients
3. Verify tool filtering and overrides work correctly
4. Run dual-stack (direct + virtual) during validation

**Phase 2: Migrate authentication**
5. Configure per-backend token exchange
6. Test authentication flow with real users
7. Validate audit logging captures user identity

**Phase 3: Full cutover**
8. Update client configurations to Virtual MCP endpoint
9. (Optional) Disable direct backend access via network policies
10. Monitor metrics for regressions

**Rollback Plan**:
- Keep direct backend endpoints active during migration
- Virtual MCP is purely additive (no changes to backends)
- Clients can switch back to direct access if issues arise

Final Recommendations

Before Implementation Starts:

  1. Resolve Critical Issue #1 (consent chain): Clarify OAuth consent semantics for token exchange
  2. Resolve Critical Issue #2 (auth discovery): Document explicit precedence rules
  3. Decide on Major Issue #4 (streaming): Make Phase 1 streaming decision
  4. Address Major Issue #3 (state persistence): Consider moving persistent storage to Phase 2 or add a clear limitations warning

Before Phase 2 Implementation:

  1. Address Major Issue #5 (cache security): Add Redis security requirements
  2. Address Minor Issue #7 (observability): Define metrics and tracing strategy
  3. Address Minor Issue #9 (elicitation timeout): Clarify timeout behavior

Nice-to-Have Improvements:

  1. Minor Issue #6: Add configuration feature matrix
  2. Minor Issue #8: Clarify relationship to MCPRemoteProxy
  3. Suggested Addition #1: Add measurable success criteria
  4. Suggested Addition #2: Add migration path documentation

Summary

This is a strong, well-thought-out proposal that has evolved significantly based on feedback. The security model is now explicit and appropriate, token exchange integration is clear, and error handling is comprehensive.

The proposal is ready to move forward with the following caveats:

  1. Critical issues #1-2 should be resolved before implementation to avoid rework
  2. Major issue #4 (streaming) is a blocker for MVP and needs an immediate decision
  3. Major issue #3 (state persistence) may limit Phase 2 production viability - consider prioritizing persistent storage

Key Strengths:

  • Excellent security architecture with zero-trust model
  • Practical reuse of existing ToolHive components
  • Comprehensive error handling and failure modes
  • Clear phased approach with reasonable scope

Recommended Next Steps:

  1. Update proposal to address critical issues #1-2
  2. Make streaming support decision (Major issue #4)
  3. Consider persistent storage timeline (Major issue #3)
  4. Proceed with Phase 1 implementation



# ===== TOKEN CACHING =====
token_cache:
  provider: memory  # memory | redis | custom
Collaborator:

Would this be encrypted given it will be a token cache

Collaborator (Author):

You mean the memory one, or in general?

Collaborator:

in general


2. **Outgoing Authentication** (Virtual MCP → Backends)
- **Purpose**: Per-backend token exchange for accessing backend services
- **Token Audience**: Backend-specific (e.g., `aud=github-api`, `aud=jira-api`)
Contributor:

I'm confused when you say github-api do you mean a token issued by github's IDP meant to access github's API or a token issued by the same IDP as the vMCP meant to access the back end MCP server?

- **Configuration**: `incomingAuth` (K8s) / `incoming_auth` (CLI)
- **Implementation**: Uses existing ToolHive auth middleware (OIDC, local, anonymous)

2. **Outgoing Authentication** (Virtual MCP → Backends)
Contributor:

nit but let's use "Back end MCP" here (see my question below)


- **accept**: User provided data (accessible via `{{.steps.step_id.content}}`)
- **decline**: User explicitly rejected the request
- **cancel**: User dismissed without choosing
Contributor:

Note that elicitation pretty much kills scalability because you need to send the elicitation response to the same MCP server, so we /must/ keep vMCP a singleton in that case.


### Zero-Trust Security Model

Virtual MCP adheres to a zero-trust security architecture where compromising one layer does not automatically grant access to others:
Contributor:

I think this is good, but again I think we should extend the architecture all the way to the back end. The primary reason is that if we want to support an interactive flow to acquire credentials - either for the back end MCP servers that might be authed by a different IDP or the back end APIs, I think that we need to do the OAuth flow on the vMCP side because that's the only one a browser will interact with. In that case (back end interactive auth) we also need to transfer the back end creds, which means sending 2 creds (vMCP and back end) in one go.

TokenMgr->>Cache: Store token with TTL
end

TokenMgr->>Backend: Forward request + Bearer token (aud=backend-api)
Contributor:

do you think the token mgr should be forwarding the tokens or just caching? Asking because we have a pending issue to wrap TokenSource in a caching layer even for the usual token exchange in the middlewares and I think it would be cool to reuse here.

Collaborator (Author):

I'll take all I can re-use! I'm open to changing/enhancing this.

This proposal introduces a Virtual MCP Server that aggregates multiple
MCP servers from a ToolHive group into a single unified interface.

Key features:
- Leverages existing ToolHive groups for backend management
- Uses existing ToolsFilter and ToolOverride constructs
- Supports per-backend authentication strategies
- Enables composite tools for cross-service workflows
- Maintains full MCP protocol compatibility

The Virtual MCP Server will simplify client integration by providing a
single connection point while handling the complexity of multiple backend
authentication requirements and tool namespace management.

Signed-off-by: Juan Antonio Osorio <[email protected]>
@JAORMX force-pushed the proposal/virtual-mcp-server branch from ac1d31f to 53f8c5a on October 23, 2025 11:44