Skip to content

fix(control-plane): inject AMBIENT_TOKEN into MCP sidecar at pod creation#1274

Merged
markturansky merged 1 commit intoalphafrom
fix/mcp-sidecar-ambient-token
Apr 10, 2026
Merged

fix(control-plane): inject AMBIENT_TOKEN into MCP sidecar at pod creation#1274
markturansky merged 1 commit intoalphafrom
fix/mcp-sidecar-ambient-token

Conversation

@markturansky
Copy link
Copy Markdown
Contributor

@markturansky markturansky commented Apr 10, 2026

Summary

  • The ambient-mcp sidecar uses the Go SDK which requires AMBIENT_TOKEN to start
  • Fetches the CP's current API token via factory.Token(ctx) at pod-creation time and injects it as AMBIENT_TOKEN into the sidecar env
  • If token fetch fails, logs a warning and starts the sidecar without it (non-fatal)

Fixes: MCP sidecar crash AMBIENT_TOKEN is required (seen in mpp-openshift runner pods)

Test plan

  • Deploy mpp-openshift overlay with this change
  • Start a session and confirm the ambient-mcp sidecar starts successfully (no AMBIENT_TOKEN is required crash)
  • Confirm MCP tools are reachable from the runner at http://localhost:8090

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Improved MCP sidecar pod resilience. Ambient token retrieval failures now trigger warning logs instead of blocking pod creation, allowing the configuration process to continue with default settings. Sidecar environment variables are now dynamically constructed based on available tokens, enabling more flexible deployment configurations.

…tion

The ambient-mcp sidecar uses the Go SDK which requires AMBIENT_TOKEN.
Fetch the CP's current API token at pod-creation time via the factory's
Token() method and inject it as AMBIENT_TOKEN into the sidecar env.

Fixes: MCP sidecar crash "AMBIENT_TOKEN is required"

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 10, 2026

📝 Walkthrough

Walkthrough

Modified the MCP sidecar provisioning logic to fetch ambient tokens from a factory and inject them as environment variables into sidecar containers, with graceful fallback on token fetch failures.

Changes

Cohort / File(s) Summary
MCP Token Integration
components/ambient-control-plane/internal/reconciler/kube_reconciler.go
Updated ensurePod to fetch MCP ambient tokens via r.factory.Token(ctx) when useMCPSidecar is true; logs warning on fetch failure and proceeds with empty token. Refactored buildMCPSidecar to accept ambientToken string parameter and conditionally populate AMBIENT_TOKEN environment variable only when token is non-empty.

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

Check name Status Explanation Resolution
Security And Secret Handling ❌ Error PR injects control plane OIDC token directly as AMBIENT_TOKEN environment variable, exposing credentials and creating token expiration issues in long-lived pods. Use projected service account token volume with token exchange via AMBIENT_CP_TOKEN_URL endpoint for renewable, least-privilege credentials.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Kubernetes Resource Safety ⚠️ Warning Pod and Secret resources lack ownerReferences for garbage collection; pod spec missing pod-level securityContext despite container-level controls. Add ownerReferences to Pod and Secret metadata with Controller=true; add pod-level securityContext (runAsNonRoot, fsGroup) to pod spec matching operator patterns.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title follows Conventional Commits format (fix(control-plane):) and accurately describes the main change: injecting AMBIENT_TOKEN into MCP sidecar.
Performance And Algorithmic Complexity ✅ Passed Token() executes once per pod creation with early exit for existing pods; OIDC provider caches with 30-second TTL, O(1) cached lookups, no algorithmic complexity issues.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/mcp-sidecar-ambient-token
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix/mcp-sidecar-ambient-token

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/ambient-control-plane/internal/reconciler/kube_reconciler.go`:
- Around line 444-449: The current code silently swallows failures from
r.factory.Token and appends an MCP sidecar with an empty ambientToken which
produces a broken pod; change the logic so token resolution is fatal: if
r.factory.Token(ctx) returns an error, propagate/return that error (or otherwise
abort pod creation) instead of setting ambientToken = "" and calling
r.buildMCPSidecar; ensure ensurePod/ensure flow does not create or mark a pod as
Running when the token is missing. Update the analogous token-fetch site (the
other occurrence around r.buildMCPSidecar) to follow the same fatal behavior so
no pod with a missing AMBIENT_TOKEN is ever created.
- Around line 444-449: The code currently calls r.factory.Token(ctx) and injects
that control-plane OIDC access token into the MCP sidecar env (ambientToken) via
buildMCPSidecar, which exposes a long-lived control-plane credential to the pod;
instead change buildMCPSidecar so it does NOT accept or embed
r.factory.Token(ctx) directly — remove the ambientToken env injection and
instead configure the sidecar to use a renewable, least-privilege credential
flow (e.g., reference a projected ServiceAccount token file or a
TokenRequest-based short-lived token, or have the sidecar call a local
control-plane token proxy endpoint); update places that call
r.factory.Token(ctx) (the TokenProvider) to stop passing the raw token into pod
spec and instead ensure tokens are obtained at runtime via the chosen
projection/proxy mechanism.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 01bda881-8d36-4578-9892-d66b7517ba2f

📥 Commits

Reviewing files that changed from the base of the PR and between 6c8f09c and bbf5b3d.

📒 Files selected for processing (1)
  • components/ambient-control-plane/internal/reconciler/kube_reconciler.go

Comment on lines +444 to +449
ambientToken, err := r.factory.Token(ctx)
if err != nil {
r.logger.Warn().Err(err).Str("session_id", session.ID).Msg("failed to fetch token for MCP sidecar; sidecar will start without AMBIENT_TOKEN")
ambientToken = ""
}
containers = append(containers, r.buildMCPSidecar(ambientToken))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Make MCP token resolution fatal here

ambient-mcp does not have a fallback mode: it exits immediately when AMBIENT_TOKEN is missing (components/ambient-mcp/main.go:19-23). So this path still creates a broken pod, and because ensurePod returns early for existing pods on Line 394, a transient token-fetch failure can leave the session permanently stuck with a dead sidecar while the reconciler still advances it to Running on Line 165.

Possible fix
 	if useMCPSidecar {
 		ambientToken, err := r.factory.Token(ctx)
 		if err != nil {
-			r.logger.Warn().Err(err).Str("session_id", session.ID).Msg("failed to fetch token for MCP sidecar; sidecar will start without AMBIENT_TOKEN")
-			ambientToken = ""
+			return fmt.Errorf("fetching token for MCP sidecar: %w", err)
 		}
 		containers = append(containers, r.buildMCPSidecar(ambientToken))
 		r.logger.Info().Str("session_id", session.ID).Msg("MCP sidecar enabled for session")
 	}

As per coding guidelines, "Flag only errors, security risks, or functionality-breaking problems."

Also applies to: 832-834

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/ambient-control-plane/internal/reconciler/kube_reconciler.go`
around lines 444 - 449, The current code silently swallows failures from
r.factory.Token and appends an MCP sidecar with an empty ambientToken which
produces a broken pod; change the logic so token resolution is fatal: if
r.factory.Token(ctx) returns an error, propagate/return that error (or otherwise
abort pod creation) instead of setting ambientToken = "" and calling
r.buildMCPSidecar; ensure ensurePod/ensure flow does not create or mark a pod as
Running when the token is missing. Update the analogous token-fetch site (the
other occurrence around r.buildMCPSidecar) to follow the same fatal behavior so
no pod with a missing AMBIENT_TOKEN is ever created.

⚠️ Potential issue | 🟠 Major

Do not inject the control-plane bearer token directly into the pod

r.factory.Token(ctx) is returning the control plane's current OIDC access token (components/ambient-control-plane/internal/auth/token_provider.go:51-69). Putting that raw token into the sidecar env means: 1) long-lived sessions will lose MCP API access when the token expires, and 2) the control-plane credential is now exposed anywhere this pod spec or container env is readable. This needs a renewable, least-privilege credential flow instead of copying the control plane's current token into workload pods.

As per coding guidelines, "Flag only errors, security risks, or functionality-breaking problems."

Also applies to: 825-846

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/ambient-control-plane/internal/reconciler/kube_reconciler.go`
around lines 444 - 449, The code currently calls r.factory.Token(ctx) and
injects that control-plane OIDC access token into the MCP sidecar env
(ambientToken) via buildMCPSidecar, which exposes a long-lived control-plane
credential to the pod; instead change buildMCPSidecar so it does NOT accept or
embed r.factory.Token(ctx) directly — remove the ambientToken env injection and
instead configure the sidecar to use a renewable, least-privilege credential
flow (e.g., reference a projected ServiceAccount token file or a
TokenRequest-based short-lived token, or have the sidecar call a local
control-plane token proxy endpoint); update places that call
r.factory.Token(ctx) (the TokenProvider) to stop passing the raw token into pod
spec and instead ensure tokens are obtained at runtime via the chosen
projection/proxy mechanism.

@markturansky markturansky merged commit ad938e2 into alpha Apr 10, 2026
36 checks passed
@markturansky markturansky deleted the fix/mcp-sidecar-ambient-token branch April 10, 2026 01:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant