feat(control-plane): replace TokenReview with RSA keypair auth for runner token endpoint#1216
feat(control-plane): replace TokenReview with RSA keypair auth for runner token endpoint#1216markturansky merged 3 commits intoalphafrom
Conversation
…nner token endpoint CP bootstraps an RSA-4096 keypair Secret in its namespace on startup (via project kube client) if one does not exist. The private key decrypts bearer tokens sent by runners; the public key is injected into runner Job pods as AMBIENT_CP_TOKEN_PUBLIC_KEY. Runners encrypt their SESSION_ID with the public key (RSA-OAEP/SHA-256) and send the base64 ciphertext as the Authorization header. CP decrypts it to verify the caller is a legitimate runner without needing TokenReview cluster permissions. Survives CP restarts: keypair persists in the Kubernetes Secret. Future path: replace Secret with Vault-backed ExternalSecret, no code change needed. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughControl plane now bootstraps an RSA keypair stored as a Kubernetes Secret, provides the public key to runners via env, and the token server decrypts RSA-OAEP-encrypted bearer tokens from runners to extract session IDs and mint tokens. Changes
Sequence Diagram(s)sequenceDiagram
participant CP as Control Plane
participant K8s as Kubernetes API
participant Runner as Ambient Runner
participant TokenServer as Token Server
Note over CP: Startup/bootstrap
CP->>K8s: EnsureKeypairSecret(namespace)
K8s-->>CP: Return KeyPair (private.pem, public.pem)
CP->>Runner: Set AMBIENT_CP_TOKEN_PUBLIC_KEY env var (public.pem)
Note over Runner,TokenServer: Token exchange
Runner->>Runner: Load SESSION_ID
Runner->>Runner: Encrypt SESSION_ID with CP public key (RSA-OAEP SHA-256)
Runner->>Runner: Base64-encode ciphertext
Runner->>TokenServer: POST /token with Authorization: Bearer <ciphertext>
rect rgba(200,150,100,0.5)
TokenServer->>TokenServer: Base64-decode bearer token
TokenServer->>TokenServer: Decrypt with private key (RSA-OAEP)
TokenServer->>TokenServer: Validate session ID format
end
alt valid session ID
TokenServer->>TokenServer: Mint access token
TokenServer-->>Runner: 200 OK + token
else decryption failure
TokenServer-->>Runner: 401 Unauthorized
else invalid session ID
TokenServer-->>Runner: 403 Forbidden
end
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/ambient-control-plane/internal/keypair/bootstrap.go`:
- Around line 81-83: In keypairFromSecret, don't ignore the error returned by
unstructured.NestedMap(secret.Object, "data"); instead capture both the map and
the error (and the existence bool), validate that the call succeeded and return
a descriptive error if it failed or if the "data" field is missing/invalid so
downstream lookups on `data` don't panic—update keypairFromSecret to check the
returned error and existence flag and propagate a clear error when malformed
Secret data is detected.
In `@components/runners/ambient-runner/ambient_runner/_grpc_client.py`:
- Around line 37-48: The _encrypt_session_id function currently calls
serialization.load_pem_public_key(public_key_pem.encode()) without handling
malformed PEMs; wrap the load_pem_public_key call (and the subsequent encrypt)
in a try/except that catches ValueError (and optionally TypeError) raised for
invalid/malformed PEMs and raise or return a clear, domain-specific error
message (e.g., raise ValueError("Invalid CP public key PEM in
_encrypt_session_id: ...") or log and rethrow) so callers get a descriptive
failure instead of an unhandled exception; keep the rest of the RSA-OAEP flow
(padding, hashes, base64 encoding) unchanged and reference _encrypt_session_id
and the public_key_pem parameter when updating error text.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5ff26ea5-5556-4eb7-bff4-6b8048688778
⛔ Files ignored due to path filters (1)
components/runners/ambient-runner/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (9)
components/ambient-control-plane/cmd/ambient-control-plane/main.gocomponents/ambient-control-plane/internal/keypair/bootstrap.gocomponents/ambient-control-plane/internal/reconciler/kube_reconciler.gocomponents/ambient-control-plane/internal/tokenserver/handler.gocomponents/ambient-control-plane/internal/tokenserver/server.gocomponents/manifests/overlays/mpp-openshift/ambient-cp-token-netpol.yamlcomponents/manifests/overlays/mpp-openshift/kustomization.yamlcomponents/runners/ambient-runner/ambient_runner/_grpc_client.pycomponents/runners/ambient-runner/pyproject.toml
| func keypairFromSecret(secret *unstructured.Unstructured) (*KeyPair, error) { | ||
| data, _, _ := unstructured.NestedMap(secret.Object, "data") | ||
|
|
There was a problem hiding this comment.
Silently ignoring NestedMap error may mask corrupted Secret data.
If the Secret exists but has a malformed structure, the error is discarded and the code proceeds to fail confusingly on subsequent key lookups.
Proposed fix
func keypairFromSecret(secret *unstructured.Unstructured) (*KeyPair, error) {
- data, _, _ := unstructured.NestedMap(secret.Object, "data")
+ data, found, err := unstructured.NestedMap(secret.Object, "data")
+ if err != nil || !found {
+ return nil, fmt.Errorf("keypair secret has invalid or missing data field")
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func keypairFromSecret(secret *unstructured.Unstructured) (*KeyPair, error) { | |
| data, _, _ := unstructured.NestedMap(secret.Object, "data") | |
| func keypairFromSecret(secret *unstructured.Unstructured) (*KeyPair, error) { | |
| data, found, err := unstructured.NestedMap(secret.Object, "data") | |
| if err != nil || !found { | |
| return nil, fmt.Errorf("keypair secret has invalid or missing data field") | |
| } | |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/ambient-control-plane/internal/keypair/bootstrap.go` around lines
81 - 83, In keypairFromSecret, don't ignore the error returned by
unstructured.NestedMap(secret.Object, "data"); instead capture both the map and
the error (and the existence bool), validate that the call succeeded and return
a descriptive error if it failed or if the "data" field is missing/invalid so
downstream lookups on `data` don't panic—update keypairFromSecret to check the
returned error and existence flag and propagate a clear error when malformed
Secret data is detected.
| def _encrypt_session_id(public_key_pem: str, session_id: str) -> str: | ||
| """RSA-OAEP encrypt session_id with the CP public key, return base64-encoded ciphertext.""" | ||
| public_key = serialization.load_pem_public_key(public_key_pem.encode()) | ||
| ciphertext = public_key.encrypt( | ||
| session_id.encode(), | ||
| padding.OAEP( | ||
| mgf=padding.MGF1(algorithm=hashes.SHA256()), | ||
| algorithm=hashes.SHA256(), | ||
| label=None, | ||
| ), | ||
| ) | ||
| return base64.b64encode(ciphertext).decode() |
There was a problem hiding this comment.
Missing error handling for invalid public key PEM.
load_pem_public_key raises ValueError if the PEM is malformed. This would surface as an unhandled exception rather than a clear error message.
Proposed fix
def _encrypt_session_id(public_key_pem: str, session_id: str) -> str:
"""RSA-OAEP encrypt session_id with the CP public key, return base64-encoded ciphertext."""
- public_key = serialization.load_pem_public_key(public_key_pem.encode())
+ try:
+ public_key = serialization.load_pem_public_key(public_key_pem.encode())
+ except (ValueError, TypeError) as e:
+ raise RuntimeError(f"invalid CP token public key: {e}") from e
ciphertext = public_key.encrypt(📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _encrypt_session_id(public_key_pem: str, session_id: str) -> str: | |
| """RSA-OAEP encrypt session_id with the CP public key, return base64-encoded ciphertext.""" | |
| public_key = serialization.load_pem_public_key(public_key_pem.encode()) | |
| ciphertext = public_key.encrypt( | |
| session_id.encode(), | |
| padding.OAEP( | |
| mgf=padding.MGF1(algorithm=hashes.SHA256()), | |
| algorithm=hashes.SHA256(), | |
| label=None, | |
| ), | |
| ) | |
| return base64.b64encode(ciphertext).decode() | |
| def _encrypt_session_id(public_key_pem: str, session_id: str) -> str: | |
| """RSA-OAEP encrypt session_id with the CP public key, return base64-encoded ciphertext.""" | |
| try: | |
| public_key = serialization.load_pem_public_key(public_key_pem.encode()) | |
| except (ValueError, TypeError) as e: | |
| raise RuntimeError(f"invalid CP token public key: {e}") from e | |
| ciphertext = public_key.encrypt( | |
| session_id.encode(), | |
| padding.OAEP( | |
| mgf=padding.MGF1(algorithm=hashes.SHA256()), | |
| algorithm=hashes.SHA256(), | |
| label=None, | |
| ), | |
| ) | |
| return base64.b64encode(ciphertext).decode() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/runners/ambient-runner/ambient_runner/_grpc_client.py` around
lines 37 - 48, The _encrypt_session_id function currently calls
serialization.load_pem_public_key(public_key_pem.encode()) without handling
malformed PEMs; wrap the load_pem_public_key call (and the subsequent encrypt)
in a try/except that catches ValueError (and optionally TypeError) raised for
invalid/malformed PEMs and raise or return a clear, domain-specific error
message (e.g., raise ValueError("Invalid CP public key PEM in
_encrypt_session_id: ...") or log and rethrow) so callers get a descriptive
failure instead of an unhandled exception; keep the rest of the RSA-OAEP flow
(padding, hashes, base64 encoding) unchanged and reference _encrypt_session_id
and the public_key_pem parameter when updating error text.
… and runner encryption - keypair/bootstrap_test.go: generateKeypair, ParsePrivateKey, keypairFromSecret (valid/missing keys), EnsureKeypairSecret (creates on missing, reuses existing) - tokenserver/handler_test.go: success path, missing/wrong auth header, invalid base64, wrong RSA key, method not allowed, isValidSessionID table, decrypt round-trip - tests/test_grpc_client.py: URL validation, RSA-OAEP encrypt/decrypt round-trip, ciphertext randomness, fetch retries, HTTP error body capture, missing token field, from_env integration 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Summary
ambient-cp-token-keypair) in its namespace on startup via the project kube client; generates if missingAMBIENT_CP_TOKEN_PUBLIC_KEYinto all runner Job podsSESSION_IDwith the public key, send base64 ciphertext asAuthorization: BearerTokenReviewcluster permission requiredMotivation
The CP SA does not have (and cannot be granted via tenant operator) cluster-scoped
create tokenreviewspermission. The previousTokenReview-based validation returned 401 for all runners.Test plan
ambient-cp-token-keypairSecret created in CP namespace on first bootCP token endpoint unreachableerroracpctl session events $idstreams without 502🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Refactor
Chores
Tests