🤖 feat: CoderProvisioner drift detection, status conditions, and test coverage#46
🤖 feat: CoderProvisioner drift detection, status conditions, and test coverage#46ThomasK33 wants to merge 38 commits into
Conversation
…ecret loss Addresses Codex review feedback: - Deletion finalizer now gracefully handles missing CoderControlPlane or bootstrap credentials (common during namespace teardown). Logs a warning and proceeds to finalizer removal instead of blocking. - When the provisioner key secret is deleted but the key already exists in coderd, the controller now rotates the key (delete + recreate) to obtain fresh plaintext material for secret recovery. - Added test for best-effort deletion when control plane is deleted first.
…ble control plane Addresses additional Codex review feedback: - Finalizer cleanup now uses the key name from status (reflecting what was actually created in coderd) rather than the current spec value, preventing orphaned keys when spec.key.name is edited after creation. - All fetchControlPlane errors during deletion are treated as non-blocking (not just NotFound), handling the case where the control plane exists but has an empty status.url.
Addresses Codex round 3 feedback: - All token-read errors during deletion are now best-effort (not just NotFound). Handles wrong key names, empty values, etc. - Organization identity for finalizer cleanup uses status.OrganizationID (which may contain the resolved UUID) before falling back to spec.
Codex round 4: - DeleteProvisionerKey errors during deletion are now logged and skipped (best-effort), preventing transient API failures from blocking finalizer removal. - Secret existence check now also validates the configured data key is present and non-empty. A secret with a missing or empty key triggers key rotation/recovery instead of silently proceeding.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3849035117
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Addressed both review findings:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d3c6c313c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Addressed the |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 310f277580
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Addressed both findings:
|
|
@codex review Addressed both findings:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f6cb4149e9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Both findings addressed:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2132a72f17
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Fixed: delete failure in metadata-backfill now returns an error + |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c747d9077
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Fixed: removed |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 24a291cfff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Both findings are pre-existing design limitations, not issues introduced by this PR:
Both items are worth implementing but are scope expansions beyond this PR's goal of drift detection, conditions, and test coverage. The changes in this PR are all net-improvements to the controller's robustness. Pausing the review loop for human direction — Codex keeps surfacing valid pre-existing gaps rather than issues with the implemented changes. |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 24a291cfff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Addressed both:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3591763491
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Both addressed:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7517bc0c43
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Both addressed:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dc0144f219
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review Fixed: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c6151aaab5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "ProvisionerKeyReady", | ||
| "Provisioner key is available in coderd", | ||
| ) | ||
| } else if status.OrganizationName == "" || status.TagsHash == "" || status.ControlPlaneRefName == "" { |
There was a problem hiding this comment.
Backfill missing control-plane URL metadata
status.ControlPlaneURL is consumed later for URL drift detection and deletion cleanup, but this metadata-backfill branch only runs when org/tags/control-plane-ref fields are empty. If a provisioner has those fields populated while status.controlPlaneURL is empty (e.g., after upgrading from an intermediate revision or a status patch), this branch is skipped, so the URL never gets restored; that disables controlPlaneURLDrift and forces deletion cleanup to depend on fetching a live CoderControlPlane, which can skip key deletion when the control plane object is already gone.
Useful? React with 👍 / 👎.
Summary
Implements the three main follow-ups from the CoderProvisioner deep review: spec drift detection with automatic key rotation, meaningful status conditions, and expanded test coverage.
Background
The CoderProvisioner controller successfully deploys external provisioner daemon Pods and manages a provisioner key in coderd. A deep review identified three functional gaps:
kubectl describe.Implementation
Drift Detection + Rotation
OrganizationNameandTagsHashfields toCoderProvisionerStatusto record what was last applied to coderd.hashProvisionerTags()helper using sorted keys + FNV-1a.Status Conditions
ControlPlaneReady,BootstrapSecretReady,ProvisionerKeyReady,ProvisionerKeySecretReady,DeploymentReady.ControlPlaneReady=Falsewhen the control plane is unreachable).reconcileStatusto update fields individually instead of wholesale replacement, preserving conditions set earlier in the reconcile loop.Test Coverage
Other Improvements
MaxLength) on user-provided identifier fields.kubectl getprint columns (Phase, Replicas, Age)./coder-k8sto.gitignore.Validation
make verify-vendor✅make lint✅make build✅make test✅make test-integration✅make docs-reference-check✅Risks
📋 Implementation Plan
Plan: CoderProvisioner deep-review follow-ups (drift, conditions, tests)
Context / Why
The current
CoderProvisionerCRD + controller successfully deploys external provisioner daemon Pods and manages a provisioner key incoderd. A deep review found three functional gaps we want to close:kubectl describe.This plan implements drift detection (option a), adds meaningful conditions, expands test coverage, and ensures docs/samples do not leak secrets.
Evidence (review surface)
internal/controller/coderprovisioner_controller.goEnsureProvisionerKeywhen the Secret is missing/empty (if !secretUsable { ... }).reconcileStatusreplaces status wholesale, wipingConditions.internal/controller/coderprovisioner_controller_test.gointernal/coderbootstrap/provisionerkeys.goEnsureProvisionerKeyreturns plaintext key only on create; existing keys return metadata withKey == "".api/v1alpha1/coderprovisioner_types.goStatus.Conditionsbut nothing sets them today.Implementation details
1) Drift detection + rotation (option a)
Files:
api/v1alpha1/coderprovisioner_types.gointernal/controller/coderprovisioner_controller.go1.1 Add status fields to record what we last applied to coderd
Add to
CoderProvisionerStatus:OrganizationName string— the organization name used (defaulted todefault).TagsHash string— deterministic hash ofspec.tagslast applied to the provisioner key.Rationale: we need a persisted “previous desired state” to detect drift across reconciles.
1.2 Add deterministic tag hashing
In the controller, add a helper:
1.3 Rework the key/secret decision tree
In
Reconcile, compute “desired” values and compare to the “applied” status:desiredOrgName := provisionerOrganizationName(spec.organizationName)desiredKeyName, desiredSecretName, desiredSecretKey := provisionerKeyConfig(...)desiredTagsHash := hashProvisionerTags(spec.tags)Compute drift:
orgDrift := status.OrganizationName != "" && status.OrganizationName != desiredOrgNamekeyNameDrift := status.ProvisionerKeyName != "" && status.ProvisionerKeyName != desiredKeyNametagsDrift := status.TagsHash != "" && status.TagsHash != desiredTagsHashCompute other needs:
secretUsable := Secret exists AND has non-empty data at desiredSecretKey(current logic)metadataMissing := status.OrganizationID == "" || status.ProvisionerKeyID == "" || status.ProvisionerKeyName == ""Then:
orgDrift || keyNameDrift || tagsDrift: rotate (delete old key, create new key with desired tags), then update Secret.!secretUsable: ensure key; if response has emptyKey, rotate to recover plaintext.metadataMissing: callEnsureProvisionerKeyto populate IDs/name (Key may be empty, that’s OK).Rotation rules:
status.OrganizationName+status.ProvisionerKeyName(best-effort delete, but treat failure as fatal if it prevents create).EnsureProvisionerKey(desired...)and require non-empty plaintext key to update the Secret.1.4 Avoid unnecessary rotation on SecretRef changes
If
status.SecretRefexists andspec.key.secretNameorspec.key.secretKeychanges:This prevents needless disruption when users only rename the Secret or change the data key.
2) Populate and preserve Status.Conditions
Files:
api/v1alpha1/coderprovisioner_types.gointernal/controller/coderprovisioner_controller.go2.1 Add condition type constants
Add condition constants (API package) such as:
CoderProvisionerConditionControlPlaneReadyCoderProvisionerConditionBootstrapSecretReadyCoderProvisionerConditionProvisionerKeyReadyCoderProvisionerConditionProvisionerKeySecretReadyCoderProvisionerConditionDeploymentReady2.2 Update controller to set conditions (and not wipe them)
k8s.io/apimachinery/pkg/api/metaand usemeta.SetStatusCondition(&provisioner.Status.Conditions, cond).reconcileStatusto start from existing status, update fields, and preserve conditions.2.3 Best-effort condition updates on failure paths
For common failures (missing control plane URL, missing bootstrap secret/token, coderd API errors):
Falsewith a safe message.Status().Updatebest-effort (do not override the returned reconcile error).Security constraint: condition messages must never include session token or provisioner key material; only refer to secret names/keys and high-level errors.
3) Tests: drift/rotation/conditions + truncation
Files:
internal/controller/coderprovisioner_controller_test.gointernal/controller/workspaceproxy_controller_test.go(fake bootstrap client)3.1 Upgrade the fake bootstrap client to support sequences
Replace static
provisionerKeyResponsewith either:[]EnsureProvisionerKeyResponse) consumed per call, orEnsureProvisionerKeyFn(callN, req)) for complex scenarios.This is required to test “empty key → delete → recreate” flows.
3.2 Add envtest cases
Add tests to cover:
EnsureProvisionerKeyreturnsKey:"", verifyDeleteProvisionerKeythen second ensure returns plaintext.spec.tags, reconcile, verify delete+ensure and Secret updated.spec.key.name, reconcile, verify old key deleted and new key created.expectedProvisionerServiceAccountNamehelper to match real truncation logic).deployment.Status.ReadyReplicas=1and verifyPhase=ReadyandDeploymentReady=True.BootstrapSecretReady=False.4) Docs & samples: ensure no secret leakage
Files:
config/samples/coder_v1alpha1_coderprovisioner.yaml(optional) a short how-to doc under
docs/how-to/Keep samples using
credentialsSecretRefonly (no literal tokens/keys).Add commented, placeholder-only instructions for creating the bootstrap secret (e.g.,
kubectl create secret ... --from-literal=token=<...>).Re-run
make docs-referenceafter API/status changes and confirm generated docs contain no secret values.5) Other deep-review items (judgment)
Implement now (low effort, improves robustness):
CoderProvisionerStatus(parity with other types).MaxLength,Pattern) for user-provided identifier fields that become K8s names / Coder key identifiers.reconcileStatusdoes not accidentally erase conditions (covered above).Defer (nice-to-have):
DeleteProvisionerKeyto take a request struct for consistency with other bootstrap methods.Validation
Run after implementation:
make codegenmake manifestsmake docs-reference-checkmake lintmake testmake test-integrationmake buildGenerated with
mux• Model:anthropic:claude-opus-4-6• Thinking:xhigh• Cost:$19.36