This repository was archived by the owner on Jan 30, 2026. It is now read-only.

[HYPERFLEET-273]: Implement MVP pull-secret job for GCP Secret Manager with Helm deployment#2

Merged
ciaranRoche merged 11 commits into openshift-hyperfleet:main from ldornele:HYPERFLEET-273
Jan 8, 2026

Conversation

Contributor

@ldornele ldornele commented Dec 16, 2025

This PR sets up Workload Identity and creates the service account the job needs to store pull secrets in GCP Secret Manager. HYPERFLEET-273

Summary by CodeRabbit

  • New Features

    • Added a pull-secret job and CLI to create/update Docker pull secrets in Google Cloud Secret Manager with validation and retry logic.
    • Helm chart and Kubernetes manifests to deploy the pull-secret adapter.
  • Documentation

    • Helm chart README with install/config/usage and troubleshooting.
    • Comprehensive test documentation for the pull-secret workflow.
  • Tests

    • Extensive unit tests covering validation, retry behavior, task creation, and logging.
  • Chores

    • Enabled govet shadowing checks in linting.


coderabbitai bot commented Dec 16, 2025

Walkthrough

This PR adds a pull-secret job runner and CLI for Google Cloud Secret Manager. It introduces a PullSecretJob and PullSecretTask implementing task discovery and processing (env-driven config, validation, dry-run, create-or-update secret, add version, verify with retries). A Cobra-based entrypoint wires a JobCommand. Extensive unit tests, Helm chart and templates for Kubernetes deployment, a README for tests, and enabling govet shadow checks were also added.

Sequence Diagram

sequenceDiagram
    autonumber
    participant User
    participant CLI as pull-secret CLI
    participant Registry as Job Registry
    participant Job as PullSecretJob
    participant Task as PullSecretTask
    participant SM as GCP Secret Manager

    User->>CLI: run pull-secret (env vars set)
    CLI->>Registry: NewJobCommand(ctx) / register jobs
    Registry->>Job: GetTasks()
    Job->>Job: read env (GCP_PROJECT_ID, CLUSTER_ID, SECRET_NAME, PULL_SECRET_DATA)
    Job->>Task: emit PullSecretTask
    Registry-->>CLI: tasks list
    CLI->>Task: Process(ctx)
    Task->>Task: validateConfig & validatePullSecret
    alt Dry-run
        Task->>Task: log structured result (dry-run)
    else Normal flow
        Task->>SM: init client
        Task->>SM: secretExists
        alt exists
            Task->>SM: addSecretVersion
        else not exists
            Task->>SM: createSecret
            Task->>SM: addSecretVersion
        end
        loop verify with backoff
            Task->>SM: verifySecret
            SM-->>Task: status
        end
        Task->>Task: log structured result
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Suggested reviewers

  • ciaranRoche
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 92.86%, which is sufficient; the required threshold is 80.00%.
  • Title check: ✅ Passed. The title accurately describes the main changeset: implementing a pull-secret job for GCP Secret Manager and adding Helm deployment configuration.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (5)
k8s/serviceaccount.yaml (1)

5-5: Consider using a dedicated namespace instead of default.

For production deployments, it's recommended to use a dedicated namespace (e.g., hyperfleet-system as mentioned in the README production considerations). This provides better isolation and resource management.

-  namespace: default
+  namespace: hyperfleet-system

Note: If you change the namespace, ensure consistency across all manifests (job.yaml) and update the documentation accordingly.

k8s/deploy.sh (3)

19-20: Replace fixed sleep with kubectl wait for better reliability.

Fixed sleep durations are fragile: ServiceAccount creation may take more or less time depending on cluster load. Consider using kubectl wait for a deterministic readiness check.

-# Wait a bit for SA to be created
-sleep 2
+# Wait for ServiceAccount to be created
+kubectl wait --for=jsonpath='{.metadata.name}'=pullsecret-adapter sa/pullsecret-adapter --timeout=30s

28-29: Replace fixed sleep with kubectl wait for job readiness.

Similar to the ServiceAccount creation, use kubectl wait to ensure the job has started before querying its status.

-# Wait for job to start
-echo "Waiting for job to start..."
-sleep 3
+# Wait for job to be created
+echo "Waiting for job to be ready..."
+kubectl wait --for=condition=ready pod -l app=pullsecret-adapter --timeout=60s || true

41-42: Add error handling for missing pods.

If the pod hasn't been created yet or fails to start, this command will fail. Add validation to handle cases where no pods exist.

 # Get pod name
-POD_NAME=$(kubectl get pods -l app=pullsecret-adapter -o jsonpath='{.items[0].metadata.name}')
+POD_NAME=$(kubectl get pods -l app=pullsecret-adapter -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
+
+if [ -z "$POD_NAME" ]; then
+  echo "Warning: No pod found yet. Use 'kubectl get pods -l app=pullsecret-adapter' to check status."
+  POD_NAME="<pod-name>"
+fi
k8s/job.yaml (1)

5-5: Consider using a dedicated namespace.

Consistent with the ServiceAccount, consider moving this Job to a dedicated namespace (e.g., hyperfleet-system) for better isolation and alignment with production best practices mentioned in the README.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbf7c08 and e4cb82f.

📒 Files selected for processing (4)
  • k8s/README.md (1 hunks)
  • k8s/deploy.sh (1 hunks)
  • k8s/job.yaml (1 hunks)
  • k8s/serviceaccount.yaml (1 hunks)
🔇 Additional comments (7)
k8s/serviceaccount.yaml (1)

1-7: Workload Identity configuration looks correct.

The ServiceAccount is properly annotated for GCP Workload Identity binding. This will allow pods using this ServiceAccount to authenticate to GCP services without requiring service account keys.

k8s/README.md (1)

1-163: Documentation is comprehensive and well-structured.

The README provides clear guidance on prerequisites, deployment options, monitoring, troubleshooting, and production considerations. The structure and content will help users successfully deploy and manage the Pull Secret Job.

k8s/deploy.sh (1)

1-11: Script setup and cluster validation look good.

The use of set -e ensures the script fails fast on errors, and the cluster connectivity check provides early validation before attempting deployments.

k8s/job.yaml (4)

21-21: Dev image tag in use.

The image uses a dev build tag (dev-f1bf914). Ensure this is intentional for testing. For production deployments, use a stable, versioned tag.


39-46: Excellent security context configuration.

The pod security settings follow best practices: running as non-root, using a read-only root filesystem, disabling privilege escalation, and dropping all capabilities. This minimizes the attack surface.


10-11: Retry and TTL configuration is appropriate.

The backoffLimit: 3 provides reasonable retry attempts, and ttlSecondsAfterFinished: 3600 keeps completed jobs for one hour, which is useful for debugging while preventing indefinite resource accumulation.


32-38: Resource limits are well-defined.

The CPU and memory requests/limits are appropriate for a lightweight job that interacts with GCP Secret Manager. The 100m CPU request is efficient, and the 512Mi memory limit provides a reasonable buffer.

k8s/job.yaml Outdated
Comment on lines +30 to +31
- name: PULL_SECRET_DATA
value: '{"auths":{"registry.redhat.io":{"auth":"dGVzdDp0ZXN0","email":"test@example.com"}}}'

⚠️ Potential issue | 🟠 Major

Move sensitive credentials to Kubernetes Secrets.

Hardcoding sensitive data like PULL_SECRET_DATA directly in the manifest is a security risk, even for test credentials. The base64-encoded auth value and email address should be stored in a Kubernetes Secret and referenced via valueFrom.

Create a Secret first:

apiVersion: v1
kind: Secret
metadata:
  name: pullsecret-data
  namespace: default
type: Opaque
stringData:
  pull-secret-data: '{"auths":{"registry.redhat.io":{"auth":"dGVzdDp0ZXN0","email":"test@example.com"}}}'

Then reference it in the Job:

         - name: PULL_SECRET_DATA
-          value: '{"auths":{"registry.redhat.io":{"auth":"dGVzdDp0ZXN0","email":"test@example.com"}}}'
+          valueFrom:
+            secretKeyRef:
+              name: pullsecret-data
+              key: pull-secret-data
🤖 Prompt for AI Agents
In k8s/job.yaml around lines 30-31, the PULL_SECRET_DATA env var currently
contains hardcoded sensitive JSON; create a Kubernetes Secret (e.g., name:
pullsecret-data, key: pull-secret-data) containing that JSON and remove the
inline value, then update the Job's env entry to use valueFrom.secretKeyRef
referencing the secret name and key so the secret is mounted from the cluster
rather than hardcoded in the manifest.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/ci.yml (1)

61-62: Remove incorrect working-directory setting from the build and test jobs.

The workflow specifies working-directory: adapter-pull-secret in defaults (lines 39-41 and 66-68), but this directory does not exist in the repository. The Makefile and all source files are at the repository root. The workflow will fail when attempting to execute commands from a non-existent directory. Remove the working-directory setting or change it to . if explicitly needed.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4cb82f and 317565b.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (5 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
cmd/pull-secret/jobs/pull_secret.go (3)

26-28: Remove unused variable.

PullSecretJobArgs is declared but never used in the codebase.

Apply this diff:

-var PullSecretJobArgs struct {
-	name string
-}
-
 type PullSecretJob struct {

111-116: Consider making retry count configurable.

The retry count is hardcoded to 3 in multiple places (lines 113, 121). Consider making this configurable via an environment variable or constant for operational flexibility.

Example:

+const maxRetries = 3
+
 func (e PullSecretTask) Process(ctx context.Context) error {
 	// ... validation code ...
 	
 	// Create or update secret with retry logic
-	if err := retryWithBackoff(ctx, func() error {
+	if err := retryWithBackoff(ctx, func() error {
 		return e.createOrUpdateSecret(ctx, client)
-	}, 3); err != nil {
+	}, maxRetries); err != nil {

304-307: Jitter implementation always adds instead of randomizing.

The jitter calculation always adds 20% to the backoff, rather than adding a random value within ±20%. This makes the backoff predictable rather than randomized, which reduces its effectiveness in preventing thundering herd scenarios.

Consider using true randomization:

+import "math/rand"
+
 // Calculate backoff with jitter (±20%)
 baseBackoff := time.Duration(1<<uint(i)) * time.Second
-jitter := time.Duration(float64(baseBackoff) * 0.2)
-backoff := baseBackoff + jitter
+jitterRange := float64(baseBackoff) * 0.2
+jitter := time.Duration((rand.Float64() * 2 - 1) * jitterRange)
+backoff := baseBackoff + jitter
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 317565b and 0c1b4de.

📒 Files selected for processing (5)
  • cmd/pull-secret/jobs/README_TESTS.md (1 hunks)
  • cmd/pull-secret/jobs/cmd.go (1 hunks)
  • cmd/pull-secret/jobs/pull_secret.go (1 hunks)
  • cmd/pull-secret/jobs/pull_secret_test.go (1 hunks)
  • cmd/pull-secret/main.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
cmd/pull-secret/main.go (1)
cmd/pull-secret/jobs/cmd.go (1)
  • NewJobCommand (10-23)
cmd/pull-secret/jobs/cmd.go (2)
pkg/job/job.go (2)
  • NewJobRegistry (125-127)
  • CommandBuilder (36-48)
cmd/pull-secret/jobs/pull_secret.go (1)
  • PullSecretJob (30-31)
cmd/pull-secret/jobs/pull_secret_test.go (1)
cmd/pull-secret/jobs/pull_secret.go (2)
  • PullSecretTask (19-24)
  • PullSecretJob (30-31)
cmd/pull-secret/jobs/pull_secret.go (1)
pkg/job/job.go (2)
  • Task (137-140)
  • Metadata (17-20)
🪛 Gitleaks (8.30.0)
cmd/pull-secret/jobs/pull_secret.go

[high] 49-49: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key; reported 4 times on this line)

🪛 LanguageTool
cmd/pull-secret/jobs/README_TESTS.md

[uncategorized] ~158-~158: The official name of this software platform is spelled with a capital “H”.
Context: ... - Test check with coverage report See .github/workflows/ci.yml for CI configuration....

(GITHUB)

🔇 Additional comments (14)
cmd/pull-secret/jobs/pull_secret.go (11)

47-50: Clarify the fake pull secret is for testing only.

The static analysis tool flagged line 49 as containing potential secrets. While this is clearly fake test data (as indicated by the comment), consider adding a more explicit marker or moving this to a test fixture to avoid confusion.

This is a false positive from Gitleaks, but it may confuse security scanners. The base64-decoded value is literally "fakeuser:fakepassword".


67-78: LGTM!

The metadata, flags, and worker count methods are correctly implemented. Single-worker execution is appropriate for this secret management job to avoid race conditions.


131-146: LGTM!

The configuration validation is thorough and provides clear error messages for each missing required field.


148-181: LGTM!

The create-or-update logic correctly handles both new secret creation and version addition. Timing instrumentation provides good observability.


183-200: LGTM!

The secret existence check correctly distinguishes between "not found" (returns false) and actual errors, which is the right pattern for this use case.


202-229: LGTM!

The secret creation includes comprehensive labels for resource management and uses automatic replication, which is appropriate for high availability.


231-248: LGTM!

The secret version addition is correctly implemented and returns the version name for audit logging.


250-268: LGTM!

The verification step ensures the secret is accessible and logs useful metrics (timing and payload size) for observability.


270-288: LGTM!

The pull secret validation correctly checks JSON structure and ensures the required "auths" key is present with valid content.


321-328: LGTM!

The retry logic correctly identifies transient gRPC errors that are appropriate to retry.


330-356: LGTM!

The structured logging implementation provides consistent JSON output with all relevant context fields. Output to stdout is appropriate for containerized environments.

cmd/pull-secret/jobs/cmd.go (1)

10-23: LGTM!

The job command initialization correctly registers the PullSecretJob and wires it into the Cobra command structure.

cmd/pull-secret/main.go (1)

1-52: LGTM!

The main entry point is well-structured with:

  • Proper context initialization
  • Container-aware GOMAXPROCS configuration
  • glog compatibility handling (with explanatory comments)
  • Clean error handling and logging
  • Correct Cobra command wiring
cmd/pull-secret/jobs/pull_secret_test.go (1)

1-533: LGTM!

The test suite is comprehensive and well-structured:

  • Table-driven tests for all unit-testable functions
  • Proper environment variable management with cleanup
  • Context cancellation testing
  • Retry logic validation across error codes
  • Good coverage of edge cases

The acknowledged gap in GCP Secret Manager integration tests is reasonable given the need for mocking infrastructure.

}

func (e PullSecretTask) TaskName() string {
return "pull-secret-mvp"
nit: We should create a constant for this.

Contributor Author
Done!


// Use fake pull secret for testing if PULL_SECRET_DATA is not provided
if pullSecretData == "" {
pullSecretData = `{"auths":{"cloud.openshift.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"quay.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.connect.redhat.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.redhat.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"}}}`
Same comment here: we should keep this as a constant. That will help if we decide to enhance this code later.

Contributor Author
Done!

}
}

func (pullsecretJob *PullSecretJob) AddFlags(flags *pflag.FlagSet) {}
We should add a dry-run flag here. It can help validate the authentication locally without needing to upload/delete the secret every time.

Contributor Author
Done!

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
cmd/pull-secret/jobs/pull_secret.go (3)

23-23: Acknowledge the static analysis false positive.

The fake credentials are clearly documented, but Gitleaks flagged them as potential API keys. Consider adding an inline ignore comment to silence the warning and prevent confusion.

🔎 Proposed fix
 	// defaultPullSecretData is a fake pull secret used for testing when PULL_SECRET_DATA is not provided
+	// gitleaks:allow
 	defaultPullSecretData = `{"auths":{"cloud.openshift.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"quay.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.connect.redhat.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.redhat.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"}}}`
 )

135-145: Consider extracting retry count to a constant.

The maximum retry count of 3 is hardcoded in two places (lines 137 and 145). Extracting this to a package-level constant would improve maintainability and make it easier to tune retry behavior.

🔎 Proposed refactor

Add to the constants section:

const (
	pullSecretTaskName = "pull-secret-mvp"
	maxRetries         = 3
	
	// defaultPullSecretData is a fake pull secret used for testing when PULL_SECRET_DATA is not provided
	// gitleaks:allow
	defaultPullSecretData = `{"auths":{"cloud.openshift.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"quay.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.connect.redhat.com":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"},"registry.redhat.io":{"auth":"ZmFrZXVzZXI6ZmFrZXBhc3N3b3Jk","email":"user@example.com"}}}`
)

Then update the retry calls:

 	// Create or update secret with retry logic
 	if err := retryWithBackoff(ctx, func() error {
 		return e.createOrUpdateSecret(ctx, client)
-	}, 3); err != nil {
+	}, maxRetries); err != nil {
 		logStructured("error", e.ClusterID, e.GCPProjectID, "create-update-secret", 0, fmt.Sprintf("Failed to create/update secret: %v", err), "")
 		return err
 	}
 
 	// Verify secret is accessible
 	if err := retryWithBackoff(ctx, func() error {
 		return e.verifySecret(ctx, client)
-	}, 3); err != nil {
+	}, maxRetries); err != nil {
 		logStructured("error", e.ClusterID, e.GCPProjectID, "verify-secret", 0, fmt.Sprintf("Failed to verify secret: %v", err), "")
 		return err
 	}

333-333: Consider using structured logging for consistency.

The retry logging uses log.Printf, while the rest of the codebase uses logStructured for structured JSON output. Using structured logging here would improve consistency and log parsing.

🔎 Proposed refactor
-			log.Printf("Retry %d/%d after %s: %v", i+1, maxRetries, backoff, err)
+			logStructured("info", "", "", "retry", 0, 
+				fmt.Sprintf("Retry %d/%d after %s: %v", i+1, maxRetries, backoff, err), "")

Note: This assumes cluster_id and gcp_project are unavailable in this utility context. Alternatively, refactor to pass these as parameters if structured logging with full context is desired.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a2da97 and 4340c17.

📒 Files selected for processing (1)
  • cmd/pull-secret/jobs/pull_secret.go (1 hunks)
🧰 Additional context used
🪛 Gitleaks (8.30.0)
cmd/pull-secret/jobs/pull_secret.go

[high] 23-23: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key; reported 4 times on this line)

🔇 Additional comments (3)
cmd/pull-secret/jobs/pull_secret.go (3)

42-90: LGTM!

The job interface implementation is clean and correct. The auto-derivation of secretName from clusterID (lines 62-63) and the fallback to fake credentials for testing (lines 57-58) are well-handled.


155-312: LGTM!

The helper methods are well-structured with appropriate error handling. The label set in createSecret (lines 237-243) provides good resource tracking metadata, and the validation logic in validatePullSecret (lines 301-309) correctly ensures the pull secret has the required structure.


354-380: LGTM!

The structured logging implementation is solid. The fallback to log.Printf in the error case (line 375) is appropriate defensive coding when JSON marshaling fails.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
cmd/pull-secret/jobs/pull_secret.go (3)

8-8: Seed the random number generator for proper jitter distribution.

Before Go 1.20, the math/rand global source must be seeded explicitly; without seeding, jitter values are deterministic across runs, which defeats the purpose of preventing thundering herd scenarios in retry backoff. (Since Go 1.20 the global source is auto-seeded, so this only matters on older toolchains or when constructing an unseeded rand.Rand.)

🔎 Recommended fix

Option 1: Seed math/rand in an init() function:

+func init() {
+	rand.Seed(time.Now().UnixNano())
+}

Option 2 (preferred for Go 1.22+): use math/rand/v2, which is automatically seeded:

-	"math/rand"
+	"math/rand/v2"

The rand.Float64() call keeps the same signature, so no other changes are needed. Cryptographic randomness is unnecessary here; jitter only needs statistical spread.

234-240: Consider making the hyperfleet-version label configurable.

The hard-coded "hyperfleet-version": "v1" label may require updates in future releases. While not critical for the initial implementation, consider making this configurable or deriving it from a build-time constant.


332-332: Consider using consistent logging approach.

Line 332 uses log.Printf for retry messages, while the rest of the code uses logStructured for JSON-formatted logs. Similarly, line 378 uses fmt.Println instead of log.Printf (used on line 374 for errors).

For consistency with the structured logging pattern used throughout the file, consider converting the retry log to structured format:

-			log.Printf("Retry %d/%d after %s: %v", i+1, maxRetries, backoff, err)
+			// Note: Would need cluster_id and gcp_project passed to retryWithBackoff
+			logStructured("warn", "", "", "retry", 0, 
+				fmt.Sprintf("Retry %d/%d after %s: %v", i+1, maxRetries, backoff, err), "")

Alternatively, if unstructured logs are acceptable for retry messages, use log.Printf consistently on line 378 as well.

Also applies to: 378-378

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4340c17 and 3455f69.

📒 Files selected for processing (1)
  • cmd/pull-secret/jobs/pull_secret.go (1 hunks)
🧰 Additional context used
🪛 Gitleaks (8.30.0)
cmd/pull-secret/jobs/pull_secret.go

[high] 24-24: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key; reported 4 times on this line)

🔇 Additional comments (7)
cmd/pull-secret/jobs/pull_secret.go (7)

24-24: Static analysis false positive: test credentials are intentionally fake.

The Gitleaks alerts on this line are false positives. The defaultPullSecretData constant contains clearly documented fake credentials for testing purposes (base64 decodes to "fakeuser:fakepassword"). This is appropriate for a test fallback.


27-37: LGTM!

The type definitions are clean and appropriate. The DryRun field is properly propagated from job to task.


74-87: LGTM!

The job interface methods are correctly implemented. Worker count of 1 is appropriate for this single-task job.


89-150: Well-structured processing logic with comprehensive error handling.

The Process method correctly orchestrates validation, client initialization, secret operations, and verification with appropriate retry logic and structured logging. The dry-run mode implementation is clean and informative.


325-330: Jitter calculation is now correct.

The jitter implementation correctly produces values in the range ±20% of baseBackoff, addressing the concern from the previous review. The formula (rand.Float64()*2 - 1) * jitterRange properly generates random values between -jitterRange and +jitterRange.

Note: This resolves the past review comment about fixing jitter calculation to match documented ±20% behavior.


291-309: LGTM!

The pull secret validation correctly verifies JSON format and ensures the required auths structure is present and non-empty.


344-351: LGTM!

The isRetryable function appropriately identifies transient gRPC errors that should be retried: Unavailable, DeadlineExceeded, Internal, and ResourceExhausted.

Comment on lines +54 to +56
if pullSecretData == "" {
pullSecretData = defaultPullSecretData
}

⚠️ Potential issue | 🟠 Major

Add logging for the fallback to fake pull secret data.

The silent fallback to defaultPullSecretData when PULL_SECRET_DATA is unset could mask configuration issues in production. While the comment indicates this is for testing, there's no runtime indication that fake credentials are being used.

🔎 Recommended fix

Add a log entry to make the fallback explicit:

 	// Use fake pull secret for testing if PULL_SECRET_DATA is not provided
 	if pullSecretData == "" {
+		log.Println("WARNING: PULL_SECRET_DATA not set, using fake test credentials")
 		pullSecretData = defaultPullSecretData
 	}

Alternatively, consider making this an error in non-dry-run scenarios or add an explicit test mode flag.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-	if pullSecretData == "" {
-		pullSecretData = defaultPullSecretData
-	}
+	if pullSecretData == "" {
+		log.Println("WARNING: PULL_SECRET_DATA not set, using fake test credentials")
+		pullSecretData = defaultPullSecretData
+	}
🤖 Prompt for AI Agents
In cmd/pull-secret/jobs/pull_secret.go around lines 54 to 56, the code silently
falls back to defaultPullSecretData when PULL_SECRET_DATA is empty; add a
runtime log message indicating the fallback to fake/test pull secret data so
operators know fake credentials are being used. Update the surrounding logic to
call the package logger (or the existing logger instance) to emit a
warning/info-level message that includes context (e.g., "PULL_SECRET_DATA not
set, using default test pull secret data"), and consider making this a
higher-severity error or gated behind a test/dry-run flag in non-test
environments if desired.

Contributor

@ciaranRoche ciaranRoche left a comment

My main feedback: rather than a k8s directory, add a charts directory so we follow the same deployment model as the other adapters. Wrapping everything in Helm avoids the deploy scripts.

// createSecret creates a new secret in GCP Secret Manager
func (e PullSecretTask) createSecret(ctx context.Context, client *secretmanager.Client) error {
req := &secretmanagerpb.CreateSecretRequest{
Parent: fmt.Sprintf("projects/%s", e.GCPProjectID),
@tirthct tirthct Jan 7, 2026

I believe projects/%s... is being used as the secret path. Maybe we can create a method/constant to get this path? This way we don't need to search and modify it in multiple places if we decide to update this later

Contributor Author

Makes sense! I’ve refactored the secret path construction to make it cleaner and more consistent.

@ldornele ldornele changed the title Handling the GKE cluster using the pullsecret-adapter service account [HYPERFLEET-273]: Implement MVP pull-secret job for GCP Secret Manager with Helm deployment Jan 8, 2026
@ciaranRoche ciaranRoche merged commit 812a509 into openshift-hyperfleet:main Jan 8, 2026
4 checks passed
3 participants