| 1 | +# Strict Server-Side Field Validation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +By default, when your controller writes an object that contains fields not defined in the CRD schema, the API server: |
| 6 | + |
| 7 | +- Accepts the request |
| 8 | +- Drops the unknown fields |
| 9 | +- At most returns a warning, which the client may log |
| 10 | + |
| 11 | +This can hide bugs and version skew between: |
| 12 | +- The controller code (Go types) and |
| 13 | +- The CRD schema installed in the cluster |
| 14 | + |
| 15 | +`controller-runtime` exposes `client.WithFieldValidation` to turn on strict server-side field validation for all client writes. When enabled, the API server returns a hard error instead of silently dropping unknown fields. |
| 16 | + |
| 17 | +We **do not enable this by default** in scaffolds because it can be too aggressive during upgrades. Instead, we show how to wire it as an opt-in flag. |
| 18 | + |
| 19 | +## What does it solve? |
| 20 | + |
| 21 | +**Silent failure example:** |
| 22 | + |
| 23 | +You add a new field `status.newField` to your controller's Go types, but the CRD installed in the cluster has not been updated yet. The controller calls `client.Status().Patch(...)`. |
| 24 | + |
| 25 | +**Without strict validation:** |
| 26 | +- API server drops `status.newField` silently |
| 27 | +- Controller sees no error |
| 28 | +- Field never appears on the object → confusing debugging |
| 29 | + |
| 30 | +**With strict validation:** |
| 31 | +- API server returns clear error |
| 32 | +- Controller knows CRDs need updating |
| 33 | +- Fails fast instead of silent data loss |
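
The difference between the two modes can be reproduced locally with Go's `encoding/json`, which offers the same choice between dropping and rejecting unknown fields. This is only an analogy, not how the API server implements field validation; `FooStatus`, `decodeLenient`, and `decodeStrict` are hypothetical names used for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// FooStatus is a hypothetical stand-in for the schema the cluster knows about.
// It deliberately lacks the newField that the newer controller tries to write.
type FooStatus struct {
	Ready bool `json:"ready"`
}

// decodeLenient mimics the default behavior: unknown fields are dropped silently.
func decodeLenient(payload []byte) (FooStatus, error) {
	var s FooStatus
	err := json.Unmarshal(payload, &s)
	return s, err
}

// decodeStrict mimics strict validation: unknown fields cause an error.
func decodeStrict(payload []byte) (FooStatus, error) {
	var s FooStatus
	dec := json.NewDecoder(bytes.NewReader(payload))
	dec.DisallowUnknownFields()
	err := dec.Decode(&s)
	return s, err
}

func main() {
	payload := []byte(`{"ready": true, "newField": "value"}`)

	// Lenient: no error, but newField is gone from the decoded object.
	s, err := decodeLenient(payload)
	fmt.Printf("lenient: status=%+v err=%v\n", s, err)

	// Strict: the same payload is rejected with a non-nil error.
	_, err = decodeStrict(payload)
	fmt.Printf("strict: err=%v\n", err)
}
```

In the lenient path the caller has no signal that `newField` was lost, which mirrors the silent failure described above.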
| 34 | + |
| 35 | +## Upgrade scenario example |
| 36 | + |
| 37 | +**Given:** The installed CRD has no `status.newField`, and a new controller version adds it. |
| 38 | + |
| 39 | +**When:** New controller runs against old CRDs: |
| 40 | + |
| 41 | +```go |
| 42 | +if err := r.Status().Patch(ctx, foo, patch); err != nil { |
| 43 | +	// with strict validation, a CRD/controller mismatch surfaces here as a BadRequest error |
| 44 | +} |
| 45 | +``` |
| 46 | + |
| 47 | +**Without strict validation:** |
| 48 | +- API accepts request, drops `status.newField` |
| 49 | +- Controller sees no error |
| 50 | +- Debugging is hard |
| 51 | + |
| 52 | +**With strict validation:** |
| 53 | +- API rejects with 400 BadRequest |
| 54 | +- Controller gets error, logs it |
| 55 | +- Clear signal: CRD–controller mismatch |
| 56 | + |
| 57 | +This catches bugs fast, but it also means that "new controller + old CRDs" produces errors until the CRDs are updated. That's why it's **off by default**. |
| 58 | + |
| 59 | +## How strict field validation works |
| 60 | + |
| 61 | +`controller-runtime` lets you wrap a client: |
| 62 | + |
| 63 | +```go |
| 64 | +strictClient := client.WithFieldValidation( |
| 65 | +	baseClient, |
| 66 | +	metav1.FieldValidationStrict, |
| 67 | +) |
| 68 | +``` |
| 69 | + |
| 70 | +All write operations (Create, Update, Patch) from `strictClient` send `fieldValidation=Strict` to the API server. |
| 71 | + |
| 72 | +The API server: |
| 73 | +- Returns a BadRequest error when the payload has unknown or duplicate fields |
| 74 | +- Does not perform the write |
| 75 | + |
| 76 | +You can still override per call: |
| 77 | + |
| 78 | +```go |
| 79 | +cli.Create(ctx, obj, client.FieldValidation(metav1.FieldValidationWarn)) |
| 80 | +``` |
| 81 | + |
| 82 | + |
| 83 | +## When to use it |
| 84 | + |
| 85 | +**Good cases to turn it on (for dev, CI, or even prod):** |
| 86 | + |
| 87 | +You own both the CRDs and the controllers. Your upgrade process guarantees that CRDs are updated before the controller starts. You want to fail fast when: |
| 88 | +- A controller writes fields not in the schema, or |
| 89 | +- There is a bug in your types/conversions |
| 90 | + |
| 91 | +You mostly use typed schemas, or explicitly mark dynamic blobs with `x-kubernetes-preserve-unknown-fields: true`. |
| 92 | + |
| 93 | + |
| 94 | +## When NOT to use it |
| 95 | + |
| 96 | +Avoid strict validation in production when: |
| 97 | + |
| 98 | +- Controllers and CRDs upgrade independently (common in Helm and OLM deployments) |
| 99 | +- You manage third-party CRDs whose schemas evolve independently |
| 100 | +- Your CRDs use unstructured/dynamic data without `x-kubernetes-preserve-unknown-fields` |
| 101 | +- You need upgrade tolerance when controller and CRD versions are temporarily mismatched |
| 102 | + |
| 103 | +In these scenarios, strict validation causes BadRequest errors during upgrades. That's why it's: |
| 104 | +- **Off by default** in scaffolds |
| 105 | +- **Opt-in via flag** for those who need it |
| 106 | + |
| 107 | +<aside class="warning"> |
| 108 | +<h1>Not included in default scaffold</h1> |
| 109 | + |
| 110 | +This feature is **not scaffolded by default** because it requires careful deployment coordination. |
| 111 | + |
| 112 | +**The problem:** Standard deployment tools (`make deploy`, `helm install`) apply CRDs and the controller together with no ordering guarantees. When strict validation is enabled and the controller starts before the CRDs finish updating, **every write that includes a field missing from the installed schema fails** until the CRDs are updated. |
| 113 | + |
| 114 | +**The solution:** You need external tooling (separate Helm charts, CI/CD pipeline stages, custom scripts) to ensure CRDs are upgraded and established before the controller starts. |
| 115 | + |
| 116 | +</aside> |
| 117 | + |
| 118 | +## Wiring an opt-in flag in cmd/main.go |
| 119 | + |
| 120 | +This feature is **not scaffolded by default**. Follow these steps to add it manually. |
| 121 | + |
| 122 | +### Step 1: Add the strictManager wrapper |
| 123 | + |
| 124 | +In `cmd/main.go`, add this type definition after the `init()` function: |
| 125 | + |
| 126 | +```go |
| 127 | +// strictManager wraps the manager to reject unknown fields instead of silently dropping them. |
| 128 | +// When the controller writes a field that doesn't exist in the CRD, the write fails immediately. |
| 129 | +// This helps catch typos and version mismatches between your code and cluster CRDs. |
| 130 | +type strictManager struct { |
| 131 | +	ctrl.Manager |
| 132 | +	strictClient client.Client |
| 133 | +} |
| 134 | + |
| 135 | +func (m *strictManager) GetClient() client.Client { |
| 136 | +	return m.strictClient |
| 137 | +} |
| 138 | +``` |
| 139 | + |
| 140 | +### Step 2: Add required imports |
| 141 | + |
| 142 | +Add these imports to `cmd/main.go`: |
| 143 | + |
| 144 | +```go |
| 145 | +import ( |
| 146 | +	// ... your existing imports ... |
| 147 | +	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" |
| 148 | +	"sigs.k8s.io/controller-runtime/pkg/client" |
| 149 | +) |
| 150 | +``` |
| 151 | + |
| 152 | +### Step 3: Add the command-line flag |
| 153 | + |
| 154 | +In the `main()` function, where other flags are defined, add: |
| 155 | + |
| 156 | +```go |
| 157 | +func main() { |
| 158 | +	var metricsAddr string |
| 159 | +	var enableLeaderElection bool |
| 160 | +	var probeAddr string |
| 161 | +	var strictFieldValidation bool // Add this |
| 162 | + |
| 163 | +	flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "...") |
| 164 | +	flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "...") |
| 165 | +	flag.BoolVar(&enableLeaderElection, "leader-elect", false, "...") |
| 166 | + |
| 167 | +	// Add this flag |
| 168 | +	flag.BoolVar(&strictFieldValidation, "strict-field-validation", false, |
| 169 | +		"Reject unknown fields instead of dropping them. Useful for dev/CI, NOT recommended for production.") |
| 170 | + |
| 171 | +	// ... rest of your code ... |
| 172 | +} |
| 173 | +``` |
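
To check the flag's behavior in isolation, the following minimal sketch parses it with a dedicated `flag.FlagSet` instead of the global one. `parseStrictFlag` is a hypothetical helper written for this illustration and is not part of the scaffold.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseStrictFlag is a hypothetical helper that parses only the
// --strict-field-validation flag from args and reports its value.
func parseStrictFlag(args []string) (bool, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	strict := fs.Bool("strict-field-validation", false,
		"Reject unknown fields instead of dropping them.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *strict, nil
}

func main() {
	off, _ := parseStrictFlag(nil)
	on, _ := parseStrictFlag([]string{"--strict-field-validation"})
	explicit, _ := parseStrictFlag([]string{"--strict-field-validation=false"})
	fmt.Println(off, on, explicit) // prints "false true false"
}
```

Because the default is `false`, existing deployments keep the lenient behavior unless the flag is passed explicitly in the manager's arguments.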
| 174 | + |
| 175 | +### Step 4: Wrap the manager conditionally |
| 176 | + |
| 177 | +After creating the manager with `ctrl.NewManager()`, add this wrapper logic: |
| 178 | + |
| 179 | +```go |
| 180 | +mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{ |
| 181 | +	Scheme: scheme, |
| 182 | +	// ... your other options ... |
| 183 | +}) |
| 184 | +if err != nil { |
| 185 | +	setupLog.Error(err, "unable to start manager") |
| 186 | +	os.Exit(1) |
| 187 | +} |
| 188 | + |
| 189 | +// Strict field validation: NOT RECOMMENDED for production by default. |
| 190 | +// |
| 191 | +// When enabled, the API server rejects writes with unknown fields instead of silently dropping them. |
| 192 | +// This is useful for catching bugs in development, but causes problems in production when you upgrade |
| 193 | +// the controller before the CRDs: writes that use fields missing from the old CRDs fail until the CRDs are updated. |
| 194 | +// |
| 195 | +// Safe for: development, CI, and production when external tooling ensures CRDs upgrade first. |
| 196 | +// Not safe for: make deploy, helm install, or when you apply everything at once. The scaffolded project |
| 197 | +// has no built-in mechanism to ensure CRDs upgrade before the controller - you need external solutions. |
| 198 | +var finalMgr ctrl.Manager = mgr |
| 199 | +if strictFieldValidation { |
| 200 | +	finalMgr = &strictManager{ |
| 201 | +		Manager: mgr, |
| 202 | +		strictClient: client.WithFieldValidation( |
| 203 | +			mgr.GetClient(), |
| 204 | +			metav1.FieldValidationStrict, |
| 205 | +		), |
| 206 | +	} |
| 207 | +} |
| 208 | + |
| 209 | +// Use finalMgr for all subsequent setup |
| 210 | +if err := (&controller.MyReconciler{ |
| 211 | +	Client: finalMgr.GetClient(), |
| 212 | +	Scheme: finalMgr.GetScheme(), |
| 213 | +}).SetupWithManager(finalMgr); err != nil { |
| 214 | +	setupLog.Error(err, "unable to create controller", "controller", "My") |
| 215 | +	os.Exit(1) |
| 216 | +} |
| 217 | + |
| 218 | +// Continue using finalMgr for health checks, starting manager, etc. |
| 219 | +if err := finalMgr.AddHealthzCheck("healthz", healthz.Ping); err != nil { |
| 220 | +	setupLog.Error(err, "unable to set up health check") |
| 221 | +	os.Exit(1) |
| 222 | +} |
| 223 | + |
| 224 | +if err := finalMgr.Start(ctrl.SetupSignalHandler()); err != nil { |
| 225 | +	setupLog.Error(err, "problem running manager") |
| 226 | +	os.Exit(1) |
| 227 | +} |
| 228 | +``` |
| 229 | + |
| 230 | +<aside class="note"> |
| 231 | +<h1>Important: Use finalMgr everywhere</h1> |
| 232 | + |
| 233 | +After wrapping the manager, use `finalMgr` instead of `mgr` for: |
| 234 | +- Controller setup |
| 235 | +- Webhook setup |
| 236 | +- Health checks |
| 237 | +- Starting the manager |
| 238 | + |
| 239 | +This ensures all components use the wrapped client with strict validation. |
| 240 | + |
| 241 | +</aside> |