content/en/docs/concepts/workloads/autoscaling/vertical-pod-autoscale.md
Lines changed: 12 additions & 7 deletions
Original file line number
Diff line number
Diff line change
@@ -91,7 +91,7 @@ Figure 1. VerticalPodAutoscaler controls the resource requests and limits of Pod
91
91
92
92
Kubernetes implements vertical pod autoscaling through multiple cooperating components that run intermittently (it is not a continuous process). The VPA consists of three main components:
93
93
94
-
* The _recommender*, which analyzes resource usage and provides recommendations.
94
+
* The _recommender_, which analyzes resource usage and provides recommendations.
95
95
* The _updater_, which updates Pod resource requests either by evicting Pods or by modifying them in place.
96
96
* And the VPA _admission controller_ webhook, which applies resource recommendations to new or recreated Pods.
97
97
@@ -100,7 +100,6 @@ Once during each period, the Recommender queries the resource utilization for Po
100
100
The Recommender analyzes both current and historical resource usage data (CPU and memory) for each Pod targeted by the VerticalPodAutoscaler. It examines:
101
101
- Historical consumption patterns over time to identify trends
102
102
- Peak usage and variance to ensure sufficient headroom
103
-
- Current resource requests compared to actual usage
104
103
- Out-of-memory (OOM) events and other resource-related incidents
105
104
106
105
Based on this analysis, the Recommender calculates three types of recommendations:
@@ -119,9 +118,7 @@ The chosen method depends on the configured update mode, cluster capabilities, a
119
118
120
119
The _admission controller_ operates as a mutating webhook that intercepts Pod creation requests. It
121
120
checks if the Pod is targeted by a VerticalPodAutoscaler and, if so, applies the recommended
122
-
resource requests and limits before the Pod is created. This ensures new Pods start with
123
-
appropriately sized resource allocations, whether they're created during initial deployment,
124
-
after an eviction by the updater, or due to scaling operations.
121
+
resource requests and limits before the Pod is created. More specifically, the admission controller uses the Target recommendation in the VerticalPodAutoscaler resource's `.status.recommendation` stanza as the new resource requests. The admission controller ensures new Pods start with appropriately sized resource allocations, whether they're created during initial deployment, after an eviction by the updater, or due to scaling operations.
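
As a rough illustration of the step described above, the following Python sketch copies the Target recommendation into a Pod's resource requests. It is not the admission controller's actual code. The function name `apply_target` is hypothetical, the Pod and VerticalPodAutoscaler objects are plain dictionaries, and the field names under `.status.recommendation` (`containerRecommendations`, `containerName`, `target`) are assumed to follow the VPA API.

```python
from copy import deepcopy


def apply_target(pod, vpa):
    """Return a copy of `pod` whose container requests are set to the VPA Target.

    Hypothetical helper: containers without a matching recommendation are
    left untouched, and limits are not modified here.
    """
    recommendation = vpa.get("status", {}).get("recommendation", {})
    # Index the Target values by container name (field names assumed from the VPA API).
    targets = {
        rec["containerName"]: rec["target"]
        for rec in recommendation.get("containerRecommendations", [])
    }

    patched = deepcopy(pod)  # do not mutate the incoming Pod object
    for container in patched["spec"]["containers"]:
        target = targets.get(container["name"])
        if target is None:
            continue
        requests = container.setdefault("resources", {}).setdefault("requests", {})
        requests.update(target)
    return patched
```

The sketch returns a patched copy rather than mutating its input, which loosely mirrors how a mutating webhook responds with a patch instead of editing the submitted object.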
125
122
126
123
The VerticalPodAutoscaler requires a metrics source, such as Kubernetes' Metrics Server {{< glossary_tooltip text="add-on" term_id="addons" >}},
127
124
to be installed in the cluster.
@@ -158,7 +155,7 @@ You can use a tool such as `kubectl` to view the `.status` and the recommendatio
158
155
159
156
### Initial {#updateMode-Initial}
160
157
161
-
In _Initial_ mode, VPA only sets resource requests when Pods are first created. It does not update resources for already running Pods, even if recommendations change over time.
158
+
In _Initial_ mode, VPA only sets resource requests when Pods are first created. It does not update resources for already running Pods, even if recommendations change over time. The recommendations apply only during Pod creation.
162
159
163
160
### Recreate {#updateMode-Recreate}
164
161
@@ -172,6 +169,8 @@ controller applies the updated resource requests to the new Pod.
172
169
In `InPlaceOrRecreate` mode, VPA attempts to update Pod resource requests and limits without restarting the Pod when possible. However, if in-place updates cannot be performed for a particular resource change, VPA falls back to evicting the Pod
173
170
(similar to `Recreate` mode) and allowing the workload controller to create a replacement Pod with updated resources.
174
171
172
+
In this mode, the updater applies recommendations in-place using the [Resize Container Resources In-Place](/docs/tasks/configure-pod-container/resize-container-resources/) feature.
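
The fallback behavior of `InPlaceOrRecreate` can be pictured with a small Python sketch. This is only an assumption-laden outline of the decision, not the updater's implementation: `update_pod`, `resize_in_place`, and `evict` are hypothetical names, and the two callbacks stand in for whatever API calls the updater really makes.

```python
def update_pod(pod_name, resize_in_place, evict):
    """Try an in-place resize first; evict the Pod only if that is not possible.

    `resize_in_place` is a hypothetical callback that returns True when the
    resize was applied, and `evict` is a hypothetical callback that evicts
    the Pod so that its workload controller can create a replacement.
    """
    if resize_in_place(pod_name):
        return "resized"
    evict(pod_name)  # fall back to the same path as Recreate mode
    return "evicted"
```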
173
+
175
174
### Auto (deprecated) {#updateMode-Auto}
176
175
177
176
{{< note >}}
@@ -231,13 +230,19 @@ Valid resource names include `cpu` and `memory`.
231
230
The `controlledValues` field determines whether VPA controls resource requests, limits, or both:
232
231
233
232
RequestsAndLimits
234
-
: VPA sets both requests and limits. The limit is scaled proportionally to the request. This is the default mode.
233
+
: VPA sets both requests and limits. The limit scales proportionally to the request based on the request-to-limit ratio defined in the Pod spec. This is the default mode.
235
234
236
235
RequestsOnly
237
236
: VPA only sets requests, leaving limits unchanged. Limits are respected and can still trigger throttling or out-of-memory kills if usage exceeds them.
238
237
239
238
See [requests and limits](/docs/concepts/configuration/manage-resources-containers/#requests-and-limits) to learn more about those two concepts.
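
To make the proportional scaling in `RequestsAndLimits` mode concrete, here is a minimal Python sketch of the arithmetic. It assumes CPU values expressed as integer millicores and rounds down, and the function name `scale_limit` is hypothetical. The rounding VPA itself applies may differ.

```python
def scale_limit(original_request_m, original_limit_m, new_request_m):
    """Scale a limit so that it keeps the request-to-limit ratio from the Pod spec.

    All values are integer millicores (an assumption made for this sketch).
    """
    if original_request_m <= 0:
        raise ValueError("the original request must be positive to define a ratio")
    # Integer arithmetic avoids floating-point error; the result is rounded down.
    return new_request_m * original_limit_m // original_request_m


# A container with a 100m request and a 200m limit has a 1:2 ratio.
# If the new request is 250m, the limit becomes 500m.
print(scale_limit(100, 200, 250))  # 500
```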
240
239
240
+
## LimitRange resources
241
+
242
+
The admission controller and updater VPA components post-process recommendations to comply with the constraints defined in [LimitRanges](/docs/concepts/policy/limit-range/). Both components check the LimitRange resources of `type` `Pod` and `Container` in the Kubernetes cluster.
243
+
244
+
For example, if the `max` field in a Container LimitRange resource is exceeded, both VPA components lower the limit to the value defined in the `max` field, and the request is proportionally decreased to maintain the request-to-limit ratio in the Pod spec.
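
The capping described in this example can be sketched in Python as follows. This is an illustrative calculation under stated assumptions, not VPA's code: values are integer millicores, the result is rounded down, and `cap_to_limit_range_max` is a hypothetical name.

```python
def cap_to_limit_range_max(request_m, limit_m, max_limit_m):
    """Lower a limit to the LimitRange `max` and shrink the request to match.

    All values are integer millicores (an assumption made for this sketch).
    Returns a (request, limit) tuple.
    """
    if limit_m <= max_limit_m:
        return request_m, limit_m  # already within the LimitRange
    # Keep the request-to-limit ratio: request / limit == capped / max.
    capped_request_m = request_m * max_limit_m // limit_m
    return capped_request_m, max_limit_m


# A recommendation of 500m request and 1000m limit against a `max` of 800m:
print(cap_to_limit_range_max(500, 1000, 800))  # (400, 800)
```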
245
+
241
246
## {{% heading "whatsnext" %}}
242
247
243
248
If you configure autoscaling in your cluster, you may also want to consider using