Commit fabb0f9

tanmayv25 authored and dagil-nvidia committed
fix: Increase the failure threshold for k8s dsr1 trtllm wideep deploy.yaml (#4557)
Signed-off-by: Dan Gil <[email protected]>
1 parent 45d101d commit fabb0f9

File tree

1 file changed: +2 -2 lines changed
  • recipes/deepseek-r1/trtllm/disagg/wide_ep/gb200

recipes/deepseek-r1/trtllm/disagg/wide_ep/gb200/deploy.yaml

Lines changed: 2 additions & 2 deletions
@@ -172,7 +172,7 @@ spec:
   initialDelaySeconds: 30
   periodSeconds: 10
   timeoutSeconds: 5
-  failureThreshold: 500
+  failureThreshold: 600
 volumeMounts:
   - name: prefill-config-volume
     mountPath: /config
@@ -230,7 +230,7 @@ spec:
   initialDelaySeconds: 30
   periodSeconds: 10
   timeoutSeconds: 5
-  failureThreshold: 500
+  failureThreshold: 600
 volumeMounts:
   - name: decode-config-volume
     mountPath: /config
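
For a rough sense of what the new threshold buys, the sketch below estimates how long a probe keeps retrying before Kubernetes acts on the failure. It assumes the probe is the one gating worker startup (the probe type is not visible in these hunks) and uses the usual approximation of initial delay plus one period per allowed failure. The helper name `probe_budget_seconds` is hypothetical and not part of the recipe.

```python
def probe_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate seconds before the probe exhausts its allowed failures.

    Assumption: every probe attempt fails and attempts are spaced one
    period apart after the initial delay. Per-attempt timeouts are ignored.
    """
    return initial_delay + period * failure_threshold


# Values taken from the diff: initialDelaySeconds=30, periodSeconds=10.
old_budget = probe_budget_seconds(30, 10, 500)
new_budget = probe_budget_seconds(30, 10, 600)

print(f"old: {old_budget} s, new: {new_budget} s, extra: {new_budget - old_budget} s")
```

Under these assumptions, the change gives both the prefill and decode workers roughly 1000 more seconds (about 17 minutes) to become ready before the probe gives up.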
