Search before asking
I searched the issues and found no similar issues.
KubeRay Component
ray-operator
What happened + What you expected to happen
When I set workerGroupSpecs.numOfHosts to a value greater than 1, such as 2, the RayCluster never transitions Status.State to Ready, which as a side effect blocks RayJob from submitting the job.
kuberay/ray-operator/controllers/ray/rayjob_controller.go
Lines 193 to 196 in 5a02146
logger.Info("Wait for the RayCluster.Status.State to be ready before submitting the job.", "RayCluster", rayClusterInstance.Name, "State", rayClusterInstance.Status.State) //nolint:staticcheck // https://github.com/ray-project/kuberay/pull/2288
It seems related to the DesiredWorkerReplicas calculation, which only accounts for the single-host case.
kuberay/ray-operator/controllers/ray/raycluster_controller.go
Lines 1299 to 1304 in 5a02146
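To illustrate the suspected mismatch, here is a minimal sketch, not the operator's actual code. WorkerGroup, desiredReplicas, and desiredPods are hypothetical names, and the sketch assumes readiness is decided by comparing a count of running worker Pods against a desired count. The values (3 replicas, numOfHosts 2) are assumed from the description and the log, not taken from the manifest.

```go
package main

import "fmt"

// WorkerGroup is a hypothetical, trimmed-down stand-in for one entry of
// workerGroupSpecs. It is not the KubeRay API type.
type WorkerGroup struct {
	Replicas   int32 // desired replicas for the group
	NumOfHosts int32 // Pods per replica; treated as 1 when unset or below 1
}

// desiredReplicas mirrors the single-host assumption: one Pod per replica.
func desiredReplicas(groups []WorkerGroup) int32 {
	var total int32
	for _, g := range groups {
		total += g.Replicas
	}
	return total
}

// desiredPods scales each group's replicas by NumOfHosts, so the result
// is comparable with a count of worker Pods.
func desiredPods(groups []WorkerGroup) int32 {
	var total int32
	for _, g := range groups {
		hosts := g.NumOfHosts
		if hosts < 1 {
			hosts = 1
		}
		total += g.Replicas * hosts
	}
	return total
}

func main() {
	// Assumed values: 3 replicas and numOfHosts 2, i.e. 6 worker Pods.
	groups := []WorkerGroup{{Replicas: 3, NumOfHosts: 2}}
	runningPods := int32(6)

	fmt.Println(desiredReplicas(groups) == runningPods) // false: 3 != 6
	fmt.Println(desiredPods(groups) == runningPods)     // true: 6 == 6
}
```

With these assumed values, a Pod count of 6 can never equal a replica-based desired count of 3, which would be consistent with the cluster never being marked Ready.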
Related logs in kuberay operator:
{"level":"info","ts":"2025-04-07T05:28:27.310Z","logger":"controllers.RayCluster","msg":"inconsistentRayClusterStatus","RayCluster":{"name":"henry-ray-plan-off-raycluster-d9ht7","namespace":"ray"},"reconcileID":"65a566fd-4f94-4492-b8f6-eacfcdf3b2cd","oldReadyWorkerReplicas":4,"newReadyWorkerReplicas":5,"oldAvailableWorkerReplicas":5,"newAvailableWorkerReplicas":6,"oldDesiredWorkerReplicas":3,"newDesiredWorkerReplicas":3,"oldMinWorkerReplicas":1,"newMinWorkerReplicas":1,"oldMaxWorkerReplicas":6,"newMaxWorkerReplicas":6}
Reproduction script
part of workerGroupSpecs
Anything else
No response
Are you willing to submit a PR?