Description:
If a node is manually removed from a node group, the Magnum cluster-autoscaler can remove all nodes from that node group before it removes the correct node.
Version:
Cluster Autoscaler: v1.27.5, v1.29.5
Cloud Provider: Magnum
Current Behavior:
The Magnum cluster autoscaler fails to retrieve the ID of a manually removed node during the resize performed on cleanup.
Expected Behavior:
The Magnum autoscaler should be able to retrieve the ID of a manually deleted node, which in this case is a fake node with the prefix openstack:/// rather than fake:///.
https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-release-1.27/cluster-autoscaler/cloudprovider/magnum/magnum_manager_impl.go#L379-L386
Reproduction Steps:
Create a Magnum cluster (Kubernetes 1.27 or 1.29) with the cluster autoscaler (v1.27.5 or v1.29.5) running at log level 5.
Add a node group to it.
Add a workload that scales the node group to, say, 4 nodes.
Manually remove the first node in the node group (with openstack server delete ...).
The autoscaler will start cleaning up random nodes by resizing the node group without the IDs of the nodes to remove,
which, in a pessimistic case like the one above, may result in all nodes in the node group being removed.
Describe the solution you'd like:
Add support for fake nodes with the prefix openstack:/// here:
https://github.com/kubernetes/autoscaler/blob/cluster-autoscaler-release-1.27/cluster-autoscaler/cloudprovider/magnum/magnum_manager_impl.go#L379-L386
e.g. like this
where parseFakeProviderIDDeletedNode is an additional util function like this
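As an illustration only, here is a minimal, standalone sketch of what a helper named parseFakeProviderIDDeletedNode could look like. It is not the code from the proposal and not existing autoscaler code. It assumes that the provider ID of a manually deleted node has the form openstack:///<server ID> and that everything after the prefix is the ID to pass to the resize call; the constant name, signature, and error messages are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// deletedNodePrefix is the prefix the issue reports on manually deleted
// nodes. The constant name is hypothetical.
const deletedNodePrefix = "openstack:///"

// parseFakeProviderIDDeletedNode extracts the part of the provider ID that
// follows the openstack:/// prefix.
// Assumption: that remainder is the server ID of the deleted node.
func parseFakeProviderIDDeletedNode(providerID string) (string, error) {
	if !strings.HasPrefix(providerID, deletedNodePrefix) {
		return "", fmt.Errorf("provider ID %q does not start with %q", providerID, deletedNodePrefix)
	}
	id := strings.TrimPrefix(providerID, deletedNodePrefix)
	if id == "" {
		return "", fmt.Errorf("provider ID %q has no ID after the prefix", providerID)
	}
	return id, nil
}

func main() {
	// The ID below is a made-up placeholder, not a real server ID.
	id, err := parseFakeProviderIDDeletedNode("openstack:///example-server-id")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("node to remove:", id)
}
```

With a helper along these lines, the delete path could fall back to it whenever a fake node's name does not carry the fake:/// prefix, so that the resize request still names the node to remove.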
Additional context:
After looking at the code in the main branch, I see that this case will also occur there.