ACS is not able to restart VM during HA process #10435

Open
GerorgeEG opened this issue Feb 20, 2025 · 2 comments

@GerorgeEG

problem

We are testing HA by shutting down the KVM hypervisor through the BMC. The host status changed to Down and ACS tried to start the VM on another host, but the start failed.

versions

ACS Version : 4.19.1.2

KVM : RHEL 8

Storage : NFS v3

The steps to reproduce the bug

  1. Shut down the KVM hypervisor through the BMC. Both the host and the VM are HA enabled.
  2. Wait for the host status to change to Down.
  3. ACS tries to start the VM, but the start fails.

Below is the log

2025-02-18 02:33:48,554 DEBUG [c.c.c.CapacityManagerImpl] (Work-Job-Executor-3:ctx-dae42fc9 job-17413/job-17446 ctx-06b6ee00) (logid:6675dfa0) VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"} state transited from [Starting] to [Stopped] with event [OperationFailed]. VM's original host: Host {"id":85,"name":" host1.xx.xxx.xxx ","type":"Routing","uuid":"804bcf95-e073-462e-810a-aa64e85c78bd"}, new host: null, host before state transition: Host {"id":127,"name":"host2.xx.xxx.xxx","type":"Routing","uuid":"a9698e0c-9c63-4392-ae28-b7dbdceffd9d"}

2025-02-18 02:33:48,580 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-3:ctx-dae42fc9 job-17413/job-17446 ctx-06b6ee00) (logid:6675dfa0) Invocation exception, caused by: com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"}Scope=interface com.cloud.dc.DataCenter; id=1

2025-02-18 02:33:48,580 INFO [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-3:ctx-dae42fc9 job-17413/job-17446 ctx-06b6ee00) (logid:6675dfa0) Rethrow exception com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"}Scope=interface com.cloud.dc.DataCenter; id=1

com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"}Scope=interface com.cloud.dc.DataCenter; id=1

2025-02-18 02:33:48,639 WARN [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-10bdb53f work-1129) (logid:2a0083a1) Unable to restart VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"} due to Unable to create a deployment for VM instance {"id":541,"instanceName":"i-19-541-VM","type":"User","uuid":"130c856a-d5e4-4745-9a6a-c41c2508573a"}

What to do about it?

We need HA to restart the VM whenever a KVM host goes down, whatever the cause.


boring-cyborg bot commented Feb 20, 2025

Thanks for opening your first issue here! Be sure to follow the issue template!

@shwstppr
Contributor

@GerorgeEG you seem to be getting an InsufficientServerCapacityException when the VM is started on a different host. Do you have free compute capacity (in the same cluster, if you are using cluster-scoped storage)?
Check whether the issue is related to the host/storage tags in use.
You may also need to check the logs around this InsufficientServerCapacityException.
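
One way to pull out the log lines around the exception is to follow the `(logid:...)` token that each line in the log above carries. The sketch below is only an illustration, not part of CloudStack: the function name `lines_around_exception` and the `before` window size are made up here, and it assumes every relevant line carries a `(logid:...)` token in the same form as the excerpt in this issue.

```python
import re
from collections import deque

# Matches the "(logid:6675dfa0)" token seen in the log excerpt above.
LOGID_RE = re.compile(r"\(logid:([0-9a-f]+)\)")
EXCEPTION = "InsufficientServerCapacityException"


def lines_around_exception(lines, before=200):
    """Return the lines leading up to the first logged occurrence of the exception.

    Only lines that share the logid of the exception line are kept, so output
    from unrelated worker threads is filtered out. `before` (an arbitrary
    default) limits how many preceding lines are remembered.
    """
    window = deque(maxlen=before)
    for line in lines:
        if EXCEPTION in line:
            match = LOGID_RE.search(line)
            if match is not None:
                token = f"(logid:{match.group(1)})"
                related = [prev for prev in window if token in prev]
                return related + [line]
        # Lines without a logid (for example stack trace lines) are remembered
        # too, but they are dropped later by the logid filter.
        window.append(line)
    return []
```

To use it, open the management server log (the path depends on the installation) and pass the file object, for example `with open(path) as f: print("".join(lines_around_exception(f)))`. The deployment planner messages logged under the same logid just before the exception are usually where the reason for skipping each host or storage pool shows up.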
