High Availability

Configure and manage Proxmox VE HA: groups, resource policies, and failover behavior.

Proxmox VE HA automatically restarts VMs on a healthy node when a node failure is detected. Cloud-PVE pre-configures the HA stack (Corosync, fencing, watchdog) for you.

How HA works

1. Corosync monitors cluster heartbeats between nodes.
2. If a node misses heartbeats beyond the timeout, it is declared offline.
3. Fencing (STONITH) isolates the failed node (power-off via IPMI/iDRAC) to prevent split-brain.
4. HA Manager restarts the VMs that were running on the failed node on surviving nodes.

The entire process takes 20–60 seconds depending on your watchdog and fencing configuration.

Enabling HA for a VM

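1. Go to Datacenter → HA → Resources.
2. Click Add.
3. Select the VM and set:
   - Max Restart: maximum number of attempts to restart the VM on its current node after a failed start (default: 1)
   - Max Relocate: maximum number of attempts to relocate the VM to another node once the restart attempts on the current node are exhausted (default: 1)
   - Group: assign to an HA group (optional)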
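
The same settings can probably be applied from a shell with `ha-manager add`. The following is a minimal sketch, not a definitive procedure: the helper name `ha_add_cmd` and the VM ID `100` are hypothetical, and the helper only prints the command so it can be reviewed before being run as root on a cluster node. Check `ha-manager help add` on your version for the exact option names.

```shell
#!/bin/sh
# Build (but do not run) an `ha-manager add` command for one VM.
# Usage: ha_add_cmd VMID [MAX_RESTART] [MAX_RELOCATE] [GROUP]
# ha_add_cmd is a hypothetical helper name, not part of Proxmox VE.
ha_add_cmd() {
    vmid=$1
    max_restart=${2:-1}    # default of 1, as listed above
    max_relocate=${3:-1}   # default of 1, as listed above
    group=${4:-}           # optional HA group
    cmd="ha-manager add vm:${vmid} --state started --max_restart ${max_restart} --max_relocate ${max_relocate}"
    if [ -n "$group" ]; then
        cmd="${cmd} --group ${group}"
    fi
    printf '%s\n' "$cmd"
}

# Example: VM 100 (hypothetical ID), two restart attempts, group "production"
ha_add_cmd 100 2 1 production
```

Printing instead of executing keeps the sketch safe to try on any machine; pipe the output to `sh` on a cluster node only after checking it.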

HA Groups

HA groups define node preferences for VM placement. Go to Datacenter → HA → Groups:

Group: production
Nodes: node1:3, node2:2, node3:1

Higher priority numbers mean the node is preferred. VMs in this group will prefer node1, fall back to node2, then node3.
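
To make the priority rule concrete, here is a small sketch that picks the highest-priority node that is still online from a node list like the one above. It only illustrates the "higher number wins" rule; it does not reproduce the HA Manager's real scheduler, and the function name `preferred_node` is hypothetical.

```shell
#!/bin/sh
# Print the online node with the highest priority from an HA group node list.
# $1: node list, e.g. "node1:3,node2:2,node3:1"
# $2: space-separated list of nodes that are currently online
# preferred_node is a hypothetical helper, not a Proxmox VE command.
preferred_node() {
    spec=$(printf '%s' "$1" | tr -d ' ')   # tolerate spaces after commas
    online=" $2 "
    best=""
    best_prio=-1
    old_ifs=$IFS
    IFS=,
    for entry in $spec; do
        node=${entry%%:*}
        prio=${entry#*:}
        # An entry without ":N" is treated as priority 0 in this sketch.
        if [ "$prio" = "$entry" ]; then
            prio=0
        fi
        case $online in
            *" $node "*)
                if [ "$prio" -gt "$best_prio" ]; then
                    best=$node
                    best_prio=$prio
                fi
                ;;
        esac
    done
    IFS=$old_ifs
    printf '%s\n' "$best"
}

# All nodes up: node1 is chosen. With node1 down: node2 is chosen.
preferred_node "node1:3,node2:2,node3:1" "node1 node2 node3"
preferred_node "node1:3,node2:2,node3:1" "node2 node3"
```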

Resource states

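State	Meaning
`started`	VM should be running; HA starts it and recovers it on another node after a failure
`stopped`	VM should be stopped; HA keeps it stopped but still relocates it if its node fails
`disabled`	VM should be stopped; HA does not relocate it on node failure (mainly used for error recovery)
`ignored`	HA does not touch this VM; actions on it bypass the HA stack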
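
A requested state can likely be changed from the shell with `ha-manager set`. The sketch below validates the state name against the table above and prints the resulting command rather than running it. The helper name `ha_set_state_cmd` and the VM ID `100` are hypothetical.

```shell
#!/bin/sh
# Print the command that would change a VM's requested HA state.
# Usage: ha_set_state_cmd VMID STATE
# ha_set_state_cmd is a hypothetical helper, not part of Proxmox VE.
ha_set_state_cmd() {
    vmid=$1
    state=$2
    case $state in
        started|stopped|disabled|ignored)
            printf 'ha-manager set vm:%s --state %s\n' "$vmid" "$state"
            ;;
        *)
            # Reject anything that is not one of the four states in the table.
            printf 'unknown HA state: %s\n' "$state" >&2
            return 1
            ;;
    esac
}

# Example: take VM 100 (hypothetical ID) out of service for maintenance
ha_set_state_cmd 100 disabled
```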

Testing failover

To test HA without real hardware failure:

# On the node to test (run as root)
systemctl stop pve-cluster corosync

Watch the Datacenter → HA view. Within ~30 seconds, your VMs should appear on another node. The test node itself is treated as failed, so expect it to be fenced.

Important: Only simulate failure on one node at a time. With a 3-node cluster, losing 2 nodes simultaneously breaks quorum.
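
The quorum rule behind this warning is simple majority arithmetic. The sketch below assumes one vote per node and no external quorum device; the function names are illustrative.

```shell
#!/bin/sh
# Votes needed for quorum in an N-node cluster (one vote per node):
# a strict majority, floor(N / 2) + 1.
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can fail while the remaining nodes still hold quorum.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "nodes=$n quorum=$(quorum_votes "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For a 3-node cluster this gives a quorum of 2 and one tolerated failure, which is why a second simultaneous node loss breaks quorum. A fourth node raises the quorum to 3 without allowing a second failure.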

Monitoring HA

Check the HA status:

ha-manager status

View HA logs:

journalctl -u pve-ha-lrm -n 50
journalctl -u pve-ha-crm -n 50
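
For a quick health check, the status output can be filtered for services that are not running. This is a sketch under an assumption: it expects service lines of the form `service vm:100 (node1, started)`, so confirm the format on your version before relying on it. The function name `ha_not_started`, the VM IDs, and the node names in the sample input are hypothetical.

```shell
#!/bin/sh
# Read `ha-manager status` output on stdin and print every service whose
# state is not "started", one per line, as "<sid> <state>".
# Assumes service lines look like: service vm:100 (node1, started)
ha_not_started() {
    awk '$1 == "service" {
        state = $NF
        gsub(/[()]/, "", state)   # strip the closing parenthesis
        if (state != "started") print $2, state
    }'
}

# Example with canned input instead of a live cluster:
ha_not_started <<'EOF'
quorum OK
master node1 (active, Wed Nov 23 11:07:23 2016)
lrm node1 (active, Wed Nov 23 11:07:19 2016)
service vm:100 (node1, started)
service vm:101 (node2, error)
EOF
```

On a cluster node, the equivalent call would be `ha-manager status | ha_not_started`; empty output means every HA-managed service reports `started`.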
