Reduce HA Overhead with Custom Admission Control
VMware's HA feature is good enough for most availability solutions, but it has caused me more grief than benefit over the years. The VI3 era product had some issues that caused virtual machines to reboot, and even on vSphere we still find ourselves reconfiguring a host for HA when minor errors occur.
All of that I can live with, but what has really burned my stomach over the years is the invisible inventory that is reserved with admission control. Admission control reserves overhead, which I refer to as invisible inventory, to provide the compute and memory capacity needed to pick up the workload of virtual machines that are disconnected from a host.
Admission control is a complicated beast with the default settings being 'good enough' for most installations. If you have a small cluster, you may find yourself running up into brick walls, where admission control prohibits you from powering on additional virtual machines.
In those situations, it is time to consider two key configuration points. The first is to set HA rules that specify which virtual machines are to be powered on in an HA event. A clear example is configuring development systems so that they do not power on during an HA event.
The other strategy is to set custom admission control values on the cluster's HA settings (see Fig. 1).
Figure 1. Customizing the admission control value can allow you to reserve a designated amount of capacity in the HA cluster.
You have two options for customizing HA admission control: reserving a percentage of cluster resources or specifying a failover host.
In the case of a smaller cluster, you may find yourself wanting to go the percentage route. A single host may represent 50 percent, 33 percent or 25 percent of the total available compute and memory capacity in a two-, three- or four-node cluster, respectively. Reserving a single host in the invisible inventory associated with admission control is a significant hit to the total resources without a break in your licensing costs. If individual virtual machines are assigned an HA action and a percentage of cluster resources is set for HA admission control, you can configure the reserved inventory to be more in line with your requirements.
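To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes every host in the cluster is identical, and the function names are made up for illustration; this is not how HA itself calculates its reservation, only a way to see the size of the hit.

```python
def single_host_share(num_hosts: int) -> float:
    """Percent of total cluster capacity that one host represents.

    Assumption: all hosts are identically configured.
    """
    if num_hosts < 1:
        raise ValueError("a cluster needs at least one host")
    return 100.0 / num_hosts


def usable_capacity(total: float, reserved_percent: float) -> float:
    """Capacity left for running VMs after reserving a percentage for failover."""
    if not 0 <= reserved_percent <= 100:
        raise ValueError("reserved_percent must be between 0 and 100")
    return total * (100.0 - reserved_percent) / 100.0


if __name__ == "__main__":
    # One host's share of a two-, three- and four-node cluster.
    for hosts in (2, 3, 4):
        share = single_host_share(hosts)
        print(f"{hosts}-node cluster: one host is {share:.0f}% of capacity")
```

With a hypothetical 96 GB of total cluster memory, `usable_capacity(96, 25)` leaves 72 GB for running virtual machines, while a smaller custom reservation such as `usable_capacity(96, 15)` leaves more. Whether a smaller reservation is safe depends on which virtual machines actually need to restart in an HA event.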
Likewise, if you have a larger cluster that may include servers of mixed configuration (faster processors or more RAM), you may want to designate a specific host to act as the failover node.
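The same back-of-the-envelope math helps when picking a failover host in a mixed cluster. The sketch below uses hypothetical host names and memory sizes, and it only compares how much capacity each choice takes out of play; it does not model how HA places virtual machines.

```python
# Hypothetical mixed cluster: host name -> memory in GB.
hosts_gb = {"esx-a": 64, "esx-b": 64, "esx-c": 128}


def capacity_without(hosts: dict, failover_host: str) -> int:
    """Memory left for workloads when one host is set aside as the failover node."""
    if failover_host not in hosts:
        raise KeyError(f"unknown host: {failover_host}")
    return sum(mem for name, mem in hosts.items() if name != failover_host)


def failover_share(hosts: dict, failover_host: str) -> float:
    """Percent of total cluster memory held back by the chosen failover host."""
    return 100.0 * hosts[failover_host] / sum(hosts.values())


if __name__ == "__main__":
    for name in hosts_gb:
        left = capacity_without(hosts_gb, name)
        share = failover_share(hosts_gb, name)
        print(f"failover host {name}: {left} GB usable, {share:.0f}% reserved")
```

In this made-up cluster, setting aside the largest host holds back half of the memory, while setting aside a smaller one holds back a quarter. The trade-off is that a smaller failover host can only absorb as much workload as fits on it.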
The default option of reserving enough capacity for one host failure can be overly cautious for many environments. If admission control errors start to show up, reconfiguring HA in one of these two ways, coupled with individual virtual machine response configurations, will address these limits.
Posted by Rick Vanover on 06/03/2010 at 12:47 PM