Everyday Virtualization

Blog archive

vSphere Alert for Failed Migration

Nothing is more frustrating than determining that a virtual machine cannot be migrated when you need to migrate a number of guests, such as when you're putting a host in maintenance mode. If DRS is in use on a vSphere cluster at any level of the fully automated configuration, vCenter can generate recommendations and issue the vMotion instruction to the cluster.

Cluster-issued vMotion commands are easy to catch, as they are logged as being initiated by "system" instead of the person's username who may have sent an individual VM to be migrated. Depending on the migration threshold configuration (the sliding bar) for the cluster, this may be a frequent or a rare occurrence to have the cluster issue the command to migrate a guest.

While that is a fine set of functionality -- and I've used it a lot over the years -- vMotion occasionally will stop working on a host. Any number of issues can cause the stoppage, such as networking connectivity gaps, time offset or other factors. Usually the fix is a quick reset of the Advanced | Migrate | Migrate.Enabled value from 1 to 0 and then reset it back to 1. VMware KB 1013150 explains how to reset this on an ESX(i) host.

If this situation or some other obstacle to migrate a host arises, the vCenter Server will continually try to migrate the virtual machine and log its failure as often as the DRS refresh interval is configured. The solution to get a heads up on a potential issue with an ESX(i) host is to configure the migration error alarm (see Fig. 1) to send an e-mail, trap or page of the event (see Fig. 1).

Alarms are going off

Figure 1. The conditions for the migration failure can be defined, customized and set for actionable alerts. (Click image to view larger version.)

This little step along with corrective action can help keep the DRS algorithm in check for a cluster, so that the host workload stays balanced as intended by the cluster design.

This task is disabled by default, yet logged in the recent tasks of the vSphere Client as well as in the database; but it is easy to miss.

Have you utilized this alarm? How so, share your comments here.

Posted by Rick Vanover on 12/14/2010 at 12:48 PM


Subscribe on YouTube