The Cranky Admin

vSphere 6.5: A Real-World Review

A new admin client headlines the changes, most of which are positive.

Also See: vSAN 6.5: A Real-World Review

vSphere 6.5 is out, and it marks the beginning of the end of the Adobe Flash-based Flex client! There are other features and enhancements offered up as part of vSphere 6.5, but the new HTML5 client is far and away the show-stealing feature of this release.

A wonderful idea in theory, and useful for smaller deployments in practice, the nearly universally-despised Flex client proved awful in large-scale deployments. This created a rift between certain customers and VMware that has lasted years. I was personally impressed with the Flex client when I first saw it. The lack of a C# client for OS X or Linux had proven frustrating to me many times, so a Web-based client was a welcome move by VMware.

Of course, shortly after the Flex client made its debut, the whole of the IT industry mounted a concerted effort to purge anything related to Adobe Flash. A terrible security threat, Flash was quarantined, if not outright banished, by Web browsers. Flash was sandboxed and limited in scope and functionality; in turn, this made the Flex client difficult to use in many environments.

Even if you could get logged in, the Flex client was slow. In the early releases, it was unusably slow as soon as you got beyond 32 nodes or so. While this was ameliorated in later releases, the Flex client remained much slower than its C#-based cousin at many tasks. VMware worked hard to wean customers off of the C# client for years while publicly denying anything was wrong with the Flex client. Customers stubbornly refused to give up the C# client, and a cold war broke out between vendor and customer.

With vSphere 6.5, the C# client no longer works at all. It will not connect. In order to prevent outright mutiny, especially in the face of credible competition from the likes of Microsoft, Nutanix, Scale Computing and Amazon (depending on your use case), VMware created the HTML5 client.

The HTML5 Client
The HTML5 client is what the Flex client should have been to start with. It is smooth, responsive, clean, (mostly) space-efficient and platform agnostic. It's also unfinished.

A lot of features -- for example, everything to do with VSAN -- simply don't exist in the HTML5 client yet. The bare-bones stuff required for basic administration is there, and it works well. To tide us over until the HTML5 client is done, VMware has tweaked the Flex client to suck ever-so-slightly less. The pointless "related objects" tab and workflows are gone, and virtual machine (VM) rollups happen at 5,000 VMs instead of 50.

The HTML5 client is a tantalizing glimpse of VMware's future. As vSphere and ESXi sales flatten out, VMware is moving this now-mature product to a slower cadence. The HTML5 client demonstrates that VMware is aware that competent and capable competition exists in the business of selling things that make infrastructure go, and that to survive VMware must become a company that makes infrastructure easy to use. If VMware can hold its various teams to the high standard set by the HTML5 client, it should have nothing to worry about for years to come.

vCenter
vCenter is dead, long live vCenter! With vSphere 6.5, the time has come to abandon vCenter on Windows. The vSphere Update Manager (VUM) is now integrated with the vCenter Server Appliance (VCSA), and the VCSA natively integrates high availability as well as backup and restore. This eliminates the need for a Windows-based vCenter server.

VCSA 6.5 now runs on Photon OS, is significantly faster, can scale to handle more objects than its 6.0 predecessor, and sports a built-in monitoring interface. It runs both the Flex and HTML5 clients, with the latter removing the need for the Client Integration Plugin.

Installing, upgrading or migrating to the 6.5 VCSA requires running a local executable. This executable is available for Windows, macOS and Linux, and is provided in the form of an ISO. Overall, the VCSA is a sturdy, reliable and capable improvement over its predecessors; however, the installer is where VMware really fell down with the 6.5 release.

One issue is that, for the Windows version at least, the installer must be executed from a local drive. Mapped drives, network shares and folder redirection will all cause bizarre errors in the installer. This is a notable problem, as many corporate workstations no longer have optical drives, restrict copying files to local drives in favor of folder redirection, and don't allow SPTD-based virtual optical drives, even on administrator machines. The local-drive requirement is not documented in the readme on the ISO.

Once past that hurdle, the migration tool for Windows Server doesn't actually work all that well. Social media is full of reports of VMware experts running into various problems. If you have a Windows vCenter, it is probably best to engage with VMware's technical resources before attempting any migration to 6.5. The VCSA itself is ready to handle your production workloads, but the migration tool is not.

New Features
vSphere 6.5 comes with a surprising list of additional features for a point release. Chief among these are a set of security-oriented features that make vSphere 6.5 an attractive infrastructure under-layer to anyone considering standing up a public-facing cloud.

The banner security feature is VM-level disk encryption and the associated encrypted vMotion capability. Combined, this means that vSphere is now capable of securing data both at rest and in flight. vSphere 6.5 also supports secure boot for EFI-enabled VMs, and has increased its logging capabilities. Fault tolerance gains multi-NIC support and some tweaks around performance. The vSphere Integrated Containers Engine allows administrators to manage containers in vSphere just as they would VMs. I was deeply impressed by the Containers Engine, and find it the first properly user-friendly container management system I've tried. The most impressive feature upgrades, however, are related to high availability (HA) and the Distributed Resource Scheduler (DRS).

High Availability Upgrades
HA orchestrated restart is absolutely marvelous. Think service dependencies in Windows: an attempt to bring up a given machine will check the dependencies list to make sure any VMs it depends on are started. The canonical example is bringing up a database server before a front-end server. That this can be coded into the metadata on a per-VM level is really useful, and helps wrangle large clusters with thousands of VMs.
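
To make the dependency idea concrete, here is a minimal sketch of how a restart order can be derived from per-VM dependency lists. It is plain Python, not the vSphere API; the function name and the VM names (db01, app01, web01) are hypothetical, and VMware's own implementation may well work differently.

```python
from collections import deque


def restart_order(depends_on):
    """Return VM names ordered so that every VM follows its dependencies.

    depends_on maps a VM name to the list of VMs that must be up first.
    Raises ValueError if the dependencies form a cycle.
    """
    # Collect every VM mentioned, either as a key or as a dependency.
    vms = set(depends_on)
    for deps in depends_on.values():
        vms.update(deps)

    # Count unmet dependencies per VM and index the reverse edges.
    unmet = {vm: 0 for vm in vms}
    dependents = {vm: [] for vm in vms}
    for vm, deps in depends_on.items():
        for dep in set(deps):
            unmet[vm] += 1
            dependents[dep].append(vm)

    # Kahn's algorithm; sorting keeps the result deterministic.
    ready = deque(sorted(vm for vm in vms if unmet[vm] == 0))
    order = []
    while ready:
        vm = ready.popleft()
        order.append(vm)
        for child in sorted(dependents[vm]):
            unmet[child] -= 1
            if unmet[child] == 0:
                ready.append(child)

    if len(order) != len(vms):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical three-tier example: database, then app server, then front end.
print(restart_order({"web01": ["app01"], "app01": ["db01"]}))
```

The cycle check matters in practice: two VMs that each wait on the other would otherwise never come up, so a sketch like this fails loudly instead.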

HA admission controls also got an overhaul. Administrators can now specify the number of host failures to tolerate, with overrides for CPU and RAM, plus knobs to tweak the allowable performance degradation. DRS gained some intelligence with network load awareness and tweaks relating to CPU overcommit. The memory reservation for load balancing can now be manually specified, and you can have DRS try to even out the number of VMs per host.
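
The arithmetic behind "host failures to tolerate" can be sketched roughly as follows. This is a deliberately simplified model that assumes identical hosts and a single resource figure; the function names, parameters and numbers are my own illustrations, not VMware's actual calculation.

```python
def reserved_fraction(host_count, failures_to_tolerate):
    """Fraction of cluster capacity held back for failover (identical hosts assumed)."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    if not 0 <= failures_to_tolerate < host_count:
        raise ValueError("failures_to_tolerate must be between 0 and host_count - 1")
    return failures_to_tolerate / host_count


def fits_after_failures(host_count, failures_to_tolerate, per_host_capacity,
                        demand, tolerated_degradation=0.0):
    """Check whether the surviving hosts can carry the demand.

    tolerated_degradation is the share of demand the VMs may lose
    (0.1 means they accept running on 90% of what they ask for).
    """
    surviving = (host_count - failures_to_tolerate) * per_host_capacity
    return surviving >= demand * (1.0 - tolerated_degradation)


# Hypothetical cluster: 4 hosts of 100 units each, 320 units of demand.
print(reserved_fraction(4, 1))
print(fits_after_failures(4, 1, 100, 320))
print(fits_after_failures(4, 1, 100, 320, tolerated_degradation=0.1))
```

In this toy cluster, losing one host leaves too little capacity for full demand, but the same failure is acceptable once a 10 percent degradation is allowed. That is the trade-off the new knobs expose.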

With Proactive HA, vSphere will keep an eye on hardware conditions and, if it senses something awry, will enable "quarantine mode." Quarantine mode evacuates VMs off of a questionable node. This can be configured to occur either by placing that host into maintenance mode and completely evacuating the VMs, or by only evacuating VMs if DRS is convinced it won't cause performance problems.
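
The two behaviors described above boil down to a small decision. The sketch below models it in plain Python under my own assumptions: the enum, the function and the "can this VM move without a performance hit" callback are all hypothetical stand-ins for whatever DRS evaluates internally.

```python
from enum import Enum


class Remediation(Enum):
    MAINTENANCE = "maintenance"  # evacuate everything, unconditionally
    QUARANTINE = "quarantine"    # evacuate only when it will not hurt performance


def vms_to_evacuate(host_vms, degraded, remediation, can_move_without_penalty):
    """Return the VMs that should leave a host, given its health and policy.

    can_move_without_penalty is a hypothetical callback standing in for
    the DRS judgment about performance impact.
    """
    if not degraded:
        return []
    if remediation is Remediation.MAINTENANCE:
        return list(host_vms)
    return [vm for vm in host_vms if can_move_without_penalty(vm)]


# Hypothetical host with three VMs; pretend only the small ones move freely.
vms = ["small01", "small02", "bigdb01"]
is_small = lambda vm: vm.startswith("small")
print(vms_to_evacuate(vms, True, Remediation.QUARANTINE, is_small))
print(vms_to_evacuate(vms, True, Remediation.MAINTENANCE, is_small))
```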

Hardware Compatibility List Concerns
While Proactive HA is impressive -- and seems to work well in the lab -- I am worried about its implementation. A related component is used by VSAN to detect whether your hardware is listed on the VMware HCL.

An example of my concerns: I have a cluster whose components were listed on the HCL for vSphere 5.5 and 6.0, but are not for 6.5. VSAN seems quite upset about the LSI 2308 disk controllers, despite them working fine under a battery of tests. There is absolutely nothing wrong with that controller. It worked fine in 6.0, but after nothing more than a software version change, vSphere is asking me to upgrade my hardware.

On its own, this wouldn't be enough to worry me; however, this autodetection of hardware/HCL compatibility is combined with my own personal interactions with VMware employees who seem to have an antipathy towards anyone who might want to use hardware not on the HCL for any reason.

It seems perfectly natural to VMware employees to throw out hundreds of thousands or millions of dollars of hardware because of whatever bizarre internal politics, "support costs" or what-have-you dictate winners and losers on VMware's HCL. I wonder how this will affect support in the long term, and what impact the Dell acquisition will have.

Proactive HA is a technically impressive, highly useful feature that bears up well under lab testing. Unfortunately, I trust VMware with that kind of power about as much as I trust Microsoft with cumulative updates.

vSphere 6.5 in Practice
To say that vSphere 6.5 is ready for all use cases would be a lie. vSphere 6.5 is buggy (though VSAN more than passed muster). Version 6.5 is not as buggy as 6.0 was at release, but there are some really esoteric bugs making their way through the usual social media channels that give me pause.

Fortunately, most of the bugs are related to migration, or administrators doing unexpected things with new features. vSphere 6.5's features seem to work just fine, if handled precisely as envisioned. Concerns rest mostly around the fact that human nature is rather hard to change, and there needs to be a proper shakedown by "hold my beer, I got this" early adopter admins before everyone lets the juniors loose on 6.5. I expect that by 6.5 update 1, the showstoppers should be ironed out.

For the most part, vSphere 6.5 is an iterative release and behaves as such. Core functionality does what it did before, more or less in the same way. The big exception here is HA.

Pay attention to the changes in HA, and above all test this with copies of your production workloads before putting it into production! For all that vSphere is now a mature product, it is the lynchpin that unifies our infrastructure and all of us should be properly change managing our migrations.

About the Author

Trevor Pott is a full-time nerd from Edmonton, Alberta, Canada. He splits his time between systems administration, technology writing, and consulting. As a consultant he helps Silicon Valley startups better understand systems administrators and how to sell to them.
