Virtual Infrastructure Performance Management: The Storage View

The emerging class of performance management solutions for high-performance virtualization.

Last time, I discussed the growing need for end-to-end Virtual Infrastructure Performance Management (VIPM) solutions to help datacenter managers successfully virtualize more performance-hungry and business-critical applications with confidence. While no vendor (in my view) has yet solved the VIPM challenge completely, the pace of innovation has ramped up dramatically in the last few years.

This time, let's look at a few leading vendors that have tackled the VIPM problem from the storage vantage point.

In my interviews with customers across industries, misconfigured or inefficient storage is cited as a major contributing cause of application performance problems (and management headaches) in many highly virtualized enterprises. This makes sense, of course: The vast majority of virtual servers (83 percent as of our most recent survey) are deployed on shared storage, primarily on Fibre Channel and iSCSI SANs, but increasingly on NFS filers as well.

Virtualization's Impact on Storage Performance
Highly consolidated virtual workloads make a mess of the I/O patterns generated by both applications and the operating systems that host them. They also test the limits of existing troubleshooting and diagnostic tools, which were designed for static, predictable I/O patterns.

This "I/O blender" is especially pronounced in desktop virtualization projects and when performance-sensitive apps such as OLTP systems start moving around among virtual hosts. In practice, a virtualized application at any given moment is actually much more a storage element than a server element. Indeed, storage in a virtual environment represents the current state of a workload more than any other infrastructure component. So, storage performance can tell us quite a bit about virtual infrastructure performance.
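To make the blender effect concrete, here is a minimal Python sketch. It assumes four hypothetical VMs, each reading sequentially from its own region of a shared array, and models the host's shared I/O queue as a simple round-robin interleave. Real hypervisor schedulers are more complicated, and the function names and block addresses are mine, purely for illustration.

```python
def sequential_fraction(lbas):
    """Share of requests whose block address directly follows the previous one."""
    if len(lbas) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(lbas, lbas[1:]) if cur == prev + 1)
    return hits / (len(lbas) - 1)


def vm_stream(start_lba, length):
    """One VM reading `length` consecutive blocks starting at `start_lba`."""
    return list(range(start_lba, start_lba + length))


def blend(streams):
    """Round-robin interleave: a simplified stand-in for a shared I/O queue."""
    return [lba for group in zip(*streams) for lba in group]


# Four hypothetical VMs, each perfectly sequential inside its own disk region.
streams = [vm_stream(vm * 100_000, 1_000) for vm in range(4)]

per_vm = [sequential_fraction(s) for s in streams]
blended = sequential_fraction(blend(streams))

print(per_vm)   # [1.0, 1.0, 1.0, 1.0] -> each VM looks sequential on its own
print(blended)  # 0.0 -> the array sees no sequential requests at all
```

Each guest believes it is issuing a clean sequential stream, yet in this toy model the array sees an entirely random one. A tool that assumes static, predictable patterns has nothing to hold on to.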

By abstracting workloads from compute instances, we've moved quite a bit of processing from the compute tier to the storage tier, and along with it our management processes (cloning, replication, migration and protection functions, to name a few). Plus, with storage costs often exceeding 60 percent of the total cost of a virtual infrastructure, there's simply no room left for wasted capacity or throughput.

Taken together, virtualization's features crank up the need for predictable and reliable storage performance just as they break down our established methods of delivering it!

Storage Management Experts Step Up to the Plate
Akorri (part of NetApp since January) was one of the first independent storage management vendors to recognize this. They built instrumentation for a wide variety of mostly midrange SAN arrays, from multiple vendors, to extract deeper metrics than what most storage platforms provided by default. But just gathering more data in a virtual infrastructure isn't enough--in fact, it can lead to more confusion and more overhead. Which data matters at any moment? Which contention point or bottleneck is the important one right now?

So, on top of their instrumentation, Akorri developed analytics to determine and report on overall infrastructure response time. This shift is important, because it's nearly impossible to manually derive an accurate virtual infrastructure performance picture from a bunch of disk, array, or storage switch resource monitors alone.

At the same time, Virtual Instruments was retooling its leading FC SAN performance management product for virtual workloads. Virtual Instruments' VirtualWisdom delivers rich infrastructure response time analytics to rival Akorri's, but goes significantly deeper into the causes of storage latency and measures storage performance in real time.

VirtualWisdom goes deeper than any competitive solution I've seen, using physical traffic access points (or TAPs) to collect enhanced metrics (queue depths, for example) and locate hard-to-find sources of contention directly from the storage network itself--with visibility into every Fibre Channel frame and every transaction. This level of insight is critical: in demanding virtualized infrastructures, it's often not enough to poll software agents installed on each component of the SAN at 5- to 15-minute intervals to get to the bottom of a tricky performance problem. Real-time, wire-level visibility is required.
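A quick back-of-the-envelope sketch in Python shows why coarse polling can hide a problem. The latency trace below is invented for illustration (a steady 5 ms with one 20-second burst at 200 ms), and the averaging function is my own simplification, not how any particular agent or product computes its numbers.

```python
from statistics import mean


def poll_averages(samples_ms, interval_s):
    """Average per-second latency samples into one value per polling interval."""
    return [
        mean(samples_ms[i:i + interval_s])
        for i in range(0, len(samples_ms), interval_s)
    ]


# Hypothetical trace: 15 minutes of per-second latency at 5 ms,
# with a 20-second burst at 200 ms starting at the 7-minute mark.
latency_ms = [5.0] * 900
for second in range(420, 440):
    latency_ms[second] = 200.0

five_minute = poll_averages(latency_ms, 300)
peak = max(latency_ms)

print(five_minute)  # [5.0, 18.0, 5.0]
print(peak)         # 200.0
```

Under these assumptions, the 5-minute view reports an unremarkable 18 ms for the middle interval, while the application actually sat through 20 seconds at 200 ms. At a 15-minute interval the same burst would shrink further still.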

VirtualWisdom pinpoints the source of an I/O bottleneck and identifies specific elements for further troubleshooting (an overloaded switch port or a misconfigured HBA, for example). VirtualWisdom also includes a DVR-like performance recorder to simplify historical performance analysis, and a modeling interface to identify hardware degradation before an application performance problem surfaces.

In my discussions with users, VirtualWisdom truly shines in large, complex and performance-sensitive FC SANs, such as those supporting Tier-1 enterprise business applications (real-time trading, order processing, and ERP systems, for instance). VirtualWisdom has become an indispensable tool for organizations as they virtualize these Tier-1 business apps, and the combination of deep, real-time visibility and rapid diagnosis sets Virtual Instruments apart in the VIPM category.

Virtualization Management Vendors Add Storage Visibility
SolarWinds' Hyper9 virtualization management suite was also extended last year with storage insight designed for virtual server admins. Storage Lens reports latency, throughput and IOPS across multi-vendor and multi-platform (FC, iSCSI, NFS) shared storage at the VM, host, cluster, and datastore level. For the virtualization admin with limited storage experience (and a moderately complex environment), Hyper9 can help locate storage problems at a macro level. Detailed troubleshooting still requires additional storage expertise.
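For readers who want a feel for what a macro-level rollup looks like, here is a small Python sketch. It is not how Hyper9 or Storage Lens works internally; the sample rows, names, and the choice of an IOPS-weighted latency average are all my own assumptions.

```python
from collections import defaultdict

# Hypothetical per-VM samples: (vm, host, datastore, iops, avg_latency_ms)
samples = [
    ("vm-01", "host-a", "ds-1", 400, 4.0),
    ("vm-02", "host-a", "ds-1", 100, 24.0),
    ("vm-03", "host-b", "ds-2", 250, 6.0),
    ("vm-04", "host-b", "ds-1", 500, 8.0),
]


def roll_up(rows, level):
    """Return {name: (total_iops, iops_weighted_latency_ms)} per host or datastore.

    Assumes every group has nonzero total IOPS, as in the sample data.
    """
    index = {"host": 1, "datastore": 2}[level]
    iops = defaultdict(int)
    weighted = defaultdict(float)
    for row in rows:
        key = row[index]
        iops[key] += row[3]
        weighted[key] += row[3] * row[4]
    return {key: (iops[key], weighted[key] / iops[key]) for key in iops}


by_datastore = roll_up(samples, "datastore")
by_host = roll_up(samples, "host")

print(by_datastore["ds-1"])  # (1000, 8.0)
print(by_host["host-a"])     # (500, 8.0)
```

Notice that vm-02's 24 ms latency all but disappears in both rollups. That is the trade-off of a macro view: it points you at the busy datastore, but finding the one suffering VM still takes a closer look.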

These are just three examples, and your mileage will vary, but they illustrate the range of visibility, analysis, and diagnostic solutions available to keep shared storage running efficiently and reliably for virtual workloads. Every hypervisor vendor and a growing list of add-on virtual machine management solutions also provide at least rudimentary storage visibility.

I'm eager to hear from you: who do you turn to when storage problems degrade the performance of your virtualized applications?

