In-Depth

In Virtualized World, Storage Options Just Get Better and Better

Storage technology improvements -- both in hardware and software -- are making simple work of pooling and provisioning storage for use in the virtualized world.

While there's no doubt having the proper storage is important, knowing what type of storage to use and how to provision that storage can be mind-boggling. Virtualization administrators know all too well the impact storage has on VMs. The storage infrastructure impacts everything from a virtual server's capacity to its performance. Even VM fault tolerance can be directly dependent on the underlying storage.

Hypervisors such as VMware vSphere and Microsoft Hyper-V can use a number of different storage types, each with its own unique advantages and disadvantages. This article examines some of the storage options available to you as a virtualization administrator, as well as the pros and cons of each option.

Solid-State vs. Mechanical
Regardless of the storage architecture you select, you'll have to choose between traditional mechanical hard disks (such as SATA or serial-attached SCSI [SAS]), solid-state drives (SSDs), or some combination of the two.

Solid-state storage is far more expensive than mechanical storage. Prices fluctuate, but the cost per gigabyte is often at least 600 percent higher than that of traditional storage. Furthermore, SSDs have far lower capacities than mechanical disks. The largest enterprise-class SSD major manufacturers offer is usually 500GB or less.

On the plus side, although SSDs have limited capacity and are far more costly than their mechanical counterparts, they offer much higher performance. This is especially true when it comes to random data access. Mechanical disks have to move read or write heads across the disk's surface in order to read or write data. Head movements take time to complete. SSDs don't have any moving parts, so read and write operations are much quicker than on a mechanical disk. SSDs also use indexes to quickly locate storage blocks, which further enhances performance.
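To put the random-access gap in rough numbers, here's a back-of-the-envelope sketch (in Python) that compares the time needed to service a batch of small random reads. The latency figures are illustrative assumptions, not benchmarks of any particular drive:

# Rough comparison of random-read service time: mechanical disk vs. SSD.
# The latency figures below are illustrative assumptions, not measurements.
AVG_SEEK_MS = 4.0        # assumed average seek time for an enterprise mechanical disk
AVG_ROTATIONAL_MS = 3.0  # assumed average rotational latency (half a revolution)
SSD_READ_MS = 0.1        # assumed random-read latency for an enterprise SSD

def total_time_ms(per_io_latency_ms, io_count):
    # Total time to service io_count random reads issued one at a time.
    return per_io_latency_ms * io_count

io_count = 10_000
mechanical_ms = total_time_ms(AVG_SEEK_MS + AVG_ROTATIONAL_MS, io_count)
ssd_ms = total_time_ms(SSD_READ_MS, io_count)

print(f"Mechanical disk: {mechanical_ms / 1000:.1f} seconds for {io_count} random reads")
print(f"SSD:             {ssd_ms / 1000:.1f} seconds for {io_count} random reads")
print(f"Approximate IOPS -- mechanical: {1000 / (AVG_SEEK_MS + AVG_ROTATIONAL_MS):.0f}, "
      f"SSD: {1000 / SSD_READ_MS:.0f}")

Even with generous assumptions for the mechanical disk, the gap works out to well over an order of magnitude in IOPS, which is why random-heavy VM workloads feel the difference so acutely.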

The existence of reasonably priced consumer-grade SSDs can make these comparisons confusing. For example, some manufacturers offer SSDs with up to 1TB of storage for less than $1,000 -- far greater capacity at a much lower price than what's available from enterprise hardware vendors.

Consumer solid-state storage is so much less expensive than enterprise-grade storage because the two are built on different types of flash memory. Enterprise-grade SSDs use single-level cell (SLC) memory, which offers high performance, high endurance and low power consumption.

In contrast, consumer-grade SSDs use multi-level cell (MLC) memory. MLC memory delivers a much lower cost per gigabyte and a greater overall capacity, but in doing so sacrifices speed and endurance.

The high cost and low capacity of enterprise-grade SSDs make them unsuitable for general-purpose data storage, except in special cases. However, SSDs can be beneficial for tiered storage and for VMs that require the highest possible level of input/output operations per second (IOPS).

Direct-Attached Storage (DAS)
Prior to Windows Server 2012, I wouldn't have discussed DAS here. The reason is simple: in a physical datacenter, a server failure is usually regarded as an inconvenience, but in a virtual datacenter, a server failure can be catastrophic. Host servers run multiple virtualized workloads, so if a standalone host server were to fail, all of the virtual servers running on it would fail as well. The result would be a major outage.

The only way to keep this type of nightmare scenario from happening is with failover clustering. Until relatively recently, though, failover clustering required shared storage. In other words, the cluster nodes all had to be able to access a common LUN or a common physical storage device. This was true for both Hyper-V and vSphere.

When Windows Server 2012 Hyper-V came out, Microsoft changed the requirements for failover clustering so that cluster nodes could use their own DAS instead of being attached to a Cluster Shared Volume (CSV). By doing so, Microsoft not only made clustering more financially feasible for its small and midsize business customers, it also provided a way to migrate a running VM from one Hyper-V host to another without using shared storage.

You can perform these types of live migrations within a cluster or outside of a cluster. For instance, you could live migrate a VM from a standalone host to a cluster node or from a node in one cluster to a node in another.
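If you're scripting these migrations, here's a minimal sketch of what a shared-nothing live migration looks like when driven from a Python automation script through the Hyper-V PowerShell module. The VM name, destination host and storage path are hypothetical placeholders, and you should verify the Move-VM parameters against your own Hyper-V version:

# Minimal sketch: trigger a "shared nothing" live migration of a Hyper-V VM
# by shelling out to PowerShell. All names and paths are hypothetical.
import subprocess

vm_name = "WebVM01"                    # hypothetical VM
destination_host = "HyperVHost02"      # hypothetical destination Hyper-V host
destination_path = r"D:\VMs\WebVM01"   # hypothetical storage path on the destination

# Move-VM with -IncludeStorage moves the running VM and its virtual hard disks,
# so no shared storage is required between the two hosts.
command = (
    f"Move-VM -Name '{vm_name}' "
    f"-DestinationHost '{destination_host}' "
    f"-IncludeStorage "
    f"-DestinationStoragePath '{destination_path}'"
)

subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)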

Not to be outdone, VMware introduced similar functionality in vSphere 5.1 with an enhanced version of vMotion that can migrate a running VM from one host to another without shared storage.

Even though you can now live migrate VMs between hosts without shared storage, you shouldn't automatically dismiss the idea of using shared storage. For clusters whose nodes exist within a single physical datacenter, it's best to use shared storage if possible.

Shared storage does come at a price. Thankfully, both Microsoft and VMware provide ways of creating shared storage without requiring an investment in SAN hardware.

Microsoft's solution involves creating an iSCSI SAN. Windows Server 2012 includes an iSCSI target component, which lets you turn DAS into iSCSI-accessible storage. Because you can share iSCSI targets, you can build a Hyper-V cluster around the use of iSCSI.
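As a rough sketch of how that provisioning flow fits together, the following Python script drives the iSCSI Target Server cmdlets to carve a LUN out of local storage and expose it to two cluster nodes. The paths, target name and initiator IQNs are hypothetical, and the example assumes the iSCSI Target Server role service is already installed:

# Minimal sketch: turn DAS capacity into iSCSI-accessible shared storage on
# Windows Server 2012 by driving the iSCSI Target Server cmdlets from Python.
# All names, paths and IQNs below are hypothetical placeholders.
import subprocess

def run_ps(command):
    # Run a single PowerShell command and raise an error if it fails.
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)

# 1. Create a virtual disk file that will back the iSCSI LUN.
run_ps(r"New-IscsiVirtualDisk -Path 'C:\iSCSIVirtualDisks\ClusterDisk1.vhd' -Size 100GB")

# 2. Create a target and restrict it to the cluster nodes' initiator IQNs.
run_ps("New-IscsiServerTarget -TargetName 'HyperVCluster' "
       "-InitiatorIds @('IQN:iqn.1991-05.com.microsoft:node1.contoso.com',"
       "'IQN:iqn.1991-05.com.microsoft:node2.contoso.com')")

# 3. Map the virtual disk to the target so the cluster nodes can connect to it.
run_ps("Add-IscsiVirtualDiskTargetMapping -TargetName 'HyperVCluster' "
       r"-Path 'C:\iSCSIVirtualDisks\ClusterDisk1.vhd'")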

VMware takes a slightly different approach to low-cost shared storage with a product called the vSphere Storage Appliance (VSA), which pools the DAS of multiple ESXi hosts and presents it as a shared storage resource.

Storage-Area Networks (SANs)
Both Microsoft and VMware let you create a low-budget SAN through software-enabled shared storage. Although this approach is sometimes referred to as a SAN, there are major differences between software SANs and hardware SANs.

Hardware SANs are almost always based on Fibre Channel components such as host bus adapters (HBAs) and Fibre Channel switches. The components in a hardware SAN are typically arranged in a way that makes the storage infrastructure tolerant of component-level failures. For example, you can use redundant Fibre Channel switches to provide multiple paths to physical storage devices, which helps ensure connected storage will remain accessible even in the event of a switch or cable failure.

Another big difference between hardware- and software-based SANs is that hardware-based SANs use the storage hardware to do the heavy lifting. In fact, the Windows OS is beginning to take advantage of hardware-level capabilities within a SAN.

To show you how this works, imagine you have a server running Windows Server 2008 R2 and need to copy some data across your SAN from one LUN to another. Windows Server 2008 R2 would likely transfer the data from the source storage to the Windows Server and then to the destination storage. In other words, the OS would be involved in the copy process.

Windows Server 2012, on the other hand, is designed to use hardware-level capabilities when performing such operations. A feature called Offloaded Data Transfer (ODX) lets you copy data directly from the source to the destination without passing through the server. Instead, the ODX feature instructs the SAN hardware to perform the copy operation.
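Conceptually, the difference looks something like the toy model below: a host-mediated copy streams every block through the server, while an offloaded copy hands the array a single instruction and lets it move the data internally. This is purely an illustration of the idea, not a representation of how ODX actually works on the wire:

# Toy model of a host-mediated copy vs. an offloaded (ODX-style) copy.
# Purely conceptual -- not how the real protocol is implemented.

class SanArray:
    # Stands in for a SAN that can copy between its own LUNs internally.
    def __init__(self):
        self.luns = {"lun-a": [f"block-{i}" for i in range(1_000)], "lun-b": []}

    def read(self, lun, offset):          # data crosses the fabric to the host
        return self.luns[lun][offset]

    def write(self, lun, data):           # data crosses the fabric again
        self.luns[lun].append(data)

    def offloaded_copy(self, src, dst):   # array copies internally; the host only issues the request
        self.luns[dst] = list(self.luns[src])

def host_mediated_copy(array, src, dst):
    # Every block is read into the server and then written back out.
    transfers = 0
    for offset in range(len(array.luns[src])):
        block = array.read(src, offset)   # SAN -> server
        array.write(dst, block)           # server -> SAN
        transfers += 2
    return transfers

array = SanArray()
print("Host-mediated copy, blocks crossing the wire:", host_mediated_copy(array, "lun-a", "lun-b"))

array = SanArray()
array.offloaded_copy("lun-a", "lun-b")    # ODX-style: one instruction, no data through the host
print("Offloaded copy: the data never passes through the server")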

Direct Storage Mapping
In most cases a SAN (software or hardware) acts as a repository for virtual hard disk files, and connectivity to the SAN is established at the host level rather than at the VM level. That doesn't mean you can't connect a VM directly to physical storage, however. When you need native SAN functionality or even guest clustering, for example, you may need to link a VM directly to a LUN.

The method for accomplishing this type of storage mapping depends on which hypervisor you're using. vSphere has a feature called Raw Device Mapping (RDM), which can present a LUN directly to a VM. You can connect this LUN through Fibre Channel, Fibre Channel over Ethernet (FCoE) or iSCSI.

Hyper-V provides similar functionality, but uses two separate mechanisms. The first of these mechanisms is a SCSI pass-through disk. The pass-through disk functionality can connect a VM to a physical hard disk, as long as the hard disk is visible to the host OS and is offline.

In Windows Server 2012, Microsoft introduced a virtual Fibre Channel feature. This lets your VM use a virtual HBA to establish SAN connectivity, instead of relying on pass-through disks. Implementing virtual Fibre Channel simply requires the physical HBA to support N_Port ID virtualization (NPIV).

In case you're wondering, you can live migrate a VM that uses virtual Fibre Channel, as long as you meet some basic requirements. The most important of these is that the destination host must also provide physical Fibre Channel connectivity to the VM storage. The VM must also use two separate address sets (worldwide node names and worldwide port names), which allow the live migration handoff to occur without losing storage connectivity.

Fibre Channel over Ethernet (FCoE)
Fibre Channel SANs use specialized hardware, but some organizations also make use of FCoE. FCoE involves sending Fibre Channel commands over Ethernet cables in much the same way that iSCSI lets you send SCSI commands over a TCP/IP network.

Although FCoE can be cost-effective, there are misconceptions about its performance. For example, one organization implemented 10Gbps FCoE because it promised higher throughput at a lower price than 8Gbps Fibre Channel.

While this might sound logical in theory, it isn't truly an apples-to-apples comparison. FCoE encapsulates Fibre Channel commands into Ethernet packets. This means FCoE has to deal with far more overhead than Fibre Channel hardware. And Fibre Channel hardware can deal with native Fibre Channel commands without the Ethernet encapsulation.

Windows Storage Pools
Windows Storage Pools were first introduced with Windows Server 2012, and they're designed to make it easy to virtualize storage. You add a series of physical disks to a storage pool and then create virtual disks within the pool. Those virtual disks can be made fault-tolerant if necessary, and can be used as DAS, VM storage or even as iSCSI targets.
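For those who prefer to script it, here's a minimal sketch of building a pool and a mirrored virtual disk by driving the Windows Storage cmdlets from Python. The pool name, disk name and size are hypothetical, and the storage subsystem friendly name can vary from system to system:

# Minimal sketch: build a storage pool from poolable physical disks and carve
# a mirrored virtual disk out of it, using the Windows Storage cmdlets.
# Pool name, disk name and size are hypothetical placeholders.
import subprocess

def run_ps(command):
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)

# 1. Gather every physical disk that's eligible for pooling and create the pool.
run_ps("$disks = Get-PhysicalDisk -CanPool $true; "
       "New-StoragePool -FriendlyName 'Pool1' "
       "-StorageSubSystemFriendlyName 'Storage Spaces*' "
       "-PhysicalDisks $disks")

# 2. Create a fault-tolerant (two-way mirror) virtual disk inside the pool.
run_ps("New-VirtualDisk -StoragePoolFriendlyName 'Pool1' -FriendlyName 'VMStore1' "
       "-Size 500GB -ResiliencySettingName Mirror")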

In Windows Server 2012 R2, Microsoft introduced a new feature called tiered storage. Tiered storage has long been a feature in enterprise-class, hardware-based SANs, but it's new to the Windows OS.

If a Windows storage pool includes both mechanical storage (such as SATA or SAS drives) and solid-state storage, Windows can differentiate between the two. When you create a new virtual disk within the pool, Windows gives you the option of using tiered storage.

If you do enable tiered storage, Windows will automatically keep track of hot and cold blocks. Hot blocks are frequently accessed storage blocks, and they're dynamically moved to solid-state storage. The idea is that the virtual disk will offer better performance if its most frequently accessed blocks reside on solid-state storage. Essentially, tiered storage treats SSDs as a cache for the most frequently accessed data.
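A rough sketch of the underlying idea is shown below: count how often each block is touched, then promote the most frequently accessed blocks into a small SSD tier. The tier size and workload here are arbitrary illustrative values, not anything Windows actually uses:

# Conceptual sketch of hot/cold block tiering: track access counts per block
# and keep the most frequently accessed ("hot") blocks on a small SSD tier.
# Tier size and workload below are arbitrary illustrative values.
from collections import Counter
import random

SSD_TIER_BLOCKS = 100          # assumed capacity of the SSD tier, in blocks
TOTAL_BLOCKS = 10_000          # assumed size of the virtual disk, in blocks

access_counts = Counter()

# Simulate a skewed workload: a small working set receives most of the I/O.
for _ in range(50_000):
    if random.random() < 0.8:
        block = random.randrange(200)              # hot working set
    else:
        block = random.randrange(TOTAL_BLOCKS)     # occasional cold reads
    access_counts[block] += 1

# "Tiering pass": promote the most frequently accessed blocks to the SSD tier.
ssd_tier = {block for block, _ in access_counts.most_common(SSD_TIER_BLOCKS)}

hits = sum(count for block, count in access_counts.items() if block in ssd_tier)
total = sum(access_counts.values())
print(f"SSD tier holds {len(ssd_tier)} of {TOTAL_BLOCKS} blocks "
      f"but serves {hits / total:.0%} of the I/O")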

The nice thing about this approach is you can implement storage tiers on a per-virtual-disk basis. As such, you can use storage tiers where they'll be most beneficial, and you can control the amount of solid-state storage each virtual disk consumes.

The Trend Toward SCSI
One of the decisions you'll have to make during the storage planning process is whether to use SATA or SAS drives. SAS drives are generally preferred in enterprise environments. Both, however, are viable options.

Up until vSphere 5 was released, for example, VMware had limited SATA support. If you attempted to install vSphere on a server equipped with SATA drives, you'd receive an error message stating vSphere was unable to find a supported device to which to write the vSphere image. SATA support was greatly improved with the release of vSphere 5.

Microsoft has taken a different path. Although Hyper-V has always supported SATA and SAS drives at the host level, Hyper-V VMs have always been required to boot from an emulated IDE controller, even if the underlying storage was SCSI-based.

One of the really big changes that Microsoft is making in Windows Server 2012 R2 is the introduction of Generation 2 VMs. Generation 2 VMs are hypervisor-aware and therefore don't make use of emulated hardware.

Because Generation 2 VMs do away with emulated hardware, Microsoft also removed VM-level IDE support. Generation 2 VMs treat all virtual hard disks as SCSI devices.
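Creating one is straightforward; here's a minimal sketch that calls New-VM from Python with the -Generation 2 switch (the VM name, path and sizes are hypothetical placeholders):

# Minimal sketch: create a Generation 2 Hyper-V VM. There's no emulated IDE;
# the boot disk is attached to a virtual SCSI controller.
# The VM name, VHDX path and sizes are hypothetical placeholders.
import subprocess

command = ("New-VM -Name 'Gen2Test' -Generation 2 "
           "-MemoryStartupBytes 2GB "
           r"-NewVHDPath 'D:\VMs\Gen2Test\Gen2Test.vhdx' "
           "-NewVHDSizeBytes 60GB")

subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)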

Although it's possible to create a virtual SCSI disk on top of physical SATA storage, Microsoft seems to be sending a message that SCSI is the preferred storage type going forward.

Incidentally, Windows Server 2012 R2 still supports creating and using first-generation VMs. Generation 2 VMs can only be used for guests running Windows Server 2012 R2 or Windows 8.1.

What About Network-Attached Storage (NAS)?
NAS is another option for both vSphere and Hyper-V environments. Many NAS devices expose iSCSI targets, which both hypervisors can use. Beyond iSCSI, VMware vSphere supports NAS storage through the Network File System (NFS) protocol, while Windows Server 2012 Hyper-V supports connectivity to NAS servers through the Server Message Block (SMB) 3.0 protocol. NAS devices running earlier versions of SMB aren't officially supported.

Storage planning is one of the greatest considerations you'll have to grapple with when running a virtual datacenter. Your storage architecture can impact everything from VM performance to fault tolerance. It's therefore critically important to evaluate your storage options and choose a storage subsystem that's a good fit for your business needs and your budget.
