In-Depth
Managing Your Virtual Network
Monitoring, migration and quality of service are some of the major challenges waiting in the world of virtual networking.
Over the past few years, the exponential growth in the popularity of server virtualization has forever changed the way that IT operates. What made server virtualization so successful? Virtualization software creates a layer of abstraction between the hardware and the operating system. As such, organizations no longer have to purchase expensive, special-purpose server hardware. Instead, they can invest in low-cost hardware that can be treated as a pool of resources, and then assign those resources to the individual servers on an as-needed basis.
By decoupling the OS from the hardware, virtualization drives down hardware costs and provides organizations with an unprecedented degree of flexibility. Today, many organizations are beginning to realize that these same concepts can be used to create virtualized networks. Network virtualization effectively disconnects the network from the underlying, location-specific network hardware.
Of course, virtual networks are nothing new. Virtual LANs (VLANs) and virtual private networks (VPNs), for example, have been around for at least a decade. Although these technologies are still in use today, modern virtual network solutions tend to be more granular and far more flexible.
As network virtualization continues to evolve, it's becoming apparent that virtual networks present certain challenges that are simply not an issue on purely physical networks. It's these challenges that are shaping the latest virtualization trends.
Monitoring Difficulties
One of the most widespread issues in regard to virtual networking involves the difficulties of managing and monitoring network segments that are purely virtual. There are a couple of different factors contributing to this difficulty.
One such factor is that everyone wants to get the maximum ROI from their server hardware, and conventional wisdom tells us that the way to do this is to host as many virtual machines (VMs) as possible on that hardware. That being the case, server manufacturers are designing servers that can accommodate numerous VMs. IBM Corp., for example, recently launched its POWER7 processor, which has eight cores. When used in a 4-CPU configuration -- which will be the case with the forthcoming Power 750 server -- the processor will provide a total of 32 cores capable of processing 128 simultaneous threads. Estimates of how many VMs the IBM Power 750 will be able to host vary widely, but even the most conservative network administrator should be able to run at least a couple dozen VMs on the box.
There's no denying that a couple dozen VMs can produce a lot of network traffic. Prior to the proliferation of VMs in the data center, it was common practice to create a dedicated backbone segment between servers. That backbone helped conserve network bandwidth by isolating inter-server communications from the rest of the network.
Today, this same concept is being applied in the virtual data center. When a host contains large numbers of VMs, there's little question that those VMs will need to communicate with each other to at least some degree. Because all of the VMs are hosted on a single physical server, it seems silly for the host to send packets out to the physical network, when those packets will end up right back where they started (although they'll ultimately be destined for a different VM). Instead, packets can be sent directly between VMs by way of a virtual switch. Herein lies the problem.
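To picture why this matters, here's a toy sketch in Python -- not anything VMware ships, just an illustration -- of how a virtual switch decides whether a frame ever reaches the wire. The MAC addresses and VM names are made up.

```python
# Illustrative sketch only: a toy virtual switch showing why traffic between
# co-resident VMs never reaches the physical network.

class ToyVirtualSwitch:
    def __init__(self):
        self.local_ports = {}    # MAC address -> name of a co-resident VM
        self.uplink_frames = []  # frames handed off to the physical NIC

    def connect(self, mac, vm_name):
        self.local_ports[mac] = vm_name

    def forward(self, src_mac, dst_mac, payload):
        if dst_mac in self.local_ports:
            # The destination VM lives on the same host, so the frame is
            # copied between VM buffers and never touches the wire -- which
            # is exactly why a physical IDS or sniffer never sees it.
            return f"delivered internally to {self.local_ports[dst_mac]}"
        # Otherwise the frame leaves through the host's uplink adapter.
        self.uplink_frames.append((src_mac, dst_mac, payload))
        return "sent to physical uplink"


vswitch = ToyVirtualSwitch()
vswitch.connect("00:50:56:aa:00:01", "web-vm")
vswitch.connect("00:50:56:aa:00:02", "db-vm")

print(vswitch.forward("00:50:56:aa:00:01", "00:50:56:aa:00:02", "SQL query"))
print(vswitch.forward("00:50:56:aa:00:01", "00:0c:29:bb:00:09", "external request"))
```

Only the second frame ever has a chance of being seen by monitoring tools sitting on the physical network.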
Virtual servers are subject to exactly the same types of security threats as physical servers are. However, when certain types of exploits occur on a physical network, they'll usually be detected by an intrusion detection system (IDS). But an IDS is completely ineffective if the traffic is isolated to the virtual network, because the offending packets never pass through the IDS.
Just as it's difficult to detect malicious traffic flowing across virtual networks, it's also difficult to monitor network traffic. Packet sniffers simply can't see traffic flowing across an isolated virtual network segment.
There are a couple of different approaches to dealing with these types of issues. Some organizations configure one VM on each host to act as a security appliance. This VM's job is to monitor traffic flowing across virtual networks, and to take action if any malicious packets are detected.
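As a rough illustration of the idea -- not any vendor's product -- the monitoring VM essentially receives copies of the frames crossing the virtual switch and checks them against a rule set. The signatures and traffic below are invented for the sake of the sketch.

```python
# Minimal sketch of a "security appliance VM": inspect mirrored virtual-switch
# traffic and flag anything matching a simple signature list. Real products
# are far more sophisticated; these patterns are purely illustrative.

SIGNATURES = [b"' OR 1=1", b"/etc/passwd", b"\x90" * 16]  # hypothetical patterns

def inspect(frame_payload: bytes) -> bool:
    """Return True if the payload matches a known-bad signature."""
    return any(sig in frame_payload for sig in SIGNATURES)

mirrored_traffic = [
    b"GET /index.html HTTP/1.1",
    b"GET /login?user=admin' OR 1=1-- HTTP/1.1",
]

for frame in mirrored_traffic:
    if inspect(frame):
        print("ALERT: suspicious frame on virtual segment:", frame[:40])
```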
Another approach to dealing with isolated virtual network segments is to use security software that's aware of your virtualization platform. VMware Inc., for example, allows security vendors to integrate their wares into vSphere through a technology called VMsafe.
VM Migration Issues
Prior to the widespread use of server virtualization, hardware maintenance was a major issue for many organizations. Most types of hardware upgrades or repairs require the server to be taken offline, but downtime is unacceptable for some applications. Clustering helped with this problem to some degree, but clustering increases costs and complexity, and there are some applications that can't be clustered.
In many ways server virtualization is an answer to prayers, because the technology allows a VM to be moved to a different host whenever the current host needs to be taken offline for maintenance. In spite of the very welcome benefits that VM mobility offers, though, changing a VM's location can create a whole slew of new issues that must be dealt with.
One of the more prominent issues involves the inability of other network nodes to locate the VM once it has been moved. Post-move communications with a VM aren't usually an issue if the VM is relocated to another host that resides on the same network segment as the previous host. However, if the new host is located on a different network segment, then the VM's IP address may not match the subnet used by the rest of the segment's nodes, resulting in an inability to communicate with the VM.
One common solution to this problem is to create a VLAN. VMware ESX allows virtual switches to be segmented into VLANs. By doing so, it's possible to ensure communications with all virtual servers regardless of their physical location on the network.
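The underlying mismatch is easy to demonstrate with Python's standard ipaddress module; the addresses below are placeholders.

```python
# After a migration, does the VM's (unchanged) IP address still belong to the
# subnet in use on the destination host's segment?

import ipaddress

vm_ip = ipaddress.ip_address("192.168.10.25")        # the VM keeps this address
old_segment = ipaddress.ip_network("192.168.10.0/24")
new_segment = ipaddress.ip_network("192.168.20.0/24")

print(vm_ip in old_segment)  # True  -- reachable before the move
print(vm_ip in new_segment)  # False -- unreachable after the move, unless the
                             # VLAN carrying 192.168.10.0/24 is extended to the
                             # destination host's vSwitch port group
```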
Another major issue surrounding VM migrations is that it can be tough to keep track of where VMs currently reside. Suppose, for instance, that a port on a network switch fails. In the past, a port failure would typically cause a single device to become inaccessible. In a virtual data center, though, the server that's connected to the failed port might be hosting multiple VMs.
In a situation like this, the helpdesk phones would likely begin ringing off the hook with users reporting a number of seemingly random problems. The key to making sense of the helpdesk calls and resolving the issue in a timely fashion is to know that all of the affected VMs reside on a single host, and are bound to a common network adapter.
At one time, network administrators could keep track of servers by using an Excel spreadsheet. Today, the highly dynamic nature of virtual data centers makes manually keeping track of VMs nearly impossible. Therefore, it's critical for organizations to invest in management software that can keep track of each VM's location in real time.
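As a sketch of what such tooling does under the hood, the following script uses the open source pyVmomi SDK to ask vCenter which host each VM currently lives on. This is my own illustration, not a particular product's approach, and the vCenter address and credentials are placeholders.

```python
# List every VM and the host it currently runs on, via the vSphere API.
# Requires the pyVmomi package; the connection details are placeholders.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certs in production
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        host = vm.runtime.host.name if vm.runtime.host else "unplaced"
        print(f"{vm.name:<30} -> {host}")
    view.Destroy()
finally:
    Disconnect(si)
```

Running something like this on a schedule (or, better, subscribing to migration events) gives the helpdesk the "which host is that VM on right now?" answer that a spreadsheet can no longer provide.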
Network Bottlenecks
Another serious problem involving virtual networking is that sometimes virtual servers have to communicate with the physical network. This is accomplished by making use of one of the host server's network adapters.
For a long time, physical network communications for VMs weren't much of an issue. After all, first-generation VM deployments rarely consolidated more than a handful of servers onto a single host, and a single 4-port Ethernet adapter could usually handle the demands of such a deployment.
Today, a network adapter can easily become a bottleneck. Imagine, for instance, that you have a server hosting 24 VMs. Assuming that the server has three expansion slots, you could install three 4-port Ethernet cards, giving you a total of 12 Ethernet ports.
In some situations, sharing 12 physical network ports among 24 VMs would work out well, as it gives the server a port for every two VMs. Assuming none of the servers are running bandwidth-intensive apps, this should be more than adequate.
So what's the problem with this type of configuration? In some cases it may violate documented best practices. This is especially true if you're running VMware ESX. VMware recommends that you dedicate a 1Gbps Ethernet adapter to each VM. If you were to strictly adhere to this recommendation, the server would only be able to host 11 VMs (reserving one Ethernet port for the host), even though it has sufficient CPU resources to host far more.
In the situation I described, the lack of network adapters directly impacts the number of VMs the server can host, which in turn has a direct impact on your ROI for the server. Imagine telling your boss that you won't be able to use the server to its full potential because you can't install enough network adapters to facilitate the number of VMs you'd planned to host.
One solution to this problem is to install a 10Gbps network interface card in the server. You can regulate bandwidth consumption by creating a vSwitch -- VMware-speak for a virtual network switch -- and then limiting bandwidth at the switch level. This forces the VMs to treat your 10Gbps network connection as a series of 1Gbps connections.
Assuming the server in question has three 10Gbps ports available, this solution would provide 30Gbps of bandwidth, which should be more than adequate for servicing the two dozen VMs. Even if the server only had one free port, this approach could still be a viable option.
The vSwitch approach makes each VM treat the 10Gbps port as if it were a 1Gbps port. If each VM needs 1Gbps of bandwidth available to it, it would be easy to assume this approach limits the server to a maximum of 10 VMs. Keep in mind, though, that we're using a virtual switch.
The virtual switch treats each VM as if it had a 1Gbps network adapter installed, but the switch is actually connected to a 10Gbps adapter. Because it's unlikely that all of the VMs will saturate their available bandwidth at the same time, the odds are good that the 10Gbps connection will be more than adequate to service all of the VMs. Of course, every deployment is different, so benchmarking and capacity planning are critical if you're considering this approach.
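A quick back-of-the-envelope simulation, using invented per-VM utilization figures, shows why the math usually works out in practice.

```python
# Rough check of the oversubscription argument: 24 VMs, each capped at 1Gbps
# by the vSwitch, sharing a 10Gbps uplink. Utilization figures are made up.

import random

random.seed(1)
LINK_GBPS = 10.0
NUM_VMS = 24

samples = []
for _ in range(1000):  # 1,000 simulated measurement intervals
    # Hypothetical average utilization per VM: 2% to 15% of its 1Gbps cap.
    demand = sum(random.uniform(0.02, 0.15) for _ in range(NUM_VMS))
    samples.append(demand)

print(f"Peak aggregate demand: {max(samples):.2f} Gbps of {LINK_GBPS} Gbps available")
```

Swap in measured utilization numbers from your own environment before drawing any conclusions; that's exactly what the benchmarking step is for.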
The QoS Approach
Last year I spoke at a virtualization conference, and after my session an attendee explained that his organization was about to begin consolidating some of its servers. He went on to ask me what I thought about using the Windows Quality of Service (QoS) feature to prevent any single VM from consuming an excessive amount of the server's available network bandwidth.
At first, setting a QoS policy for each VM sounds like an ideal solution. After all, QoS is designed so that a server can reserve bandwidth, and so that bandwidth can be prioritized for various applications. Any bandwidth that has been reserved is available on demand, but if a server isn't using the bandwidth it has reserved, then that bandwidth is freely available to other network nodes.
The problem with using QoS to shape traffic for VMs is that the VMs don't make direct use of the network card. Instead, I/O requests are passed to a virtual switch that redirects the I/O requests through either the host OS or through the hypervisor (depending on your configuration). The point is, the individual VMs have an incomplete picture of how the server's networking hardware is being used. Unless the virtualization platform you're using is QoS-aware, QoS won't be an effective mechanism for shaping VM traffic.
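A simple way to see the blind spot is to compare what each guest observes on its own virtual NIC with what the hypervisor sees on the shared physical adapter; the numbers below are invented.

```python
# Why guest-level QoS falls short: each VM only sees its own virtual NIC,
# while the virtual switch sees the shared physical adapter.

per_vm_tx_gbps = {"vm01": 0.9, "vm02": 0.8, "vm03": 0.7}  # each guest's local view
physical_nic_capacity_gbps = 1.0                          # shared uplink

# Inside any single guest, QoS sees traffic under its own limit and happily
# admits more -- it has no idea what the other VMs are sending.
for vm, gbps in per_vm_tx_gbps.items():
    print(f"{vm}: local view {gbps} Gbps -- under its own limit")

# Only the hypervisor-level view reveals the contention on the uplink.
aggregate = sum(per_vm_tx_gbps.values())
print(f"Host view: {aggregate:.1f} Gbps offered to a "
      f"{physical_nic_capacity_gbps} Gbps adapter")
```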
As you can see, there are both benefits and challenges that are unique to virtual networks. Fortunately, virtual networks are beginning to mature, and there are viable solutions to the various inherent challenges.