Rogue VMs -- Be Very Afraid
Unmanaged virtual machines have become an increasing threat to many organizations. Is your infrastructure safe?
I often get asked about the threat of rogue virtual machines (VMs) to production virtual infrastructures, and from my experience, the greatest source of rogue VMs has long been user desktop systems.
If you're not familiar with the term, a rogue VM is essentially an unmanaged virtual machine connected to the organization's network.
In essence, a rogue VM is no different from a rogue physical server. The difference is that while many organizations have policies in place to prevent rogue physical servers from entering the IT infrastructure, they either have no comparable policy for VMs or fail to enforce existing security policies that would prevent users from connecting rogue VMs to the LAN.
How does it happen? Typically, rogue VMs appear on user desktop systems more than anywhere else. In many organizations, users are allowed to run any of the following virtualization applications locally on their desktops:
- Microsoft Virtual PC
- VMware Player
- VMware Server
- VMware Workstation
Once a user has a virtualization application installed, he's free to download pre-configured virtual machines from the Internet or to copy VMs to his system using removable media such as a USB thumb drive. Users enjoy the freedom to run VMs locally because virtualization makes it easier to conduct tests and train on new systems. That's great, but problems arise when users connect unpatched or unmanaged VMs directly to the company LAN. Keep in mind that most downloadable virtual appliances are not at the most recent OS and application patch levels, and very few ship with any anti-virus protection installed. To relate the problem to the physical world, imagine a policy that allowed users to build white box servers at home, bring them to work and connect them directly to the LAN.
Granted, traditional layer 2 network isolation methods such as VLANs will help, but alone they aren't enough. VMs are typically connected to the LAN using one of two methods:
- Bridged networking
- Network address translation (NAT)
Physically and logically, both connection methods appear the same, as shown in Figure 1.
Figure 1. Virtual machines are normally connected to the LAN via bridged networking or NAT.
A VM that connects directly to the LAN via bridged networking has its MAC address and IP address fully visible on the LAN. Bridged networking gets its name from the fact that, when enabled, the physical network interface card (NIC) behaves like a layer 2 bridge, forwarding traffic from the VMs to the LAN unaltered. Organizations with sufficient layer 2 security mechanisms in place, such as VLANs (or MAC filtering, a less commonly used alternative), will be able to prevent rogue systems from accessing LAN resources.
Security isn't as straightforward if a user connects VMs to the LAN with NAT, which allows a VM to reach the LAN through the host system's physical network interface. NATed VM traffic appears on the LAN as originating from the physical host, and hence assumes the host's MAC address and IP address. The virtualization application's NAT driver performs all of the necessary address translation, and also offers DNS and DHCP proxy services. So with NAT, it's very easy for an unmanaged system to access resources on the company LAN.
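The translation described above can be sketched in a few lines. This is a toy model of the concept only, not VMware's actual NAT driver; the addresses and port range are invented for illustration:

```python
# Toy model of how a NAT driver rewrites outbound VM traffic.
# Addresses and port numbers here are illustrative assumptions,
# not VMware's actual implementation details.

HOST_IP = "10.0.0.5"  # the only address the LAN ever sees


class NatTable:
    def __init__(self, host_ip):
        self.host_ip = host_ip
        self.next_port = 40000   # ephemeral ports allocated on the host side
        self.mappings = {}       # (vm_ip, vm_port) -> host_port

    def translate_outbound(self, vm_ip, vm_port):
        """Rewrite a VM's source address to the host's address."""
        key = (vm_ip, vm_port)
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        # The packet leaves the host carrying the host's IP and MAC,
        # so layer 2 controls on the LAN only ever see the managed host.
        return (self.host_ip, self.mappings[key])


nat = NatTable(HOST_IP)
src = nat.translate_outbound("192.168.80.10", 51515)  # a guest VM's packet
print(src)  # ('10.0.0.5', 40000)
```

The point of the sketch is the security consequence: because every outbound packet is rewritten to carry the host's addresses, MAC filtering and VLAN assignment on the LAN side cannot distinguish the rogue VM from the managed host it runs on.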
Why is that a big deal? Consider the following potential headaches that rogue VMs could cause:
- A test DHCP server accidentally connected to the LAN could assign bad IP addresses to DHCP clients on the LAN, effectively disconnecting the clients from LAN resources.
- A user with administrative rights in a VM's guest OS could install malicious software for the purpose of scanning network resources, collecting information or launching internal attacks.
- An unpatched system exposed to the LAN could introduce malware to the network. Many of us suffered from the SQL Slammer worm, which spared only those SQL Server systems at current patch levels.
If you see rogue VMs on user systems as a threat to your organization, there are a few things you can do about it:
- Move all test and training VMs to managed servers and prevent users from running virtualization software locally on their systems.
- Disable the virtual network interfaces that allow VMs to connect to the LAN through their host system.
- Aggressively audit systems for the presence of rogue VMs.
Ideally, testing and training VMs should be centrally managed in the data center. This allows the IT staff to easily track VMs and patch them as needed. In addition, compliance auditing is much easier; tracking licensing compliance, for example, isn't an easy task when the types of VMs users are running on their systems are an unknown quantity. Users will also see a benefit from server-based lab and training-system hosting. For example, with tools like VMware Lab Manager and VMLogix LabManager, users can create snapshots of running virtual environments and share environments with other users. All that's needed to remotely access a virtual lab is a Web browser, so user access setup is easy. Browser-based access also allows users to potentially access their virtual lab environments from anywhere.
In some cases, investing in additional servers, storage, virtualization and lab management software may not be an option, and users will need to continue to run VMs locally on their systems. In such cases, you still have options for preventing those VMs from accessing the network.
As a first step to prevent VMs from accessing the physical LAN, disable both the VMware Bridge Protocol and the Virtual Machine Network Services on all client network interfaces (which of the two is present depends on the installed virtualization software). Disabling the VMware Bridge Protocol, for example, turns off bridged networking support, preventing VMs from using bridged networking to connect to the physical LAN.
I also recommend disabling the following two virtual network interfaces installed by VMware on physical host OSes:
- VMware Virtual Ethernet Adapter for VMnet1
- VMware Virtual Ethernet Adapter for VMnet8
Disabling VMnet8 prevents VMs from using NAT to communicate with the physical LAN. Disabling VMnet1 prevents VMs from communicating with the physical host system via the VMnet1 host-only virtual network interface. At that point, VMs can still communicate with each other, but have no network connectivity to the physical world.
Auditing Your User Systems
In addition to locking down user systems to prevent them from connecting unmanaged VMs to the LAN, I also recommend aggressively auditing user systems for the existence of rogue VMs. There are a few ways you can do this. One method is with my domainvhdaudit.vbs and vhdaudit.vbs scripts, which can be found here. Domainvhdaudit.vbs searches all computers in an Active Directory domain and generates a list of virtual hard disk files (such as .VMDK and .VHD) found on each computer. In addition, the script looks for large files and by default displays any file larger than 800MB; the minimum file size the script will list is configurable. The vhdaudit.vbs script provides the same output information as domainvhdaudit.vbs, except that it runs locally on, and collects information from, a single computer.
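The core of the single-machine audit is simple enough to sketch. The following is a rough Python equivalent of the vhdaudit.vbs idea, not the script itself: it walks a directory tree and reports virtual disk files plus any unusually large files. The extension list, the 800MB default and the function name are my own choices here:

```python
# Sketch of a local virtual hard disk audit: flag known virtual disk
# extensions and any file over a size threshold (800MB by default,
# mirroring the behavior described for vhdaudit.vbs).
import os

VHD_EXTENSIONS = {".vmdk", ".vhd"}


def audit_virtual_disks(root, min_large_bytes=800 * 1024 * 1024):
    """Return (path, size) pairs for virtual disks and large files under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if ext in VHD_EXTENSIONS or size >= min_large_bytes:
                hits.append((path, size))
    return hits


for path, size in audit_virtual_disks(r"C:\Users"):  # example scan root
    print(f"{size:>14,}  {path}")
```

Scheduled across user desktops (or pointed at administrative shares), a report like this makes VMs copied in via USB drives or downloads hard to hide, since the multi-gigabyte disk files are conspicuous even when renamed.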
Some third-party vendors have begun to offer auditing tools capable of auditing and tracking VMs across both client and server systems, with FortiSphere being the first. I'm expecting other vendors to follow suit and begin to offer similar solutions this year.
In my line of work, I get to talk to plenty of virtualization architects, managers and administrators. While some IT folks have very good practices already in place for dealing with rogue VMs, others are just now starting to give the issue some thought. If you're part of the latter, hopefully you'll find some of my suggestions helpful.
Chris Wolf is VMware's CTO, Global Field and Industry.