First Impressions: Marathon everRun HA
High availability for Windows Server gets an assist through a virtual copy scheme.
- By Peter Varhol
For many enterprises, application and server uptime is essential.
In many cases, downtime means lost revenue, lost opportunities and lost customers.
But uptime can be expensive. It typically means special hardware with redundant power supplies, processors, memory, storage (typically RAID) and other features. It often requires modifications to or special versions of the operating system, and some knowledge by the application of the underlying availability scheme.
The everRun HA solution from Marathon Technologies Inc. offers a unique approach to high availability. It combines virtual images with technology that enables two appropriately configured servers to run the same application at almost exactly the same memory and processor state. The real advantage of this approach is that it uses standard hardware and unmodified versions of Windows Server, rather than more expensive custom hardware and software. It does require the installation of a separate Ethernet connection between the machines.
The Two Shall Become One
Consider two servers in a production IT environment. These servers are each running identical software configurations, including the OS and applications. In fact, the disk images are identical. The hardware configurations may be similar, but there's no requirement that they be exactly the same. In addition to the standard network interface, there's also a private Ethernet connection directly between the two machines.
From an external point of view, these servers are performing identical tasks, running the same application image. One of these servers -- the primary -- is running the applications in real time. The other is the secondary server. All disk writes are mapped to both servers. If there's a device failure on the primary server, everRun copies the contents of memory and the processor state over to the secondary server using the private Ethernet connection. Together, these two servers make up a single virtual server, which is what the application user sees. The user logs on to one application, and reads and writes data located in one place. everRun handles the simultaneous access to both physical servers.
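To make the mirrored-write idea concrete, here is a minimal Python sketch of one logical server backed by two physical ones. It is an illustration only, not Marathon's implementation: the `Server` and `VirtualServer` classes, the node names and the block-level dictionary "disk" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for one physical server in the pair."""
    name: str
    disk: dict = field(default_factory=dict)  # block number -> data


class VirtualServer:
    """Illustrative model: one logical server backed by two physical ones."""

    def __init__(self, primary: Server, secondary: Server):
        self.primary = primary
        self.secondary = secondary

    def write(self, block: int, data: bytes) -> None:
        # Every write is applied to both machines so the disk images stay identical.
        self.primary.disk[block] = data
        self.secondary.disk[block] = data

    def read(self, block: int) -> bytes:
        # The caller sees a single server; reads come from the current primary.
        return self.primary.disk[block]

    def images_identical(self) -> bool:
        return self.primary.disk == self.secondary.disk


# The user talks to one virtual server and never addresses a physical node.
pair = VirtualServer(Server("node-a"), Server("node-b"))
pair.write(0, b"order #1001")
pair.write(1, b"order #1002")
```

Because both disks receive every write, either machine can serve the same data if the other one fails, which is the property the rest of the design depends on.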
I didn't get the chance to literally pull the plug on the primary server, but Marathon tells me it will still work in that manner. Unless you're on a UPS, you'll almost certainly lose the memory and processor state in that case, but your disk image and database state will remain identical.
Marathon says that Microsoft considers this configuration a single installation of Windows Server 2003 because it is in effect the same image, doing the same thing. This means you need only a single license of Windows Server in order to run the two identical images.
Because you're using largely off-the-shelf hardware, the only thing you have to install on standard servers is the private Ethernet connection between two servers. The rest is done by the software.
Using the everRun software is easy. You can log on and obtain a console that shows the details of the hardware configuration (see Figure 1). Each device appears twice, once for the primary server and once for the secondary. You can also display a graphic of the two servers with the connections between them, and the storage devices on each box. This window provides the status of each device.
From this console, you can take devices and even entire servers offline. For example, you can select a disk drive, mark it as inoperative, and watch the load immediately shift over to the other server, which has the same IP address. I tried this by pinging the server, and found only a somewhat higher round trip response time on the one or two packets in transit when the switch occurred. Despite the higher response time, no packets were lost, and the load had switched to use the secondary server as the primary, which could then be taken offline to initiate repairs.
If a storage or I/O device goes down, the console displays that information, but there's nothing the administrator has to do in order to engage a switchover. If the software detects a failure, the state will automatically be transferred to the secondary server, which then becomes the primary image. If the failure occurs on the secondary server, everRun takes no action regarding the image, but notifies the administrator. This happens without interruption for the end user, who will still have normal access to the server and all apps on that server.
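The switchover rule described here is simple enough to sketch in a few lines of Python. This is a hedged illustration of the policy as the review describes it, not everRun's actual logic; the `Pair` class, the node names and the `events` list standing in for the console are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pair:
    """Hypothetical model of the two-server pair and its console log."""
    primary: str
    secondary: str
    events: list = field(default_factory=list)  # stand-in for console messages

    def handle_failure(self, failed: str) -> str:
        """Apply the failover policy and return the name of the active primary."""
        if failed == self.primary:
            # Failure on the primary: roles swap, no administrator action needed.
            self.primary, self.secondary = self.secondary, self.primary
            self.events.append(f"failover: {self.primary} is now primary")
        elif failed == self.secondary:
            # Failure on the secondary: the running image is left alone,
            # and the administrator is only notified.
            self.events.append(f"notice: secondary {failed} reported a failure")
        else:
            raise ValueError(f"unknown server: {failed}")
        return self.primary


pair = Pair(primary="node-a", secondary="node-b")
after_secondary_fault = pair.handle_failure("node-b")
after_primary_fault = pair.handle_failure("node-a")
```

In both branches the virtual server keeps answering under the same identity, which is why the end user sees no interruption.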
The ability to switch processing transparently between primary and secondary servers has additional benefits. You can take the secondary server offline and apply OS, database or app patches, and test extensively to ensure that the patches work as advertised (see Figure 2). Then you can reconnect to the primary server and use the console to make the secondary the primary. Because the virtual images are now unsynchronized, the new primary copies its image over to the new secondary, and you've just done a successful, low-risk upgrade of the platform.
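The patch-and-promote sequence can be written out step by step. The Python sketch below is a simplified model under stated assumptions: `Node`, `rolling_patch` and the `tests_pass` callback are hypothetical names, an image is reduced to a version string, and the branch that handles a failed test (resynchronizing from the unpatched primary) is my assumption rather than something the review describes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical server record; the image is reduced to a version label."""
    name: str
    image_version: str
    online: bool = True


def rolling_patch(primary: Node, secondary: Node, new_version: str, tests_pass):
    """Patch the secondary offline, then promote it. Returns (primary, secondary)."""
    secondary.online = False                  # 1. take the secondary offline
    secondary.image_version = new_version     # 2. apply the patches
    if not tests_pass(secondary):             # 3. test the patched image
        # Assumption: on failure, discard the patch by resyncing from the primary.
        secondary.image_version = primary.image_version
        secondary.online = True
        return primary, secondary
    secondary.online = True                   # 4. reconnect to the primary
    primary, secondary = secondary, primary   # 5. make the patched node primary
    secondary.image_version = primary.image_version  # 6. new primary copies its image over
    return primary, secondary


# Successful upgrade: node-b is patched, promoted, and node-a is resynchronized.
new_primary, new_secondary = rolling_patch(
    Node("node-a", "v1"), Node("node-b", "v1"), "v2", lambda node: True
)

# Failed test run: roles are unchanged and both nodes stay on the old image.
kept_primary, kept_secondary = rolling_patch(
    Node("node-a", "v1"), Node("node-b", "v1"), "v2", lambda node: False
)
```

The point of the ordering is that the production image is never touched until the patched copy has passed its tests.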
Working with XenServer
By the time this appears in print, Marathon will also offer an alternative that does the same thing with virtual machines (VMs) running on the Citrix XenServer hypervisor. XenServer is the first VM platform Marathon is supporting, hoping to capitalize on the popularity of this open source solution. The company plans support for VMware and Hyper-V in the future. You can run two VMs in a high-availability configuration on a single box (although that doesn't give you any hardware redundancy), or multiple VMs connected via virtual images across two different servers. Again, the only requirement is the extra Ethernet connection.
Figure 1. The everRun console shows two physical servers that have been virtualized into one virtual machine. Two sets of devices -- hard drives, NICs, keyboards, etc. -- are shown in the console, but they appear on the network as a single server.
everRun HA works with standard and unmodified Windows Server 2003, running on standard off-the-shelf x86 Intel and AMD servers. It doesn't require special cluster-aware apps or any sort of shared storage, such as a SAN. You can run it on a simple and readily available IT infrastructure.
Figure 2. When one of the servers in the two-server configuration goes down, the other server takes over immediately, with no downtime for admins or end users.
Marathon also offers a full fault-tolerant solution, everRun FT. everRun FT requires identical servers and configurations, with identical images, and all processing on the two servers occurs in lock-step. Any part of the primary server could go down, and the private Ethernet connection would simply notify the secondary that the primary was no longer reliable. The disk image, memory and processor state of the secondary are already identical, so nothing else is required.
The everRun products provide a unique way to achieve availability and fault tolerance through the use of virtual images and hard-wired communications between servers. Their approach lets you use standard hardware, possibly even hardware you already have. The results are impressive in their performance and response time in case of a failure. If you're looking for high availability, this is a lower-cost option than many traditional solutions. And the technology is way cool, too.
Peter Varhol is a principal at Technology Strategy Research LLC, an industry analysis and consulting firm.