Disaster Recovery in the Cloud: Developing a Plan for Success
Backup and disaster recovery (BDR) in the cloud takes some new thinking, an understanding of the nuances that make a cloud plan different from what you know in the physical world, and a lot more testing to ensure a recovery goes right. First in a series.
Low capital costs, easy deployment and the instant-on nature of cloud services make the cloud an attractive target for easier disaster recovery, but can you take advantage of it for your services? Of course you can, but you need to be willing to plan. In certain ways, cloud DR isn't much different from a normal off-site DR location. You have to consider backups, how to bring machines online, networking, database replication and how to prioritize your applications. Here are some of the main considerations to keep in mind:
The Differences in Server Instances
Cloud server instances differ from physical servers in several ways. First, you are not comparing apples to apples when looking at performance and server capacity. The resources you get with a cloud server instance are estimates. Instead of being able to say, "We need another duplicate quad-core server with 32GB of RAM," you'll need to test enough to derive a good estimate of what you need in the cloud. That's because you will have to choose between vague descriptions, such as a small instance or a large instance, when sizing your servers.
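One way to turn test measurements into a sizing decision is sketched below. This is a minimal illustration, not any provider's method: the `CATALOG` entries, their sizes and the 25 percent headroom are all assumptions, and you would replace them with your provider's published instance types and your own measured peaks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    ram_gb: float


# Hypothetical catalog; real providers publish their own names and sizes.
CATALOG = [
    InstanceType("small", 1, 2),
    InstanceType("medium", 2, 8),
    InstanceType("large", 4, 16),
    InstanceType("xlarge", 8, 32),
]


def pick_instance(peak_vcpus: float, peak_ram_gb: float,
                  headroom: float = 0.25) -> Optional[InstanceType]:
    """Return the smallest catalog entry that covers measured peak load plus headroom."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_ram = peak_ram_gb * (1 + headroom)
    # Walk the catalog from smallest to largest and stop at the first fit.
    for inst in sorted(CATALOG, key=lambda i: (i.vcpus, i.ram_gb)):
        if inst.vcpus >= need_cpu and inst.ram_gb >= need_ram:
            return inst
    return None  # nothing in the catalog is big enough


if __name__ == "__main__":
    # A server that peaked at 2.5 cores and 10GB of RAM during load testing.
    print(pick_instance(2.5, 10))
```

A result of `None` is a signal in itself: the workload may need to be split across several instances rather than moved as a single server.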
The Application Infrastructure
You'll also need to consider how your applications will react when you stand them up in a cloud environment. For general cloud providers like Amazon and Rackspace, you also have to take into account the public nature of their cloud offerings and the specific support you can expect for those systems. For example, Amazon expects a modern service-oriented architecture implementation using Web services and n-tier applications. This kind of requirement will throw your older client/server applications for a loop, and those applications will not be supported in these environments.
As with any DR, replicating critical pieces of your infrastructure is important. For example, many internal applications rely on your authentication system. Consider how you can extend your infrastructure into the cloud. Several vendors are offering ways to extend your network into their cloud, so you can use your own IP addressing scheme, authentication, and security.
In addition, some vendors are finding other ways to extend your internal authentication. Microsoft's Active Directory Federation Services is one example: it lets you extend what is traditionally an internal authentication method out to the Internet, other private networks, or third-party systems that integrate with your Active Directory, including cloud services. Restoring an authentication infrastructure is rife with pitfalls, so keeping it replicated to your cloud is preferred, especially considering how essential it is to other systems.
How to Get Servers to the Cloud
Backup is another area that requires some thought. Traditional tape-based backup is unlikely to do much for you in the cloud, since the hands-on handling that tape requires won't be available through any cloud provider.
Many cloud providers have integrated virtualization platforms that allow you to snapshot your systems and ship them over the Internet to the cloud provider. Being able to spin up a fully configured system that was live in your datacenter is a plus, because it means fewer configuration issues than rebuilding servers at a traditional recovery site. This is why it pays to have your environment virtualized, including critical systems that may have resisted virtualization. Although you can convert physical systems, it's more difficult, and there is less and less reason to do so when you can virtualize those servers now and have them ready for an easy move to a hypervisor-compatible cloud DR solution.
What About the Data?
Critical data is another matter. You can't just take a snapshot of your active databases. Migrating virtual machines is better suited to servers that push Web pages or files. Transactional databases require special handling, and it's essential that you replicate the data to the cloud on a schedule that meets your recovery point objectives.
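To make the recovery point objective concrete, the sketch below compares how far the cloud replica is behind against the RPO you have agreed on. It assumes you can obtain a timestamp for the last transaction applied on the replica; how you get that timestamp depends on your database product, and the function names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def replication_lag(last_replicated_at: datetime,
                    now: Optional[datetime] = None) -> timedelta:
    """How far behind the replica is, measured from its last applied transaction."""
    now = now or datetime.now(timezone.utc)
    return now - last_replicated_at


def rpo_breached(last_replicated_at: datetime, rpo: timedelta,
                 now: Optional[datetime] = None) -> bool:
    """True when a failure right now would lose more data than the RPO allows."""
    return replication_lag(last_replicated_at, now) > rpo


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last_applied = now - timedelta(minutes=20)  # replica is 20 minutes behind
    print(rpo_breached(last_applied, timedelta(minutes=15), now))
```

Running a check like this on a schedule, and alerting when it returns `True`, tells you about a replication problem before a disaster does.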
If your critical databases are designed to run on a single database server, it's time to consider retooling your database tier. Most major database platforms have high-availability features that replicate data across distance, and you'll need to investigate these to ensure you don't suffer data integrity issues that could sink your DR effort. Without good data, there isn't much reason to recover those application servers in the first place.
Finally, don't forget your network needs. When you are working with a cloud vendor, you don't have the ability to just log into your switches and routers to make changes. You'll need to learn the management and configuration tools the cloud company provides. You may want to consider load balancers that are designed to send traffic to your cloud instances when it's time to redirect it, bringing further automation to your recovery.
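The redirect decision itself can be quite simple. The sketch below shows one possible rule, assuming a series of periodic health-check results for the primary site: fail over only after several consecutive failures, so a single missed check doesn't move traffic. The target names and the threshold of three are illustrative assumptions, and a real load balancer or DNS failover service would apply its own version of this logic.

```python
from typing import Sequence


def choose_target(health_history: Sequence[bool],
                  failure_threshold: int = 3,
                  primary: str = "primary",
                  dr: str = "cloud-dr") -> str:
    """Route to the DR site only when the most recent checks have all failed.

    health_history holds one boolean per health check, oldest first,
    where True means the primary site answered correctly.
    """
    if len(health_history) < failure_threshold:
        return primary  # not enough evidence yet to justify a failover
    recent = health_history[-failure_threshold:]
    return dr if not any(recent) else primary


if __name__ == "__main__":
    print(choose_target([True, False, False, False]))  # three failures in a row
    print(choose_target([False, False, True]))         # primary has recovered
```

Requiring consecutive failures trades a slightly slower failover for fewer false alarms, which matters when redirecting traffic also means promoting a replicated database.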
Next time, we look at how to prioritize your infrastructure for disaster recovery.
Eric Beehler currently has certifications from CompTIA (A+, N+, Server+) and Microsoft (MCITP: Enterprise Support Technician and Consumer Support Technician, MCTS: Windows Vista Configuration, MCDBA SQL Server 2000, MCSE+I Windows NT 4.0, MCSE Windows 2000 and MCSE Windows 2003). He has authored books and white papers, and co-hosts CS Techcast, a podcast aimed at IT professionals. He now provides consulting, managed services and training through his co-ownership in Consortio Services LLC.