Best Practices for Managing Your EC2 Snapshots on AWS Cloud
Aside from third-party solutions, snapshots are the best option for backing up your EC2 virtual machines, says Brien Posey, who explains how managing the process is something of an art form.
Unless you are using a third-party backup solution, snapshots are the best option for backing up your EC2 virtual machines. Although Amazon makes it incredibly easy to create EC2 snapshots, managing snapshots is something of an art form. Create or retain too few snapshots and your instances could be left unprotected. Create and maintain too many snapshots, and you will end up incurring needless costs. Additionally, there are various logistical issues to consider.
One of the first things that I recommend doing is to decide whether you want to create snapshots on a per-volume or on a per-instance basis. AWS allows you to do either, and if you create multiple lifecycle policies you can use a mixture of instance and volume snapshots. Personally, I prefer to create snapshots on a per-instance basis, because that eliminates the hassles of managing individual volumes. Even so, there are definitely use cases that warrant the creation of volume-level snapshots.
Another best practice is to take some time to figure out what resources need to be protected through snapshots. If you have a significant number of virtual machine instances, you will probably find that some instances need frequent protection, while other instances can be backed up less often. Transient instances typically don't need to be backed up at all.
Once you have a feel for what needs to be backed up and when, the next thing that I recommend doing is to create a tagging taxonomy that can be used in implementing your backup plan. When you use the EC2 Data Lifecycle Manager to create a snapshot lifecycle policy, you will have to use tagging as a way of associating that policy with the various instances that you want to protect.
For those who might not be familiar with tagging, tags are key/value pairs, and can be applied to nearly any object that you might choose to create in the AWS cloud. The keys and values that you associate with a data lifecycle policy should be designed to mesh with any other tags that you are already using within AWS. If you're not yet using tags, however, you might consider creating a tag called Backup. From there, you can associate values with the Backup tag that reflect the frequency with which a given instance should be backed up. For example, you might use values such as Daily, Hourly, or Never. Of course, you are free to use your own values; you don't have to use the value names that I have suggested.
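A tag like this can also be applied from a script rather than the console. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The helper names and the set of allowed values are illustrative choices based on the example taxonomy above, not anything AWS prescribes.

```python
# Example taxonomy from the text; substitute your own values as needed.
ALLOWED_BACKUP_VALUES = {"Daily", "Hourly", "Never"}


def backup_tag(value):
    """Build the Backup tag in the list-of-dicts shape used by EC2 tagging calls."""
    if value not in ALLOWED_BACKUP_VALUES:
        raise ValueError(f"unknown Backup value: {value!r}")
    return [{"Key": "Backup", "Value": value}]


def tag_instance(instance_id, value):
    """Apply the Backup tag to an existing instance.

    Requires boto3 and valid AWS credentials; instance_id is supplied by the caller.
    """
    import boto3  # imported lazily so backup_tag() works without the SDK

    ec2 = boto3.client("ec2")
    ec2.create_tags(Resources=[instance_id], Tags=backup_tag(value))
```

Validating the value before tagging is worthwhile because tag keys and values are case sensitive, so an instance tagged Backup: daily would not match a policy that targets Backup: Daily.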
To give you a more concrete example of how this works, take a look at Figure 1. In this figure, you can see step No. 5 of the virtual machine instance creation process. As you can see in the figure, I have defined a tag called Backup, and I am assigning that tag a value of Daily. Incidentally, you aren't limited to assigning tags during the instance creation process. You can always go back and add a tag to an instance later on.
Now, take a look at Figure 2. This figure shows the screen used to create a snapshot lifecycle policy. As you can see in the figure, I have begun creating a policy that I am calling Daily Backup. This particular policy is targeting instances with the Backup: Daily tag. I have also configured a policy schedule that allows one snapshot to be created each day.
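For readers who prefer scripting to the console, a comparable policy can be sketched with the boto3 DLM client. This is an illustration under stated assumptions rather than a reproduction of the policy in the figure: the 03:00 UTC start time and the retention count of five are placeholders I have chosen, and the execution role ARN must be one that you supply.

```python
def daily_policy_details(tag_key="Backup", tag_value="Daily", retain_count=5):
    """Build the PolicyDetails structure for a once-a-day instance snapshot policy."""
    return {
        "ResourceTypes": ["INSTANCE"],  # use ["VOLUME"] for per-volume snapshots
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [
            {
                "Name": "Daily Backup",
                # One snapshot every 24 hours; the start time (UTC) is an assumption.
                "CreateRule": {
                    "Interval": 24,
                    "IntervalUnit": "HOURS",
                    "Times": ["03:00"],
                },
                # Retention count is an assumed placeholder.
                "RetainRule": {"Count": retain_count},
                "CopyTags": True,
            }
        ],
    }


def create_daily_policy(execution_role_arn):
    """Create the policy in AWS and return its ID (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    dlm = boto3.client("dlm")
    response = dlm.create_lifecycle_policy(
        ExecutionRoleArn=execution_role_arn,
        Description="Daily Backup",
        State="ENABLED",
        PolicyDetails=daily_policy_details(),
    )
    return response["PolicyId"]
```

Keeping the policy details in a separate function makes it easy to inspect or review the structure before anything is sent to AWS.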
The nice thing about the way that this process works is that you don't have to worry about adding new EC2 instances to a backup job. All you have to do is apply the appropriate tag to new instances as they are being created. The snapshot lifecycle policy will recognize the tag, and back up the virtual machine instance accordingly.
One last thing that I want to mention is that when you create a snapshot lifecycle policy, you are given the opportunity to define a retention type. This allows you to control the number of previous snapshots that are retained. The Data Lifecycle Manager is then able to automatically delete old snapshots.
You can set the retention type to either Count or Age. Setting the retention type to Count allows you to retain a specific number of snapshots. For example, you might want to keep the five most recent snapshots of each virtual machine instance. Conversely, setting the retention type to Age allows you to automatically purge snapshots once they reach a certain age and are no longer of use to you. For instance, you might choose to automatically delete any snapshot that is more than a month old.
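The difference between the two retention types is easiest to see with a small simulation. The sketch below uses plain Python to show which snapshots each rule would purge. It illustrates the logic only, and is not a description of how AWS implements retention internally; the function names are my own.

```python
from datetime import datetime, timedelta


def expired_by_count(snapshot_times, keep):
    """Return the snapshots that fall outside the `keep` most recent ones."""
    newest_first = sorted(snapshot_times, reverse=True)
    return newest_first[keep:]


def expired_by_age(snapshot_times, max_age, now):
    """Return the snapshots that are older than `max_age` as of `now`."""
    return [taken for taken in snapshot_times if now - taken > max_age]


# Six snapshots taken 0, 1, 2, 10, 20, and 40 days before a reference date.
now = datetime(2024, 1, 31)
snapshots = [now - timedelta(days=d) for d in (0, 1, 2, 10, 20, 40)]

# Count-based: keep the five most recent, so only the oldest is purged.
print(expired_by_count(snapshots, keep=5))

# Age-based: purge anything more than 30 days old.
print(expired_by_age(snapshots, timedelta(days=30), now))
```

With this particular data both rules purge the same 40-day-old snapshot, but they diverge as soon as the schedule changes. A count-based rule keeps a fixed amount of storage in play regardless of how often snapshots are taken, while an age-based rule keeps a fixed window of history.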
Regardless of how you choose to structure your snapshot lifecycle policy, the important thing is to make sure that you are using snapshots (or an alternative mechanism) to protect your EC2 virtual machine instances.
Brien Posey is a 16-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.