Tape Storage: What's Old Is New Again
It can even help you survive the coming z-pocalypse.
With an impending z-pocalypse -- not the zombie one, the zettabyte one -- on the horizon, IT pros had better start boning up on their survival skills. Analysts are telling us that by 2020 or thereabouts, we will see upwards of 20 zettabytes of new data looking for a home. And it's going to take some serious rethinking of storage infrastructure and data management to get us through the challenge.
As I've argued in previous posts, there is no way that sufficient capacity will be forthcoming from the flash storage, disk storage or optical storage crowds to store all the bits. Only tape has the capacity, both in terms of media density and manufacturing capability, to shoulder the burden. That said, however, tape has its limitations.
For one thing, tape operations and management skills have been decimated by a widespread and misguided abandonment of the technology since the mid-1990s, when many analysts declared that tape was pushing up daisies. When SATA disks with perpendicular magnetic recording began appearing in terabyte capacities for not a lot of money, it appeared that disk-to-disk replication with deduplication was poised to eliminate tape backup altogether.
For another, tape is designed for linear (not rotational) read/write. That makes it good for storing data with low re-reference rates and extremely rare updates, like archival data, and for a few primary storage applications that involve writing a lot of data fast, like the results of a particle collision experiment. But it has never been a preferred storage medium for frequently accessed data with high rates of change. For this reason, you need to do some analysis to integrate tape into your workflow. That is hard to do given all the noise the flash and disk guys make about their products in these days of the Infrastruggle.
The Microsoft Seal of Approval
But tape's role as an archival medium is just the thing to have if you are to survive the z-pocalypse. Even Microsoft seems to think so. I recently heard a speaker from Redmond talking about clouds "generically" ("I am not allowed to discuss the storage architecture of Azure directly, so please do not misconstrue my words about storage to be a reflection of the strategy we are adopting at Microsoft Azure," blah, blah, blah…), and he laid out a case for using tape to store the data of the masses in the cloud. So far, Google has confirmed its use of tape; AWS has not yet done so publicly.
Tape-based archives do not just happen, however. In fact, archiving itself is not a simple matter. To make a good archive, you need four things:
- Organizing principles or policies that identify the data assets to be archived, the storage to which the archived data is to be directed, and the timeframe for migrating data into the archive.
- The archive storage infrastructure itself and the plumbing to move the data into the infrastructure.
- The software process that does the actual data movement and confirms that the movement has been made in a valid way.
- Procedures and automation to maintain the archive in a useable state.
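To make the first three requirements concrete, here is a toy sketch of a policy-driven select/move/verify pass. Everything in it is my own illustration, not any vendor's product: the 90-day access threshold, the directory layout and the checksum verification are all assumptions.

```python
import hashlib
import shutil
import time
from pathlib import Path

# Hypothetical policy: archive files untouched for more than 90 days.
AGE_THRESHOLD_DAYS = 90

def sha256(path: Path) -> str:
    """Checksum used to confirm a copy is valid before trusting it."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def archive_pass(source: Path, archive: Path) -> list:
    """One migration pass: select by policy, copy, then verify the copy."""
    cutoff = time.time() - AGE_THRESHOLD_DAYS * 86400
    migrated = []
    for f in source.rglob("*"):
        if not f.is_file() or f.stat().st_atime > cutoff:
            continue  # policy: leave recently accessed files in place
        dest = archive / f.relative_to(source)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, dest)
        if sha256(f) == sha256(dest):  # confirm the movement was valid
            migrated.append(dest)
        else:
            dest.unlink()  # failed verification; keep the original
    return migrated
```

A real archive engine would add the fourth requirement, ongoing maintenance (media refresh, periodic re-verification), on top of this loop.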
You usually have to cobble together those things yourself. Vendors want to sell you the infrastructure or the software or the plumbing to accomplish the movement and hosting of bits, but nobody really sells the whole shootin' match. One big step usually missing is data classification; you need to know what data assets to move into the archive. This is heavy lifting that is often left up to you.
There are some automation tools. For example, Spectra Logic has been helping folks in media and entertainment migrate video clips directly into archive by providing an agent for video editing applications, like AVID's non-linear editing system, that automates the movement of audio/video data assets from production storage to archive storage.
Their Black Pearl technology creates objects from the data, then stores the objects to tape using the Linear Tape File System (LTFS), which was developed by IBM and submitted to the Storage Networking Industry Association for standardization. LTFS extends your server file system, whatever it may be, to tape in a cool sleight of hand. Putting an object model on top of that capability provides a way to store massive numbers of small files in LTFS (which really seems to be optimized for long-block files) using object containers.
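Because LTFS favors large sequential files, the small-file problem is typically solved by bundling files into container objects before they hit tape. Here is a rough sketch of that idea using a plain tar file as a stand-in for the container format; this is illustrative only, not Spectra Logic's actual implementation.

```python
import tarfile
from pathlib import Path

def pack_container(files: list, container: Path) -> dict:
    """Bundle many small files into one sequential container object.
    Returns a simple name -> size index kept outside the container, so
    individual files can be located without scanning the whole bundle."""
    index = {}
    with tarfile.open(container, "w") as tar:
        for f in files:
            tar.add(f, arcname=f.name)
            index[f.name] = f.stat().st_size
    return index

def read_from_container(container: Path, name: str) -> bytes:
    """Retrieve one small file from the container without unpacking it all."""
    with tarfile.open(container, "r") as tar:
        member = tar.extractfile(name)
        return member.read()
```

The container is what gets written to tape as one long sequential object; the external index plays the role the object model plays in Black Pearl.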
Object Storage FTW
However, if you are not in media and entertainment, genetics mapping, surveillance video, or one of the many other industries with very well-defined workflows, automated intervention in the workflow may be a bit more challenging to achieve. For most companies, in fact, data is stored predominantly in the form of end-user files that are poorly described and mostly anonymous in terms of the business process they support.
Object storage is a solid solution for this problem. Caringo, for example, provides a technology called SWARM that enables adding metadata to all of those files and storing them in an object repository, where the drives they sit on can be powered down once the data is archived. Spectra Logic just announced another platform, ArcticBlue, which leverages shingled magnetic recording (SMR) disk media to provide cheap capacity with power-down modes that extend the useful life of the infrastructure. Using Black Pearl or other object storage with this technology might provide a decent platform for archive.
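Stripped of vendor specifics, the core idea is an object repository that pairs each anonymous file with a metadata record describing the business context the file itself lacks. A toy sketch follows; the class and its methods are hypothetical illustrations, not SWARM's API.

```python
import hashlib
import json
from pathlib import Path

class ObjectRepository:
    """Minimal content-addressed object store: each stored blob gets a
    companion metadata record describing its business context."""

    def __init__(self, root: Path):
        self.root = root
        root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes, metadata: dict) -> str:
        """Store a blob plus metadata; the object ID is the content hash."""
        oid = hashlib.sha256(data).hexdigest()
        (self.root / oid).write_bytes(data)
        (self.root / (oid + ".meta")).write_text(json.dumps(metadata))
        return oid

    def get(self, oid: str):
        """Retrieve a blob and its metadata by object ID."""
        data = (self.root / oid).read_bytes()
        meta = json.loads((self.root / (oid + ".meta")).read_text())
        return data, meta
```

With metadata attached at ingest, archive policy can later be driven by what the data *is*, not just how old it is.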
But these strategies do not offer the storage capacity that will be required by archives intended to stall the z-pocalypse. Only tape will do the job, say Microsoft and many others. FujiFilm has been making this case for years, and pioneered the tape-cloud model with its Dternity cloud, powered by StrongBox (developed by Crossroads Systems, FujiFilm's technical partner).
Dternity is out to provide archive capabilities for the rest of us who want either a standard file system or object storage solution to store all the data that has gone to sleep in our infrastructure. Whether you are Hollywood or the NSA or just a typical storage manager for a commercial business, a tape cloud solution provides a simple way to leverage tape technology.
FreeMyStorage.com is another Dternity innovation: a site where you can download a free tool that runs on your file storage infrastructure and identifies, in easy-to-read color charts, which files haven't been touched in 30, 60 or 90 days. That way you can model the potential impact of implementing an archive solution. No data from this analysis leaves your premises, so you can use it in complete confidence to determine the best approach to establishing an archive and migrating data to it.
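The underlying analysis is straightforward to picture: walk the file system and bucket files by last-access age. Here is a minimal sketch of the idea, not the actual FreeMyStorage tool; the bucket boundaries simply mirror the 30/60/90-day breakdown mentioned above.

```python
import time
from collections import Counter
from pathlib import Path

def age_report(root: Path, buckets=(30, 60, 90)) -> Counter:
    """Count files by how long ago they were last accessed.
    Runs entirely on local storage; nothing leaves the premises."""
    now = time.time()
    report = Counter()
    for f in root.rglob("*"):
        if not f.is_file():
            continue
        days = (now - f.stat().st_atime) / 86400
        label = "<30 days"
        for b in buckets:
            if days >= b:
                label = f">={b} days"  # falls into the oldest matching bucket
        report[label] += 1
    return report
```

The counts in the oldest buckets are your candidate archive population, which is exactly the impact-modeling exercise the tool supports.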
Of course, Dternity has some other tools for actually establishing policy-based data movement, and they offer tape platforms for storing archival data and a gateway appliance that can direct your archive both to local archival storage and to the Dternity tape-based cloud. They were among the first to seize on LTFS and to leverage tape as an extension to file system storage, a choice that has paid off in spades. Not only can they provide seamless integration between your primary storage and tape or tape cloud file systems, they also integrate directly with VMware and with NetApp filers.
On the VMware side, Crossroads has announced (in conjunction with Dternity) a version of its gateway appliance, called StrongBox VM, that runs as a virtual machine. So archiving files, or entire VMs that are rarely used but important to keep, to low-cost tape or a cloud service is easy-peasy. Crossroads also has several software packages for automating data migration based on metadata and user criteria.
On the NetApp front, Crossroads was among the first companies to take advantage of FPOLICY, which provides a back door for use in migrating older data stored on a filer into archival storage. Basically, when you use the StrongBox with NetApp, you gain unlimited storage behind the NetApp head. Comparable technology can be provided behind Windows File Servers as well.
Bottom line: courtesy of innovators like Dternity, Crossroads, Spectra Logic, Caringo and IBM, we may yet prevail against the z-pocalypse. But the time to start preparing is now.
Jon Toigo is a 30-year veteran of IT, and the Managing Partner of Toigo Partners International, an IT industry watchdog and consumer advocacy firm. He is also the chairman of the Data Management Institute, which focuses on the development of data management as a professional discipline. Toigo has written 15 books on business and IT and published more than 3,000 articles in the technology trade press. He is currently working on several book projects, including The Infrastruggle (for which this blog is named), which he is developing as a blook.