Storage Silos 101: What They Are and How To Manage Them
Trevor Pott addresses the issues surrounding storage silos.
If you attempt to research enterprise storage options today, you'll find a great deal of marketing about "storage silos." So just what exactly is a storage silo, why are silos supposedly a problem and what can be done to address them?
A storage silo is any storage that's isolated from the rest of your organization's storage. This is mostly a terrible analogy, because real silos are useful; they don't arbitrarily separate one thing from another. Because the basic concept of a storage silo is so broad, a few examples help pin it down.
Let's say that I have two USB keys, each with important data that I require. If one is at home and one is at work, then clearly my storage has been siloed: I do not have access to both items at the same time, in the same place.
Take both of those USB keys and plug them into a computer, and the storage is no longer siloed. I can move data freely between the storage devices; I can take the data off the storage devices and put it onto my computer's primary storage, upload it to the cloud or do whatever I wish.
If I hold both USB keys in my hand, but do not plug them into a computer, is it still a silo? This is a philosophical question, and one I don't have a ready answer to. What I can do is examine this grey area between what is obviously siloed storage and what is obviously not.
If we replace our two USB keys with a SAN and a NAS, at what point does the interconnectivity between them overcome the fact that these are two separate physical devices? And, ultimately, is it useful to even think about storage as being siloed?
Multiple Management Interfaces
If you have a SAN and a NAS, and both live on the same corporate network, in theory you could exchange data between the two. There's the problem that the SAN is block-based and exposes LUNs, while the NAS is file-based and exposes file shares, but in theory it is possible for the SAN to talk to the NAS and vice versa.
Enabling this communication between the two devices may require a third-party piece of software. Alternatively, one or both of these devices may contain an appropriate data exchange mechanism native to the unit's software. In any case, NASes and SANs do not speak the same language, and each will have its own distinct management interface. So why is this important?
Let's consider the aforementioned USB sticks. If I plug two USB sticks into my computer, then both of them are managed by a single interface provided by the OS. This interface natively provides the ability to move data back and forth between the two devices. The interface essentially treats each USB stick as if it were equal to any other storage the system has access to; it treats all storage like a commodity.
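As a minimal sketch of what that single interface buys you, assume every device exposes the same three operations. The names below (Storage, InMemoryStorage, move) are hypothetical and exist only for illustration; they are not any OS's or vendor's API, and the "devices" are just in-memory stand-ins.

```python
from typing import Protocol


class Storage(Protocol):
    """Hypothetical minimal interface every device is assumed to expose."""

    def read(self, name: str) -> bytes: ...
    def write(self, name: str, data: bytes) -> None: ...
    def delete(self, name: str) -> None: ...


class InMemoryStorage:
    """Stand-in for a USB stick or any other device; keeps data in a dict."""

    def __init__(self, label: str) -> None:
        self.label = label
        self._items: dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self._items[name]

    def write(self, name: str, data: bytes) -> None:
        self._items[name] = data

    def delete(self, name: str) -> None:
        del self._items[name]


def move(name: str, source: Storage, target: Storage) -> None:
    """Copy to the target first; delete from the source only afterwards."""
    target.write(name, source.read(name))
    source.delete(name)


# Two devices behind one interface: moving data is a single call.
stick_a = InMemoryStorage("stick-a")
stick_b = InMemoryStorage("stick-b")
stick_a.write("report.txt", b"quarterly numbers")
move("report.txt", stick_a, stick_b)
```

Because move only depends on the shared interface, it does not care which device sits on either side. That is the sense in which a unified interface treats storage as a commodity.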
This unified management interface is an important concept. In order to illustrate why, let's examine virtualization, an IT market closely interrelated with storage in the modern datacenter.
In theory I could set up an unlimited number of individual servers running VMware's free version of ESXi. Without vCenter, each of these ESXi hosts is an island: the hosts are perfectly functional, and they can run VMs. They can address storage to which they are attached. Moving a VM between hosts, however, requires turning the VM off, exporting it from one host's storage to third-party storage, and then uploading that export to another host. This is both inefficient and time consuming.
vCenter, on the other hand, provides a single management interface for all hosts. It allows these hosts to use shared storage. This enables the migration of workloads non-disruptively.
Managing five VMs across two isolated ESXi hosts is not generally a big problem, just as managing a NAS and a SAN within an organization is not generally a problem. Make this 10 hosts -- or five NASes and five SANs -- and managing these devices moves past annoying and into job-impacting frustration. Get into hundreds of hosts or storage systems with thousands of workloads, and management at scale is generally considered impossible without a unified management solution such as vCenter.
Unfortunately, while vCenter is great for virtualization, it's a consumer of storage, not really a meta-storage manager in the same way that it is a meta-workload manager. vCenter cannot adequately manage SANs, NASes or storage other than VMware's own vSAN hyperconverged solution.
Organizations trying to manage their compute resources today can turn to VMware. Unified storage management, however, is still a problem that organizations of all sizes have yet to solve.
Difficulty Migrating Data Between Silos
Why is VMware such a great example of unified compute management? What makes -- or made -- VMware special? Personally, I believe that it was vMotion that almost singlehandedly turned VMware from an interesting science project into one of the most critical software companies in the world.
vMotion lets administrators move a workload from one host to another with minimal effort. Later versions even made it possible to move workloads with zero impact on the running workload. This was neat, but happened well after vMotion had catapulted VMware onto a trajectory toward becoming a tech titan. It was the convenience that VMware brought to workload management that really made it attractive, even to those individuals who were opposed to the idea of running more than one workload per physical server. Never underestimate the value of convenience.
Imagine being able to consume multiple storage devices as equals, whether that storage was local to a server, was a NAS, was a SAN or was in the cloud. Imagine being able to consume these multiple storage devices invisibly, as if each of them were part of a single solution.
Some vendors offer such solutions. Most of the vendors offering these solutions require you to throw away all of your existing storage and commit to that vendor as the only vendor that will provide you storage. While some of these single-vendor approaches to the storage management problem can be useful, vendor monoculture is often more trouble than it's worth in IT.
When VMware rose to power, it did not say, "We can do all of these neat things that can make your x86 applications behave sort of like they were on a mainframe, but only if you buy VMware storage, VMware servers and so on." What VMware did was allow organizations to use their existing servers or to buy servers from almost anyone they chose.
VMware turned those servers -- and, consequently, the server manufacturers -- into commodities. In other words, it broke down the silos that existed between servers. VMware didn't break down the silos between the individual workloads, but rather the silos between the lifecycle management of workloads on those servers.
Many organizations need a similar solution for storage today: something that allows data to move seamlessly between devices, allows devices to be added or removed as needed without disrupting an organization's storage, and can integrate with or replace an organization's disaster recovery solution.
The analogy that links VMware's management of VMs to storage management falls apart when we talk about the maturity of the markets at the time that interconnected management entered the scene. When VMware began commoditizing servers, workload lifecycle management solutions were pretty primitive.
Best-of-breed solutions used scripted installs and imaging, but this was still a highly manual, tedious and long process. There wasn't a lot about how workload management was being done that organizations were particularly interested in preserving. This is not true of storage.
Storage has any number of enterprise features that are the very reason we buy various storage solutions. Data efficiency such as deduplication and compression is one such feature. Storage pooling, tiering between multiple classes of storage, snapshotting, cloning, and many more features are all important and expensive.
Layering a management solution over the top of your storage isn't particularly helpful if it means you have to give up the storage features for which you paid so much money. On the other hand, if your proposed meta-storage management solution brings these enterprise features to all storage regardless of native support, this significantly enhances the value of your total storage footprint without reducing the value of any individual storage unit.
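To make the second case concrete, here is a minimal sketch of a management layer that adds one enterprise feature, compression, to a device that lacks it. PlainStore and CompressedStore are hypothetical names invented for this illustration; only zlib is a real library (Python's standard one), and a real product would do far more than this.

```python
import zlib


class PlainStore:
    """Stand-in for a device with no native data-efficiency features."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def write(self, name: str, data: bytes) -> None:
        self.blobs[name] = data

    def read(self, name: str) -> bytes:
        return self.blobs[name]


class CompressedStore:
    """Hypothetical layer that compresses data in front of any store."""

    def __init__(self, inner: PlainStore) -> None:
        self.inner = inner

    def write(self, name: str, data: bytes) -> None:
        # Compress before the data ever reaches the underlying device.
        self.inner.write(name, zlib.compress(data))

    def read(self, name: str) -> bytes:
        # Decompress on the way out, so callers never see the difference.
        return zlib.decompress(self.inner.read(name))


device = PlainStore()
layered = CompressedStore(device)

payload = b"A" * 10_000  # highly repetitive, so it compresses well
layered.write("log.txt", payload)

restored = layered.read("log.txt")
stored_size = len(device.blobs["log.txt"])
```

The caller reads back exactly what it wrote, while the underlying device holds far fewer bytes than the original payload. The device itself did not need to support compression for that to happen.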
The storage solution that we all want would allow us to manage all of our storage from a single interface. It would move data seamlessly between storage devices while ensuring that the appropriate number of data copies was being kept. Perhaps most important, it would provide a complete suite of enterprise storage features, regardless of the storage device being used.
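The "appropriate number of data copies" requirement can be sketched in a few lines. This is an illustration under heavy assumptions, not how any shipping product works: devices are modeled as plain dictionaries, and ensure_copies and its parameters are hypothetical names.

```python
def ensure_copies(
    name: str,
    devices: dict[str, dict[str, bytes]],
    wanted: int,
) -> list[str]:
    """Replicate `name` until `wanted` devices hold it; return the holders."""
    holders = [label for label, contents in devices.items() if name in contents]
    if not holders:
        raise KeyError(f"{name!r} is not stored on any device")

    data = devices[holders[0]][name]  # use any existing copy as the source
    for label, contents in devices.items():
        if len(holders) >= wanted:
            break
        if name not in contents:
            contents[name] = data
            holders.append(label)
    return holders


# One copy exists on the SAN; the policy asks for two.
devices = {"san": {"db.bak": b"backup"}, "nas": {}, "cloud": {}}
holders = ensure_copies("db.bak", devices, wanted=2)
```

If there are fewer devices than requested copies, this sketch simply returns the holders it managed to create. A real storage fabric would also have to handle failures, capacity and consistency, all of which are left out here.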
Next-generation storage fabrics matching these criteria are just now beginning to see mainstream adoption. There are multiple vendors on the market today, each offering its own take on this sort of meta-storage management. Only time will tell if any of them manage to catapult themselves into becoming the true VMware of storage.
Trevor Pott is a full-time nerd from Edmonton, Alberta, Canada. He splits his time between systems administration, technology writing, and consulting. As a consultant he helps Silicon Valley startups better understand systems administrators and how to sell to them.